This wiki article contains incomplete and informal notes about the MPS, the precursor to more formal documentation. Not confidential. Readership: MPS users and developers.
Warning: As far as I am aware, issues with unmanaged workspace have not been formally addressed in MPS documentation to date. This article is a start. Consequently this article is even more speculative and new than others in the Wiki. RHSK 2006-06-21
"Unmanaged" means containing references that the MPS cannot see. "Workspace" means stack, registers, and incomplete objects.
We are most familiar with an MPS mode of use where the stack and registers are managed (they are declared as a root and are ambiguously scanned), and the only unmanaged parts of the workspace are one or more incomplete objects in one or more allocation points.
(Several test executables omit the stack scanner, and rely on guarantees that the MPS will not move or collect memory. I believe we have not documented these guarantees.)
We should design guaranteed protocols for simple single-threaded clients.
However, these are burdensome for complex clients (where it is hard to determine when writing client function X whether a call to client function Y will end some guaranteed period), and for multi-threaded clients.
So this article also attempts to generalize some concepts from our experience of the allocation point protocol, to define a guarded protocol, and then restates the allocation point protocol in terms of those concepts.
This article has the following sections:
[Note: apart from "root" and "reachable", the terminology in this section is new (I think). RHSK 2006-06-16]
Note that the MPS does NOT assume that registers, the stack, etc, are roots. This is a little unusual. This means that "reachable" means "reachable to the collector"; the mutator may be able to access references that are not "reachable".
A reference may be:
[Note: the terminology in this section is new (I think). RHSK 2006-06-16]
Unmanaged workspace must be used with care because the MPS cannot see the references stored in it, and so cannot keep these references fresh.
Surprisingly, the MPS seems to lack the necessary protocols for using unmanaged workspace in general. The MPS currently supports only one protocol for unmanaged workspace: objects under construction in allocation points.
This section is a preliminary analysis of what protocols are needed (and appropriate) for using unmanaged workspace in general.
Using unmanaged workspace involves:
Guaranteed reference ≡ a value that the MPS guarantees will remain a fresh reference, even when stored in an unmanaged cell, until the end of the guaranteed period. Permitted operations are safe and correct. (The MPS guarantees that the referent does not move or get reclaimed). [Clearly possible, but these protocols are not currently clearly defined. RHSK 2006-06-15]
A guaranteed reference may be copied into a managed cell.
Guarded value ≡ a value that may be a fresh reference, or may be stale. (The MPS knows which, but the mutator doesn't). A protocol defines how the value may be used.
Unsafe value ≡ a value that is neither guaranteed nor guarded, and even the MPS does not know if it is a fresh reference or not. An unsafe value must not be copied into a managed non-ambiguous cell.
A guarded protocol defines the "guarded period", which has an "operations" phase, followed by an atomic "commit" event that may succeed or fail, followed by an "abort" phase (only if the commit failed).
"commit" succeeds ⇒ the guarded references remained fresh. The result of the operations is valid, and the guarded period is over. A successful commit may atomically change other state too, such as making a previously unmanaged object become managed.
"commit" fails ⇒ the guarded references are stale, the results are invalid, and the guarded period enters an "abort" phase.
The mutator must detect the abort phase, clear up if necessary, and then "close" the abort phase (this also closes the guarded period).
Note: although the "commit" is atomic and succeeds or fails, the mutator will not be able to both commit and test for success or failure atomically.
"false abort": the mutator may be unable to test commit's success or failure accurately: it may be required to make a conservative approximation, erring on the side of failure. This may result in a 'false abort', where the mutator believes it is in the "abort" phase of the guarded period, but actually the commit succeeded and the guarded period is over. Write your mutator abort code with great care: it must be valid and have the same semantics whether run in a real abort phase or a false abort.
A guarded protocol restricts which operations are permitted on guarded values.
[Note: It might be possible to design a protocol where guarded references may be dereferenced. If stale, the mutator would see objects in the stale world that persists until reclaim. Not currently supported. RHSK 2006-06-15]
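The phases of a guarded period can be sketched as a small state machine. This is a toy model and not MPS code: every name in it (toy_guard_t, toy_guard_open, toy_collector_invalidate, and so on) is hypothetical, and the "collector" is just a function that marks the guarded values stale. What it tries to show is the retry loop around commit, and a close operation that is valid whether it runs in a real abort phase or after a false abort.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are MPS API. */
typedef enum { PHASE_IDLE, PHASE_OPERATIONS, PHASE_ABORT } toy_phase_t;

typedef struct {
    toy_phase_t phase;
    bool stale;   /* set when the toy collector invalidates guarded values */
} toy_guard_t;

/* Open a guarded period: the operations phase begins. */
static void toy_guard_open(toy_guard_t *g)
{
    assert(g->phase == PHASE_IDLE);
    g->stale = false;
    g->phase = PHASE_OPERATIONS;
}

/* Stand-in for the collector making guarded values stale. */
static void toy_collector_invalidate(toy_guard_t *g)
{
    if (g->phase == PHASE_OPERATIONS)
        g->stale = true;
}

/* Commit: succeeds only if the guarded values remained fresh. */
static bool toy_guard_commit(toy_guard_t *g)
{
    assert(g->phase == PHASE_OPERATIONS);
    if (!g->stale) {
        g->phase = PHASE_IDLE;    /* guarded period is over */
        return true;
    }
    g->phase = PHASE_ABORT;       /* results are invalid */
    return false;
}

/* Close the abort phase. Also harmless after a false abort, where the
   commit really succeeded and the period is already over. */
static void toy_guard_close(toy_guard_t *g)
{
    assert(g->phase != PHASE_OPERATIONS);
    g->phase = PHASE_IDLE;
}

/* Retry until commit succeeds; 'invalidations' simulates collector
   activity during the first attempts. Returns the number of attempts. */
static int toy_run_guarded(toy_guard_t *g, int invalidations)
{
    int attempts = 0;
    for (;;) {
        attempts++;
        toy_guard_open(g);
        /* Operations phase: work with guarded values here. */
        if (invalidations > 0) {
            toy_collector_invalidate(g);
            invalidations--;
        }
        if (toy_guard_commit(g))
            return attempts;
        /* Abort phase: discard results, then close. */
        toy_guard_close(g);
    }
}
```

In this sketch toy_guard_close only resets the phase, so running it twice, or running it when the commit had in fact succeeded, leaves the state unchanged. Real abort code would need the same property for whatever clearing up it does.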
[Note: This example attempts to define the allocation point protocol using the general concepts for unmanaged workspace and guarded protocols developed above, and extend the definition of the protocol to discuss storing guarded references in exactly-scanned managed cells. Commit is when the mutator sets I=A. MPS does not currently define abort-close: in practice the MPS code currently does it on the next mps_ap_fill, but we should probably define it as occurring earlier, when the mutator calls mps_ap_trip. RHSK 2006-06-20]
Throughout the guarded period, the new object is unmanaged. On successful commit, the guarded period ends, and if the new object is reachable it becomes managed.
Throughout the guarded period, the pointer value returned by mps_reserve is a guarded value that may be dereferenced to access the new object, and may be copied into a managed cell. On successful commit, copies in the managed world (including copies in the new object, if it is reachable) become fresh references to the new object.
Throughout the guarded period, a reference to a reachable object may be copied from a managed cell into the unmanaged world (including into the new object): it becomes a guarded value that may NOT be dereferenced. The guarded value may be copied into a managed cell. On successful commit, all copies stored in managed cells (including cells in the new object, if it is reachable) are still fresh. All other copies are unsafe.
On failed commit, for the duration of the abort phase, all guarded values remain guarded. After the abort phase is closed, all remaining copies of the guarded value become unsafe.
During its abort, the mutator must therefore remove all guarded values from managed non-ambiguous cells, and (because this may be a false abort) it must not dereference the pointer value returned by mps_reserve.
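The rules above can be illustrated with a toy allocation point. This is a sketch under stated assumptions, not the MPS implementation: toy_ap_t, toy_reserve, toy_commit and toy_push are hypothetical names, and the "tripped" flag is an assumed stand-in for however the MPS signals that the guarded values went stale. The sketch treats commit as setting I=A followed by a separate test, re-reads the guarded value from its managed cell on every attempt, and never dereferences it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are MPS API. */
typedef struct {
    char *init;    /* I: end of the committed objects */
    char *alloc;   /* A: end of the reserved space */
    char *end;     /* end of the toy buffer */
    bool tripped;  /* assumption: set by a toy collector on invalidation */
} toy_ap_t;

typedef struct toy_pair {
    struct toy_pair *next;
    int value;
} toy_pair_t;

/* Start a guarded period: reserve space for one new, unmanaged object. */
static void *toy_reserve(toy_ap_t *ap, size_t size)
{
    if ((size_t)(ap->end - ap->init) < size)
        return NULL;                     /* toy buffer exhausted */
    ap->alloc = ap->init + size;
    return ap->init;
}

/* Commit: set I = A, then test for failure as a separate step. */
static bool toy_commit(toy_ap_t *ap, void *p, size_t size)
{
    assert((char *)p + size == ap->alloc);
    ap->init = ap->alloc;                /* the commit event */
    if (ap->tripped) {
        ap->tripped = false;             /* toy abort-close */
        ap->init = (char *)p;            /* discard the new object */
        ap->alloc = (char *)p;
        return false;
    }
    return true;
}

/* Push a new pair onto a list whose head lives in a managed cell. */
static toy_pair_t *toy_push(toy_ap_t *ap, toy_pair_t **managed_head,
                            int value, int *attempts)
{
    void *p;
    toy_pair_t *pair;
    do {
        ++*attempts;
        p = toy_reserve(ap, sizeof *pair);
        if (p == NULL)
            return NULL;
        pair = p;
        /* Re-read the guarded value on every attempt: copies left over
           from a failed attempt are unsafe. Store it, never follow it. */
        pair->next = *managed_head;
        pair->value = value;
    } while (!toy_commit(ap, p, sizeof *pair));
    *managed_head = pair;    /* commit succeeded: store in a managed cell */
    return pair;
}
```

The toy treats every trip as a failed commit and simply retries, so it never needs to distinguish a real abort from a false one: after a failure it touches neither the reserved pointer nor the old copy of the guarded value.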
.talk.RB.2006-06-13:
Need two snippets:
1. with the stack being ambiguously scanned;
2. without the stack being ambiguously scanned: this means the snippet should show the parent being pinned.

    /* Store an ambiguous reference to the new object */
    global->ambigroot = neo;

[RHSK 2006-06-14: I'm not sure that unscanned reg+stack is possible!]

Also:
= can't use "new";
= ? (void**) is not portable.
- avoid use of "race" -- it conventionally means a bad race.
.talk.RB.2006-06-14:
[Modes of MPS use snipped: now in its own wiki article. RHSK] MPS could make a promise about when it might invalidate unmanaged references. (We currently only invalidate when we flip the mutator for a trace, so each trace flip starts a new epoch, and unmanaged references are only valid during the epoch in which the unmanaged reference was created by copying some managed reference).
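The epoch idea can be sketched by tagging each unmanaged reference with the epoch in which it was copied. This is purely illustrative: the MPS exposes no such counter as far as this article says, and toy_epoch, toy_unmanaged_ref_t, toy_copy_from_managed, toy_flip and toy_ref_is_valid are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are MPS API. */
static unsigned long toy_epoch = 0;   /* bumped at every toy flip */

typedef struct {
    void *value;            /* copy of some managed reference */
    unsigned long epoch;    /* epoch in which the copy was made */
} toy_unmanaged_ref_t;

/* Create an unmanaged reference by copying a managed one. */
static toy_unmanaged_ref_t toy_copy_from_managed(void *const *managed_cell)
{
    toy_unmanaged_ref_t r;
    assert(managed_cell != NULL);
    r.value = *managed_cell;
    r.epoch = toy_epoch;
    return r;
}

/* Stand-in for the collector flipping the mutator for a trace. */
static void toy_flip(void)
{
    toy_epoch++;
}

/* An unmanaged reference is valid only during the epoch that created it. */
static bool toy_ref_is_valid(const toy_unmanaged_ref_t *r)
{
    return r->epoch == toy_epoch;
}
```

A mutator using such a scheme would check validity after finishing its work with the reference, and re-copy from the managed cell if the epoch had moved on.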
.think.RHSK.2006-06-14:
For 7: unscanned reg+stack, multi-threaded, moving.

How do we support operations like:

    a->b->c->d = e->f->g->h;

?

If forwarding pointers (broken hearts) didn't overwrite the old objects, then the long chain would be 'safe' as long as you throw away the answer at the end, and as long as you do that before the reclaim phase of the trace.

Hmm, it's easy to write a format where the broken heart doesn't obliterate the object's slots: store the forwarding pointer solely in the object's format header, invisible to 'normal' client code. (By the way, are there any limits on the size of pad object that the MPS can request of a format pad method?)

With such a format, and the current behaviour of MPS's only moving pool (amc):
- after flip, unmanaged references are stale (point to stale copies of objects), but are otherwise still connected as the client expects;
- after reclaim, they may point to unmapped memory.

But the MPS does not currently give a guarantee that it will behave like this. Perhaps it could.

Allocation Point is two things:
a. a fast inlined allocation protocol;
b. a quarantine that allows unmanaged references to be safely written, and either:
   - safely converted into valid managed references (if commit succeeds); or
   - flagged as invalid (if commit fails).

b. is nearly what we want for 7. We want a defined region of time and space within which we can use unmanaged references. This can either be guaranteed success, or an attempt followed by a 'did we get away with that?' check.

We could work with this primitive:
- load word from (ref1 + offset) and store it in (ref2 + offset).

Exact stacks, manually maintained. But it's painful. (Okay for a language compiler, but not for a human.)

From a pinned object, how do I pin its child?

    do {
      Read epoch
      Read o->child into unmanaged ref "child"
    } while (!mps_pin(child, epoch));

We can do this already with an allocation point "ap_PinStack". Each mps_pin_t object in the PinStack is simply a managed ambiguous ref:

    /* object o MUST be pinned */
    /* child_o is an out-parameter: it receives the pinned child */
    int pin_child(object **child_o, object *o)
    {
      void *p;
      mps_pin_t *pin;

      do {
        void *child;  /* unmanaged ref: we must not deref it */

        if (mps_reserve(&p, ap_PinStack, sizeof(mps_pin_t)) != MPS_RES_OK) {
          goto fail_pin_child;
        }
        pin = p;
        child = o->child;
        /* We must not dereference child;
           We must not store child anywhere managed.
           We must not store child in any quarantine except pin.
           We may store child in the pin quarantine.
           We may store child somewhere where it will always remain unmanaged.
           We may do whatever else we like with child! (Not much!) */
        pin->thing = child;
      } while (!mps_commit(ap_PinStack, p, sizeof(mps_pin_t)));

      /* Success: we managed to pin the child */
      *child_o = pin->thing;
      return TRUE;

    fail_pin_child:
      *child_o = NULL;
      return FALSE;  /* out of memory, etc */
    }

Given pinned object o, can we pin its child simply by telling our own format code that the o->child slot is to be RankAMBIG now? NO! We may have finished scanning at RankAMBIG, be scanning at RankEXACT now, come across another ref to child, and move it.
2006-06-21 RHSK Created.
2006-06-21 RHSK Moved here from modes of use article; add Intro.
This document is copyright © 2006 Ravenbrook Limited. All rights reserved. This is an open source license. Contact Ravenbrook for commercial licensing options.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement, are disclaimed. In no event shall the copyright holders and contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.