MPS Wiki: Access by mutator -- a shield hit!

This wiki article contains incomplete and informal notes about the MPS, the precursor to more formal documentation. Not confidential. Readership: MPS users and developers.

Introduction

There are three times that the MPS must be informed of mutator access to memory:

grey-mutator writes to black segment (during incremental GC);
black-mutator reads from grey segment (during incremental GC);
mutator writes to seg with a summary.

( Some additional notes on why the MPS needs to know about these accesses:

Incremental GC: During an incremental garbage collection trace, the mutator (client) code runs, and may access and change ("mutate") the object graph, while the MPS is still working on GC'ing it. The mutator must not be able to see or change* parts of the object graph on the other side of the flip, so the MPS protects these parts with a 'shield'. When the mutator hits such a shield, this is called an access.

(* "see or change": Currently, the MPS only uses black-mutator tracing: the mutator must not see past the flip, so the MPS sets a read shield on grey objects. For grey-mutator tracing (not fully implemented): the mutator must not change objects past the flip, so the MPS would set a write shield on black objects.)

Summaries: The MPS maintains summaries of which zones are referenced by the objects on each segment. If the mutator changes parts of such a segment, the summary may no longer be correct, so the MPS protects summarised segments (those with a non-trivial summary) with a write-shield. When the mutator hits such a shield, this is called an access, also.

)

Every MPS memory segment has two shields -- a Read-shield and a Write-shield. The read and write shields may be raised and lowered independently. If the Read-shield (for example) is raised, and the mutator tries to read from the segment, it causes an access fault which is caught by the MPS.

(SegSM (ShieldMode) states which shields are supposed to be seen by the mutator. SegPM (ProtectionMode) is the current state of shields, which may be lower than SegSM when the MPS is accessing memory and the mutator is stopped.)

Shields are a platform-independent abstraction. They are implemented by setting operating system memory barriers. Unfortunately the mapping is not 1-to-1.

Note that the same fact (mutator access must be intercepted) is simultaneously recorded at three levels:

in the state of the data being shielded (for example, a segment that is grey for some flipped trace must be read-shielded when the mutator is running);
in the segment's shield mode (for example, SegSM = AccessREAD);
(except when mutator is stopped) in the OS protection map for the segment's memory pages (for example, PROT_NONE)

When an access is intercepted, the MPS may do some preparatory processing, but it must then perform the access that the mutator intended. There are two ways to do this:

(after the preparatory processing) drop the appropriate shields, and restart the mutator at the instruction that caused the access -- which will then succeed as normal without hitting any shield;
(after the preparatory processing) step the single mutator instruction that caused the access, do some post-processing (if required), leave the shields in place, and restart the mutator at the instruction after the one that caused the access fault. The usual way to step the instruction is to emulate it.

The MPS must choose which approach to use for each access. As often happens, there is a space/time tradeoff -- see "choosing approach" below. To summarise the choice in brief:

Dropping shields is always possible, but may require up to a whole segment of references to be scanned too early -- that is, band-promoted -- and may therefore cause object retention and delayed finalization.
Stepping allows the MPS to band-promote only the single accessed reference, and thus causes less retention. However, it is not always possible (not all instructions can be emulated), and may take more time because it leaves the shields in place, which may cause the mutator to hit the shields repeatedly.

Relationship Between Shields And Operating-System Memory Barriers

The MPS Shield abstractly assumes that the read shield and write shield are independent controllable, and that the MPS can tell precisely which shield was hit. Alas, operating systems are not so flexible:

On both Unix and Windows, the OS does not allow setting a pure read-barrier (that is: prevent reads but allow writes). Therefore the read-shield also traps writes. So if the mutator tries to write to a segment with a read-shield, it will cause an access fault: the read-shield has (unfortunately) been hit even though the MPS does not care about the mutator write instruction.
On most Unixes, the OS does not tell the signal-handler whether the faulting instruction tried to read, write, or both. So if the mutator tries to read from a segment, and the segment has both a read- and a write-shield, the MPS will not (unfortunately) be able to tell which shield was hit: it must treat it as having hit both shields.

Which Shields Were Hit? What Did The Instruction Attempt?

The MPS needs to know the following two properties:

.mustDrop: which shields were hit, or equivalently: which shields must the MPS necessarily lower in order to be sure this instruction will not fault?
.willBeAccessed: will this mutator instruction actually read, write, or both read and write, the reference?

For example, if the mutator writes to a segment with a read-shield on Unix or Windows (which do not support pure read-barriers) the values are willBeAccessed = WRITE and mustDrop = READ. The write instruction 'should not' have caused the read-shield hit, but it did because of OS limitations. The shield that must be lowered to prevent a fault is the READ shield -- counterintuitive but true.

Currently, these two values (and sometimes a choice to lower additional shields, see below) are lumped together into a single "mode" argument which gets modified and interpreted at various stages in a rather ad hoc way. This makes it very hard to know what "mode" really means. Currently, the platform-part cannot see SegSM, so it calculates a (very conservative) value without this information, and passes it in the "mode" parameter of its call to ArenaAccess(). Then the platform-independent ArenaAccess and friends, which can see SegSM, make further adjustments.

[I do not like this approach. I'd prefer it if mustDrop and willBeAccessed were explicitly calculated by the platform-part of the MPS shield code (prot*.c), using whatever data the OS passes to the signal-handler, plus SegSM for this segment. The platform-part would not have to reas SegSM: it could have several static tables from SegSM -> mustDrop x willBeAccessed, select the appropriate one depending on the data the OS passes to the signal-handler for this hit, and pass a pointer to it in the call to ArenaAccess() (in a new parameter). RHSK 2007-10-01]

[Also, what is that test of SegPM() doing in global.c? Threads may cause nasty synchronisation issues: SM and PM do not change atomically with changes to the OS barriers, so threads may cause a barrier hit when the barriers and SM & PM are not consistent. If all MPS access to mutator memory (such as during collection work) only happens when all threads are stopped, then this doesn't happen yet. RHSK 2007-09-30]

For the "drop" approach

If the MPS chooses the "drop" approach, it must calculate a conservatively-accurate "mustDrop" Shield value. Lowering these shields must remove all the barriers that may have been hit by the faulting instruction. The MPS will then restart the mutator on the instruction that originally caused the fault. If "mustDrop" does not cause all necessary OS barriers to be lowered, the instruction will hit the barrier again, and we hang in an infinite loop.

Because it is about to lower these shields, the MPS must do the corresponding preparatory processing (greyening, scanning, and summary-zapping) for the whole segment.

(The "willBeAccessed" value is not important with this approach.)

For the "step" approach

If the MPS chooses the "step" approach, we don't care about the "mustDrop" value, because we will not be lowering any shields, and we will not be restarting the faulting instruction.

But we do need to know the "willBeAccessed" value (that is: whether the stepped instruction will read, write, or both). Where "willBeAccessed" intersects SegSM, we must do the corresponding preparatory processing (greyening or scanning) for the affected reference. (With stepping, the summary can be updated at the end).

Lowering Shields That Were Not Hit

Sometimes MPS must lower a shield even though the faulting instruction only hit it because of operating system limitations. The "mustDrop" value expresses this. The common case is where a read-shield is hit by a write-instruction. It shouldn't have caused a shield-hit, but that's a limitation in many OSs. MPS must remove the read-shield.

Sometimes it might improve performance to lower a shield even though it is not part of "mustDrop": if the mutator is going to hit the shield in one or two instructions time anyway then we may as well lower it now, and avoid the overhead of two shield hits.

There should be a place in MPS access handling where this decision is made explicit, but there isn't. De facto, the platform code sometimes has this effect. For example, if there are both read- and write-shields on a segment, and the mutator tries to read, then:

On Unix platforms (where the signal handler does not seem to be able to determine whether the faulting instruction was a read or a write) "mode" is set to both READ and WRITE, so both shields will be lowered immediately.
On Windows a mutator read will set "mode" to READ, so only the read-shield will be lowered, leaving the summary (and write-shield) intact.

Choosing Approach

The factors in choosing between the "drop" and "step" approaches are:

what is the cost of dropping shields from the whole segment?
what is the cost of stepping the instruction, and keeping shields?
is it possible to step the instruction?

Cost of dropping a read shield

To drop a read shield, the MPS must scan (degrey) the segment. In most cases this has little cost, as the segment would need to be scanned eventually anyway. (Though for non-moving pools it might cause more and smaller segment scans).

But if the segment contains grey (for a flipped trace) references into a band later than the band the trace is currently on, this will cause band promotion of all grey references in the segment. A notable case is weak references: if the trace has not reached the weak band yet, then weak references cannot be splatted, and all objects weakly referenced by grey references on this segment will be preserved. They in turn will preserve more, and this can cause massive weak retention.

In contrast, choosing to step would only band-promote a single reference.

Remember that condemned segments in non-moving pools may need to be scanned several times in a single collection. Dropping the read shield once is not the end of the story: if more references on the seg are subsequently greyed, the read shield will be raised again.

Cost of dropping a write shield

To drop a write shield, the MPS must zap the segment's summary. This has little cost, so the MPS is happy to drop a write shield. (The cost of stepping would be too high to be worth it).

Cost of stepping

Stepping a single instruction requires a lot of code, but the higher cost is that the shield is left in place, and may be hit again and again.

PoolAWL can keep a count of the number of times stepping was chosen. If stepping happens too much, it can give up and just drop the shields instead.

What Happens When A Shield Is Hit

This happens:

sigHandle (prot*.c)
 ArenaAccess
  PoolAccess (pool.c)
   (pool->class->access)

For AWL, AWLAccess() ...

[ INCOMPLETE RHSK 2007-09-30 ]

B. Document History

  2007-09-27  RHSK  Created.
  2007-09-30  RHSK  Introduction.
  2007-10-01  RHSK  Which Barriers Were Hit?
  2007-10-03  RHSK  Shield != Barrier: use correct terminology throughout.
  2007-10-02  RHSK  Choosing Approach

C. Copyright and License

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Redistributions in any form must be accompanied by information on how to obtain complete source code for the this software and any accompanying software that uses this software. The source code must either be included in the distribution or be available for no more than the cost of distribution plus a nominal fee, and must be freely redistributable under reasonable conditions. For an executable file, complete source code means the source code for all modules it contains. It does not include source code for modules or files that typically accompany the major components of the operating system on which the executable file runs.

This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement, are disclaimed. In no event shall the copyright holders and contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.