40. Shield¶
40.1. Introduction¶
.intro: This document contains a guide to the MPS Shield. There is no historical initial design, but in its place there are some early ideas and discussions: see .ideas.
.readership: Any MPS developer. Not confidential.
40.2. Overview¶
.over: For incremental collection, we need separate control of collector access and mutator (client) access to memory. The collector must be able to incrementally scan objects, without the mutator being able to see them yet.
Unfortunately common OSs do not support different access levels (protection maps) for different parts of the same process.
The MPS Shield is an abstraction that does extra work to overcome this limitation, and give the rest of the MPS the illusion that we can control collector and mutator access separately.
40.3. Control of mutator access¶
The MPS uses ShieldRaise()
and ShieldLower()
to forbid or
permit the mutator access to object memory (that is, memory allocated
by MPS).
-
void
ShieldRaise
(Arena arena, Seg seg, AccessSet mode)¶ Prevent the mutator accessing the memory in the specified mode (
AccessREAD
,AccessWRITE
, or both).
-
void
ShieldLower
(Arena arena, Seg seg, AccessSet mode)¶ Allow the mutator to access the memory in the specified mode (
AccessREAD
,AccessWRITE
, or both).
If the mutator attempts an access that hits a shield, the MPS gets a barrier hit (in the form of a fault, interrupt, exception), quickly does some necessary work, and then makes the access succeed.
Some objects (for example registers) cannot be hardware protected: the
only way to prevent mutator access to them is to halt all mutator
threads. The MPS uses ShieldSuspend()
and ShieldResume()
to do
this.
40.4. Control of collector access¶
When the collector wants to access object memory (that is, memory
allocated by MPS), it must first call ShieldEnter()
, then wrap any
accesses with a ShieldExpose()
and ShieldCover()
pair, and
finally call ShieldLeave()
.
ShieldEnter()
and ShieldLeave()
are called by ArenaEnter()
and ArenaLeave()
(approximately) – so the shield is always
entered when we are within MPS code (approximately).
ShieldExpose()
might for example be called around:
format-scan (when scanning);
format-skip (when marking grains in a non-moving fix);
format-isMoved and
AddrCopy()
(during a copying fix);format-pad (during reclaim).
Note that there is no need to call ShieldExpose()
when accessing
pool management memory such as bit tables. This is not object
memory, is never (legally) accessed by the mutator, and so is never
shielded.
On common operating systems, the only way to allow collector access is to allow access from the whole process, including the mutator. So if the Shield is asked to allow collector access but deny mutator access, it will halt all mutator threads to prevent any mutator access. The Shield performs suspension and restart; normal collector code does not need to worry about it.
Collector code can make multiple sequential, overlapping, or nested
calls to ShieldExpose()
on the same segment, as long as each is
balanced by a corresponding ShieldCover()
before ShieldLeave()
is called). A usage count is maintained on each segment in
seg->depth
: a positive “depth” means a positive number of
outstanding reasons why the segment must be exposed to the collector.
When the usage count reaches zero, there is no longer any reason the
segment should be unprotected, and the Shield could re-instate
hardware protection.
However, as a performance-improving hysteresis, the Shield defers
re-protection, maintaining a cache of the last ShieldCacheSIZE
times a segment no longer had a reason to be collector-accessible.
Presence in the cache counts as a reason: segments in the cache have
seg->depth
increased by one. As segments get pushed out of the
cache, or at ShieldLeave()
, this artificial reason is
decremented from seg->depth
, and (if seg->depth
is now zero)
the deferred reinstatement of hardware protection happens.
So whenever hardware protection is temporarily removed to allow
collector access, there is a nurse that will ensure this protection
is re-established: the nurse is either the balancing ShieldCover()
call in collector code, or an entry in the shield cache.
Notes
Why is there a fixed-size cache? This is not the simple approach! All we need is a chain of segs that might need their hardware protection to be sync’d with their shield mode. Head in the shield, and one pointer in each seg struct. I guess we try hard to avoid bloating
SegStruct
(to maintain residency in the processor cache). But is 16 the right size? A cache-miss wastes two kernel calls.I don’t like the cache code. For example, why does
ShieldFlush()
break out early ifarena->shDepth
is 0? This should never happen until the cache is completely flushed, that is, we have reachedshCacheLimit
. Why doesShieldFlush()
not resetshCacheLimit
? Why doesflush()
silently acceptNULL
cache entries?Why is
seg->depth
never checked for overflow? It is only a 4-bit-wide bit field, currently.
Richard Kistruck, 2006-12-19.
40.5. Initial ideas¶
.ideas: There never was an initial design document, but [RB_1995-11-29] and [RB_1995-11-30] contain some initial ideas.
40.6. References¶
- RB_1995-11-29
Richard Brooksby. Harlequin. 1995-11-29. “Shield protocol for barriers”.
- RB_1995-11-30
Richard Brooksby. Harlequin. 1995-11-30. “Exegesis of Incremental Tracing”.