MPS Shield
This document contains a guide to the MPS Shield. There is no historical initial design, but in its place there are some early ideas and discussions: initial ideas. References, History, Copyright and License are at the end.
Guide
Readership: any MPS developer. Not confidential.
Introduction
For incremental collection, we need separate control of collector access and mutator (client) access to memory. (The collector must be able to incrementally scan objects, without the mutator being able to see them yet).
Unfortunately common OSs do not support different access levels (protection maps) for different parts of the same process.
The MPS Shield is an abstraction that does extra work to overcome this limitation, and give the rest of the MPS the illusion that we can control collector and mutator access separately.
Control of mutator access: Raise, Lower, Suspend, Resume
The MPS uses ShieldRaise and ShieldLower to forbid or permit the mutator access to object memory (that is, memory allocated by MPS):
ShieldRaise(mode) prevents the mutator accessing the memory in the specified mode (read, write, or both).
ShieldLower(mode) allows the mutator to access the memory in the specified mode (read, write, or both).
If the mutator attempts an access that hits a shield, the MPS gets a barrier hit (aka fault, interrupt, exception, etc), quickly does some necessary work, and then makes the access succeed.
Some objects (eg. registers) cannot be hardware protected: the only way to prevent mutator access to them is to halt all mutator threads. The MPS uses ShieldSuspend and ShieldResume to do this.
Control of collector access: Enter, Leave, Expose, Cover
When the collector wants to access object memory (that is, memory allocated by MPS), it must call ShieldEnter, then wrap any accesses with a ShieldExpose / ShieldCover pair, and finally call ShieldLeave.
ShieldEnter and ShieldLeave are done by ArenaEnter and ArenaLeave (approximately) -- so the shield is always entered when we are within MPS code (approximately).
ShieldExpose might for example be called around:
- format-scan (when Scanning);
- format-skip (when marking grains in a non-moving Fix);
- format-isMoved, AddrCopy (aka memcpy), and format-move (during a copying Fix);
- format-pad (during reclaim).
Note that there is no need to call ShieldExpose when accessing pool management memory such as bit tables etc. This is not object memory, is never (legally) accessed by the mutator, and so is never shielded.
On common OSs, the only way to allow collector access is to allow access from the whole process, including the mutator. So if the shield is asked to allow collector access but deny mutator access, it will halt all mutator threads to prevent any mutator access. The MPS Shield performs suspension and restart; normal collector code does not need to worry about it.
Collector code can make multiple sequential, overlapping, or nested calls to ShieldExpose on the same segment. (Each must be balanced by a ShieldCover before ShieldLeave is called). A usage count is maintained in seg->depth: a positive "depth" means a positive number of outstanding reasons why the segment must be exposed to the collector. When the usage count reaches zero, there is no longer any reason the segment should be unprotected, and the Shield could re-instate hardware protection.
However, as a performance-improving hysteresis, the Shield defers re-protection, maintaining a cache of the last 16 times a segment no longer had a reason to be collector-accessible. Presence in the cache counts as a 'reason': segments in the cache have seg->depth increased by one. As segments get pushed out of the cache, or at ShieldLeave, this artificial 'reason' is decremented from seg->depth, and (if depth is now zero) the deferred reinstatement of hardware protection happens.
So whenever hardware protection is temporarily removed to allow collector access, there is a 'nurse' that will ensure this protection is re-established: the nurse is either the balancing ShieldCover call in collector code, or an entry in the shield cache.
[Why is there a fixed-size cache? This is not the simple approach! All we need is a chain of segs that might need their hardware protection to be sync'd with their shield mode. Head in the shield, and one pointer in each seg struct. I guess we try hard to avoid bloating SegStruct (to maintain residency in the processor cache). But is 16 the right size? A cache-miss wastes two kernel calls. RHSK 2006-12-19]
[Also, I don't like the cache code. For example, why does ShieldFlush break out early if arena->shDepth is 0? This should never happen until the cache is completely flushed, ie. we have reached shCacheLimit. Why does ShieldFlush not reset shCacheLimit? Why does flush() silently accept NULL cache entries? RHSK 2006-12-19]
[Also, why is seg->depth never checked for overflow? It is only a 4-bit-wide bit field, currently. RHSK 2006-12-19]
Initial Ideas
There never was an initial design document.
See this very helpful overview of protected tracing ∅. Note that this overview is old, and the details may be inaccurate.
See idea.shield ∅.
A. References
B. Document History
2006-12-19 RHSK Created: Guide, plus links to initial ideas. 2007-01-04 RHSK Minor text changes for clarity. 2007-01-12 RHSK ShieldEnter/Leave done by ArenaEnter/Leave
C. Copyright and License
This document is copyright © 2006-2007 Ravenbrook Limited. All rights reserved. This is an open source license. Contact Ravenbrook for commercial licensing options.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Redistributions in any form must be accompanied by information on how to obtain complete source code for the this software and any accompanying software that uses this software. The source code must either be included in the distribution or be available for no more than the cost of distribution plus a nominal fee, and must be freely redistributable under reasonable conditions. For an executable file, complete source code means the source code for all modules it contains. It does not include source code for modules or files that typically accompany the major components of the operating system on which the executable file runs.
This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement, are disclaimed. In no event shall the copyright holders and contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.