.. mode: -*- rst -*- Segment data structure ====================== :Tag: design.mps.seg :Author: David Jones :Date: 1997-04-03 :Status: incomplete design :Revision: $Id: //info.ravenbrook.com/project/mps/save-errno-win32/design/seg.txt#1 $ :Copyright: See `Copyright and License`_. :Index terms: pair: segments; design Introduction ------------ _`.intro`: This is the design of the segment data structure. Overview -------- _`.over.segments`: Segments are the basic units of tracing and shielding. The MPM also uses them as units of scanning and colour, although pool classes may subdivide segments and be able to maintain colour on a finer grain (down to the object level, for example). _`.over.objects`: The mutator's objects are stored in segments. Segments are contiguous blocks of memory managed by some pool. _`.segments.pool`: The arrangement of objects within a segment is determined by the class of the pool which owns the segment. The pool is associated with the segment indirectly via the first tract of the segment. _`.over.memory`: The relationship between segments and areas of memory is maintained by the segment module. Pools acquire tracts from the arena, and release them back to the arena when they don't need them any longer. The segment module can associate contiguous tracts owned by the same pool with a segment. The segment module provides the methods SegBase, SegLimit, and SegSize which map a segment onto the addresses of the memory block it represents. _`.over.hierarchy`: The Segment datastructure is designed to be subclassable (see design.mps.protocol_). The basic segment class (``Seg``) supports colour and protection for use by the tracer, as well as support for a pool ring, and all generic segment functions. Clients may use ``Seg`` directly, but will most probably want to use a subclass with additional properties. .. _design.mps.protocol: protocol _`.over.hierarchy.gcseg`: ``GCSeg`` is a subclass of ``Seg`` which implements garbage collection, including buffering and the ability to be linked onto the grey ring. It does not implement hardware barriers, and so can only be used with software barriers, for example internally in the MPS. _`.over.hierarchy.mutatorseg`: ``MutatorSeg`` is a subclass of ``GCSeg`` implementing hardware barriers. It is suitable for handing out to the mutator. Data Structure -------------- ``typedef struct SegStruct *Seg`` ``typedef struct GCSegStruct *GCSeg`` The implementations are as follows:: typedef struct SegStruct { /* segment structure */ Sig sig; /* */ SegClass class; /* segment class structure */ Tract firstTract; /* first tract of segment */ RingStruct poolRing; /* link in list of segs in pool */ Addr limit; /* limit of segment */ unsigned depth : ShieldDepthWIDTH; /* see */ AccessSet pm : AccessLIMIT; /* protection mode, */ AccessSet sm : AccessLIMIT; /* shield mode, */ TraceSet grey : TraceLIMIT; /* traces for which seg is grey */ TraceSet white : TraceLIMIT; /* traces for which seg is white */ TraceSet nailed : TraceLIMIT; /* traces for which seg has nailed objects */ RankSet rankSet : RankLIMIT; /* ranks of references in this seg */ } SegStruct; typedef struct GCSegStruct { /* GC segment structure */ SegStruct segStruct; /* superclass fields must come first */ RingStruct greyRing; /* link in list of grey segs */ RefSet summary; /* summary of references out of seg */ Buffer buffer; /* non-NULL if seg is buffered */ Sig sig; /* design.mps.sig */ } GCSegStruct; _`.field.rankSet`: The ``rankSet`` field represents the set of ranks of the references in the segment. It is initialized to empty by ``SegInit()``. _`.field.rankSet.single`: The Tracer only permits one rank per segment [ref?] so this field is either empty or a singleton. _`.field.rankSet.empty`: An empty ``rankSet`` indicates that there are no references. If there are no references in the segment then it cannot contain black or grey references. _`.field.rankSet.start`: If references are stored in the segment then it must be updated, along with the summary (`.field.summary.start`_). _`.field.depth`: The ``depth`` field is used by the Shield (impl.c.shield) to manage protection of the segment. It is initialized to zero by ``SegInit()``. _`.field.sm`: The ``sm`` field is used by the Shield (impl.c.shield) to manage protection of the segment. It is initialized to ``AccessSetEMPTY`` by ``SegInit()``. _`.field.pm`: The ``pm`` field is used by the Shield (impl.c.shield) to manage protection of the segment. It is initialized to ``AccessSetEMPTY`` by ``SegInit()``. The field is used by both the shield and the ANSI fake protection (impl.c.protan). _`.field.black`: The ``black`` field is the set of traces for which there may be black objects (that is, objects containing references, but no references to white objects) in the segment. More precisely, if there is a black object for a trace in the segment then that trace will appear in the ``black`` field. It is initialized to ``TraceSetEMPTY`` by ``SegInit()``. _`.field.grey`: The ``grey`` field is the set of traces for which there may be grey objects (i.e containing references to white objects) in the segment. More precisely, if there is a reference to a white object for a trace in the segment then that trace will appear in the ``grey`` field. It is initialized to ``TraceSetEMPTY`` by ``SegInit()``. _`.field.white`: The ``white`` field is the set of traces for which there may be white objects in the segment. More precisely, if there is a white object for a trace in the segment then that trace will appear in the ``white`` field. It is initialized to ``TraceSetEMPTY`` by ``SegInit()``. _`.field.summary`: The ``summary`` field is an approximation to the set of all references in the segment. If there is a reference ``R`` in the segment, then ``RefSetIsMember(summary, R)`` is ``TRUE``. The summary is initialized to ``RefSetEMPTY`` by ``SegInit()``. _`.field.summary.start`: If references are stored in the segment then it must be updated, along with ``rankSet`` (`.field.rankSet.start`_). _`.field.buffer`: The ``buffer`` field is either ``NULL``, or points to the descriptor structure of the buffer which is currently allocating in the segment. The field is initialized to ``NULL`` by ``SegInit()``. _`.field.buffer.owner`: This buffer must belong to the same pool as the segment, because only that pool has the right to attach it. Interface --------- Splitting and merging ..................... _`.split-and-merge`: There is support for splitting and merging segments, to give pools the flexibility to rearrange their tracts among segments as they see fit. ``Res SegSplit(Seg *segLoReturn, Seg *segHiReturn, Seg seg, Addr at)`` _`.split`: If successful, segment ``seg`` is split at address ``at``, yielding two segments which are returned in segLoReturn and segHiReturn for the low and high segments respectively. The base of the low segment is the old base of ``seg``. The limit of the low segment is ``at``. The base of the high segment is ``at``. This limit of the high segment is the old limit of ``seg``. ``seg`` is effectively destroyed during this operation (actually, it might be reused as one of the returned segments). Segment subclasses may make use of the optional arguments; the built-in classes do not. _`.split.invariants`: The client must ensure some invariants are met before calling ``SegSplit()``: - _`.split.inv.align`: ``at`` must be a multiple of the arena grain size, and lie between the base and limit of ``seg``. Justification: the split segments cannot be represented if this is not so. - _`.split.inv.buffer`: If ``seg`` is attached to a buffer, the buffered region must not include address ``at``. Justification: the segment module is not in a position to know how (or whether) a pool might wish to split a buffer. This permits the buffer to remain attached to just one of the returned segments. _`.split.state`: Except as noted above, the segments returned have the same properties as ``seg``. That is, their colour, summary, rankset, nailedness etc. are set to the values of ``seg``. ``Res SegMerge(Seg *mergedSegReturn, Seg segLo, Seg segHi)`` _`.merge`: If successful, segments ``segLo`` and ``segHi`` are merged together, yielding a segment which is returned in mergedSegReturn. ``segLo`` and ``segHi`` are effectively destroyed during this operation (actually, one of them might be reused as the merged segment). Segment subclasses may make use of the optional arguments; the built-in classes do not. _`.merge.invariants`: The client must ensure some invariants are met before calling ``SegMerge()``: - _`.merge.inv.abut`: The limit of ``segLo`` must be the same as the base of ``segHi``. Justification: the merged segment cannot be represented if this is not so. - _`.merge.inv.buffer`: One or other of ``segLo`` and ``segHi`` may be attached to a buffer, but not both. Justification: the segment module does not support attachment of a single seg to 2 buffers. - _`.merge.inv.similar`: ``segLo`` and ``segHi`` must be sufficiently similar. Two segments are sufficiently similar if they have identical values for each of the following fields: ``class``, ``grey``, ``white``, ``nailed``, ``rankSet``. Justification: There has yet to be a need to implement default behaviour for these cases. Pool classes should arrange for these values to be the same before calling ``SegMerge()``. _`.merge.state`: The merged segment will share the same state as ``segLo`` and ``segHi`` for those fields which are identical (see `.merge.inv.similar`_). The summary will be the union of the summaries of ``segLo`` and ``segHi``. Extensibility ------------- Allocation .......... ``typedef Bool (*SegBufferFillMethod)(Addr *baseReturn, Addr *limitReturn, Seg seg, Size size, RankSet rankSet)`` _`.method.buffer-fill`: Allocate a block in the segment, of at least ``size`` bytes, with the given set of ranks. If successful, update ``*baseReturn`` and ``*limitReturn`` to the block and return ``TRUE``. Otherwise, return ``FALSE``. The allocated block must be accounted as buffered (see design.mps.strategy.account.buffered_). .. _design.mps.strategy.account.buffered: strategy#.account.buffered ``typedef void (*SegBufferEmptyMethod)(Seg seg, Buffer buffer)`` _`.method.buffer-empty`: Free the unused part of the buffer to the segment. Account the used part as new (see design.mps.strategy.account.new_) and the unused part as free (see design.mps.strategy.account.free_). .. _design.mps.strategy.account.new: strategy#.account.new .. _design.mps.strategy.account.free: strategy#.account.free Garbage collection .................. ``typedef Res (*SegAccessMethod)(Seg seg, Arena arena, Addr addr, AccessSet mode, MutatorContext context)`` _`.method.access`: The ``access`` method indicates that the client program attempted to access the address ``addr``, but has been denied due to a protection fault. The ``mode`` indicates whether the client program was trying to read (``AccessREAD``) or write (``AccessWRITE``) the address. If this can't be determined, ``mode`` is ``AccessREAD | AccessWRITE``. The segment should perform any work necessary to remove the protection whilst still preserving appropriate invariants (this might scanning the region containing ``addr``). Segment classes are not required to provide this method, and not doing so indicates they never protect any memory managed by the pool. This method is called via the generic function ``SegAccess()``. ``typedef Res (*SegWhitenMethod)(Seg seg, Trace trace)`` _`.method.whiten`: The ``whiten`` method requests that the segment ``seg`` condemn (a subset of, but typically all) its objects for the trace ``trace``. That is, prepare them for participation in the trace to determine their liveness. The segment should expect fix requests (`.method.fix`_) during the trace and a reclaim request (`.method.reclaim`_) at the end of the trace. Segment classes that automatically reclaim dead objects must provide this method, and pools that use these segment classes must additionally set the ``AttrGC`` attribute. This method is called via the generic function ``SegWhiten()``. ``typedef void (*SegGreyenMethod)(Seg seg, Trace trace)`` _`.method.grey`: The ``greyen`` method requires the segment ``seg`` to colour its objects grey for the trace ``trace`` (excepting objects that were already condemned for this trace). That is, make them ready for scanning by the trace ``trace``. The segment must arrange that any appropriate invariants are preserved, possibly by using the protection interface (see design.mps.prot_). Segment classes are not required to provide this method, and not doing so indicates that all instances of this class will have no fixable or traceable references in them. This method is called via the generic function ``SegGreyen()``. .. _design.mps.prot: prot ``typedef void (*SegBlackenMethod)(Seg seg, TraceSet traceSet)`` _`.method.blacken`: The ``blacken`` method is called if it is known that the segment ``seg`` cannot refer to the white set for any of the traces in ``traceSet``. The segment must blacken all its grey objects for those traces. Segment classes are not required to provide this method, and not doing so indicates that all instances of this class will have no fixable or traceable references in them. This method is called via the generic function ``SegBlacken()``. ``typedef Res (*SegScanMethod)(Bool *totalReturn, Seg seg, ScanState ss)`` _`.method.scan`: The ``scan`` method scans all the grey objects on the segment ``seg``, passing the scan state ``ss`` to ``FormatScan``. The segment may additionally accumulate a summary of *all* its objects. If it succeeds in accumulating such a summary it must indicate that it has done so by setting the ``*totalReturn`` parameter to ``TRUE``. Otherwise it must set ``*totalReturn`` to ``FALSE``. Segment classes are not required to provide this method, and not doing so indicates that all instances of this class will have no fixable or traceable references in them. This method is called via the generic function ``SegScan()``. ``typedef Res (*SegFixMethod)(Seg seg, ScanState ss, Ref *refIO)`` _`.method.fix`: The ``fix`` method indicates that the reference ``*refIO`` has been discovered at rank ``ss->rank`` by the traces in ``ss->traces``, and the segment must handle this discovery according to the fix protocol (design.mps.fix_). If the method moves the object, it must update ``*refIO`` to refer to the new location of the object. If the method determines that the referenced object died (for example, because the highest-ranking references to the object were weak), it must update ``*refIO`` to ``NULL``. Segment classes that automatically reclaim dead objects must provide this method, and pools that use these classes must additionally set the ``AttrGC`` attribute. Pool classes that use segment classes that may move objects must also set the ``AttrMOVINGGC`` attribute. The ``fix`` method is on the critical path (see design.mps.critical-path_) and so must be fast. This method is called via the function ``TraceFix()``. .. _design.mps.fix: fix .. _design.mps.critical-path: critical-path _`.method.fixEmergency`: The ``fixEmergency`` method is used to perform fixing in "emergency" situations. Its specification is identical to the ``fix`` method, but it must complete its work without allocating memory (perhaps by using some approximation, or by running more slowly). Segment classes must provide this method if and only if they provide the ``fix`` method. If the ``fix`` method does not need to allocate memory, then it is acceptable for ``fix`` and ``fixEmergency`` to be the same. ``typedef void (*SegReclaimMethod)(Seg seg, Trace trace)`` _`.method.reclaim`: The ``reclaim`` method indicates that any remaining white objects in the segment ``seg`` have now been proved unreachable by the trace ``trace``, and so are dead. The segment should reclaim the resources associated with the dead objects. Segment classes are not required to provide this method. If they do, pools that use them must set the ``AttrGC`` attribute. This method is called via the generic function ``SegReclaim()``. ``typedef void (*SegWalkMethod)(Seg seg, Format format, FormattedObjectsVisitor f, void *v, size_t s)`` _`.method.walk`: The ``walk`` method must call the visitor function ``f`` (along with its closure parameters ``v`` and ``s`` and the format ``format``) once for each of the *black* objects in the segment ``seg``. Padding objects may or may not be included in the walk, at the segment's discretion: it is the responsibility of the client program to handle them. Forwarding objects must not be included in the walk. Segment classes need not provide this method. This method is called by the genetic function ``SegWalk()``, which is called by the heap walker ``mps_arena_formatted_objects_walk()``. ``typedef void (*SegFlipMethod)(Seg seg, Trace trace)`` _`.method.flip`: Raise the read barrier, if necessary, for a trace that's about to flip and for which the segment is grey and potentially contains references. Splitting and merging ..................... ``typedef Res (*SegSplitMethod)(Seg seg, Seg segHi, Addr base, Addr mid, Addr limit)`` _`.method.split`: Segment subclasses may extend the support for segment splitting by defining their own "split" method. On entry, ``seg`` is a segment with region ``[base,limit)``, ``segHi`` is uninitialized, ``mid`` is the address at which the segment is to be split. The method is responsible for destructively modifying ``seg`` and initializing ``segHi`` so that on exit ``seg`` is a segment with region ``[base,mid)`` and ``segHi`` is a segment with region ``[mid,limit)``. Usually a method would only directly modify the fields defined for the segment subclass. _`.method.split.next`: A split method should always call the next method, either before or after any class-specific code (see design.mps.protocol.overview.next-method_). .. _design.mps.protocol.overview.next-method: protocol#.overview.next-method _`.method.split.accounting`: If ``seg`` belongs to a generation in a chain, then the pool generation accounting must be updated. In the simple case where the split segments remain in the same generation, this can be done by calling ``PoolGenAccountForSegSplit()``. ``typedef Res (*SegMergeMethod)(Seg seg, Seg segHi, Addr base, Addr mid, Addr limit)`` _`.method.merge`: Segment subclasses may extend the support for segment merging by defining their own ``merge`` method. On entry, ``seg`` is a segment with region ``[base,mid)``, ``segHi`` is a segment with region ``[mid,limit)``, The method is responsible for destructively modifying ``seg`` and finishing ``segHi`` so that on exit ``seg`` is a segment with region ``[base,limit)`` and ``segHi`` is garbage. Usually a method would only modify the fields defined for the segment subclass. _`.method.merge.next`: A merge method should always call the next method, either before or after any class-specific code (see design.mps.protocol.overview.next-method_). .. _design.mps.protocol.overview.next-method: protocol#.overview.next-method _`.method.merge.accounting`: If ``seg`` belongs to a generation in a chain, then the pool generation accounting must be updated. In the simple case where the two segments started in the same generation and the merged segment remains in that generation, this can be done by calling ``PoolGenAccountForSegMerge()``. _`.split-merge.shield`: Split and merge methods may assume that the segments they are manipulating are not in the shield queue. _`.split-merge.shield.flush`: The shield queue is flushed before any split or merge methods are invoked. _`.split-merge.shield.re-flush`: If a split or merge method performs an operation on a segment which might cause the segment to be queued, the method must flush the shield queue before returning or calling another split or merge method. _`.split-merge.fail`: Split and merge methods might fail, in which case segments ``seg`` and ``segHi`` must be equivalently valid and configured at exit as they were according to the entry conditions. It's simplest if the failure can be detected before calling the next method (for example, by allocating any objects early in the method). _`.split-merge.fail.anti`: If it's not possible to detect failure before calling the next method, the appropriate anti-method must be used (see design.mps.protocol.guide.fail.after-next_). Split methods are anti-methods for merge methods, and vice-versa. .. _design.mps.protocol.guide.fail.after-next: protocol#.guide.fail.after-next _`.split-merge.fail.anti.constrain`: In general, care should be taken when writing split and merge methods to ensure that they really are anti-methods for each other. The anti-method must not fail if the initial method succeeded. The anti-method should reverse any side effects of the initial method, except where it's known to be safe to avoid this (see `.split-merge.fail.summary`_ for an example of a safe case). _`.split-merge.fail.anti.no`: If this isn't possible (it might not be) then the methods won't support after-next failure. This fact should be documented, if the methods are intended to support further specialization. Note that using va_arg with the ``args`` parameter is sufficient to make it impossible to reverse all side effects. _`.split-merge.fail.summary`: The segment summary might not be restored exactly after a failed merge operation. Each segment would be left with a summary which is the union of the original summaries (see `.merge.state`_). This increases the conservatism in the summaries, but is otherwise safe. _`.split-merge.unsupported`: Segment classes need not support segment merging at all. The function ``SegClassMixInNoSplitMerge()`` is supplied to set the split and merge methods to unsupporting methods that will report an error in checking varieties. Document History ---------------- - 1997-04-03 RB_ Initial draft (replacing various notes in revisions 0 and 1) was as part of editing MMsrc!seg.c(MMdevel_action2.1). - 1999-04-16 Tony Mann. Rewritten to separate segments and tracts, following `mail.tony.1998-11-02.10-26`_. .. _mail.tony.1998-11-02.10-26: https://info.ravenbrook.com/project/mps/mail/1998/11/02/10-26/0.txt - 2002-06-07 RB_ Converted from MMInfo database design document. .. _RB: https://www.ravenbrook.com/consultants/rb/ .. _GDR: https://www.ravenbrook.com/consultants/gdr/ Copyright and License --------------------- Copyright © 2001–2020 `Ravenbrook Limited `_. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.