www.ravenbrook.com

.. index::
   pair: segments; design

.. _design-seg:


Segment data structure
======================

.. mps:prefix:: design.mps.seg


Introduction
------------

:mps:tag:`intro` This document describes the MPS Segment data structure.


Overview
--------

:mps:tag:`over.segments` Segments are the basic units of tracing and
shielding. The MPM also uses them as units of scanning and colour,
although pool classes may subdivide segments and be able to maintain
colour on a finer grain (down to the object level, for example).

:mps:tag:`over.objects` The mutator's objects are stored in segments.
Segments are contiguous blocks of memory managed by some pool.

:mps:tag:`segments.pool` The arrangement of objects within a segment is
determined by the class of the pool which owns the segment. The pool
is associated with the segment indirectly via the first tract of the
segment.

:mps:tag:`over.memory` The relationship between segments and areas of memory
is maintained by the segment module. Pools acquire tracts from the
arena, and release them back to the arena when they don't need them
any longer. The segment module can associate contiguous tracts owned
by the same pool with a segment. The segment module provides the
methods SegBase, SegLimit, and SegSize which map a segment onto the
addresses of the memory block it represents.

:mps:tag:`over.hierarchy` The Segment datastructure is designed to be
subclassable (see design.mps.protocol). The basic segment class
(:c:type:`Seg`) supports colour and protection for use by the tracer, as
well as support for a pool ring, and all generic segment functions.
Clients may use :c:type:`Seg` directly, but will most probably want to use a
subclass with additional properties.

:mps:tag:`over.hierarchy.gcseg` The segment module provides ``GCSeg`` - a
subclass of :c:type:`Seg` which has full support for GC including buffering
and the ability to be linked onto the grey ring.


Data Structure
--------------

.. c:type:: struct SegStruct *Seg
.. c:type:: struct GCSegStruct *GCSeg

The implementations are as follows::

    typedef struct SegStruct {      /* segment structure */
      Sig sig;                      /* <code/misc.h#sig> */
      SegClass class;               /* segment class structure */
      Tract firstTract;             /* first tract of segment */
      RingStruct poolRing;          /* link in list of segs in pool */
      Addr limit;                   /* limit of segment */
      unsigned depth : ShieldDepthWIDTH; /* see <code/shield.c#def.depth> */
      AccessSet pm : AccessLIMIT;   /* protection mode, <code/shield.c> */
      AccessSet sm : AccessLIMIT;   /* shield mode, <code/shield.c> */
      TraceSet grey : TraceLIMIT;   /* traces for which seg is grey */
      TraceSet white : TraceLIMIT;  /* traces for which seg is white */
      TraceSet nailed : TraceLIMIT; /* traces for which seg has nailed objects */
      RankSet rankSet : RankLIMIT;  /* ranks of references in this seg */
    } SegStruct;

    typedef struct GCSegStruct {    /* GC segment structure */
      SegStruct segStruct;          /* superclass fields must come first */
      RingStruct greyRing;          /* link in list of grey segs */
      RefSet summary;               /* summary of references out of seg */
      Buffer buffer;                /* non-NULL if seg is buffered */
      Sig sig;                      /* design.mps.sig */
    } GCSegStruct;


:mps:tag:`field.rankSet` The ``rankSet`` field represents the set of ranks
of the references in the segment. It is initialized to empty by
:c:func:`SegInit()`.

:mps:tag:`field.rankSet.single` The Tracer only permits one rank per segment
[ref?] so this field is either empty or a singleton.

:mps:tag:`field.rankSet.empty` An empty ``rankSet`` indicates that there are
no references. If there are no references in the segment then it
cannot contain black or grey references.

:mps:tag:`field.rankSet.start` If references are stored in the segment then
it must be updated, along with the summary (:mps:ref:`.field.summary.start`).

:mps:tag:`field.depth` The ``depth`` field is used by the Shield
(impl.c.shield) to manage protection of the segment. It is initialized
to zero by :c:func:`SegInit()`.

:mps:tag:`field.sm` The ``sm`` field is used by the Shield (impl.c.shield)
to manage protection of the segment. It is initialized to
``AccessSetEMPTY`` by :c:func:`SegInit()`.

:mps:tag:`field.pm` The ``pm`` field is used by the Shield (impl.c.shield)
to manage protection of the segment. It is initialized to
``AccessSetEMPTY`` by :c:func:`SegInit()`. The field is used by both the
shield and the ANSI fake protection (impl.c.protan).

:mps:tag:`field.black` The ``black`` field is the set of traces for which
there may be black objects (that is, objects containing references,
but no references to white objects) in the segment. More precisely, if
there is a black object for a trace in the segment then that trace
will appear in the ``black`` field. It is initialized to
``TraceSetEMPTY`` by :c:func:`SegInit()`.

:mps:tag:`field.grey` The ``grey`` field is the set of traces for which
there may be grey objects (i.e containing references to white objects)
in the segment. More precisely, if there is a reference to a white
object for a trace in the segment then that trace will appear in the
``grey`` field. It is initialized to ``TraceSetEMPTY`` by :c:func:`SegInit()`.

:mps:tag:`field.white` The ``white`` field is the set of traces for which
there may be white objects in the segment. More precisely, if there is
a white object for a trace in the segment then that trace will appear
in the ``white`` field. It is initialized to ``TraceSetEMPTY`` by
:c:func:`SegInit()`.

:mps:tag:`field.summary` The ``summary`` field is an approximation to the
set of all references in the segment. If there is a reference ``R`` in
the segment, then ``RefSetIsMember(summary, R)`` is :c:macro:`TRUE`. The
summary is initialized to ``RefSetEMPTY`` by :c:func:`SegInit()`.

:mps:tag:`field.summary.start` If references are stored in the segment then
it must be updated, along with ``rankSet`` (:mps:ref:`.field.rankSet.start`).

:mps:tag:`field.buffer` The ``buffer`` field is either :c:macro:`NULL`, or points
to the descriptor structure of the buffer which is currently
allocating in the segment. The field is initialized to :c:macro:`NULL` by
:c:func:`SegInit()`.

:mps:tag:`field.buffer.owner` This buffer must belong to the same pool as
the segment, because only that pool has the right to attach it.


Interface
---------

Splitting and merging
.....................

:mps:tag:`split-and-merge` There is support for splitting and merging
segments, to give pools the flexibility to rearrange their tracts
among segments as they see fit.

.. c:function:: Res SegSplit(Seg *segLoReturn, Seg *segHiReturn, Seg seg, Addr at, Bool withReservoirPermit, ...)

:mps:tag:`split` If successful, segment ``seg`` is split at address ``at``,
yielding two segments which are returned in segLoReturn and
segHiReturn for the low and high segments respectively. The base of
the low segment is the old base of ``seg``. The limit of the low
segment is ``at``. The base of the high segment is ``at``. This limit
of the high segment is the old limit of ``seg``. ``seg`` is
effectively destroyed during this operation (actually, it might be
reused as one of the returned segments). Segment subclasses may make
use of the optional arguments; the built-in classes do not.

:mps:tag:`split.invariants` The client must ensure some invariants are met
before calling :c:func:`SegSplit()`:

- :mps:tag:`split.inv.align` ``at`` must be a multiple of the arena grain
  size, and lie between the base and limit of ``seg``. Justification:
  the split segments cannot be represented if this is not so.

- :mps:tag:`split.inv.buffer` If ``seg`` is attached to a buffer, the
  buffered region must not include address ``at``. Justification: the
  segment module is not in a position to know how (or whether) a pool
  might wish to split a buffer. This permits the buffer to remain
  attached to just one of the returned segments.

:mps:tag:`split.state` Except as noted above, the segments returned have the
same properties as ``seg``. That is, their colour, summary, rankset,
nailedness etc. are set to the values of ``seg``.

.. c:function:: Res SegMerge(Seg *mergedSegReturn, Seg segLo, Seg segHi, Bool withReservoirPermit, ...)

:mps:tag:`merge` If successful, segments ``segLo`` and ``segHi`` are merged
together, yielding a segment which is returned in mergedSegReturn.
``segLo`` and ``segHi`` are effectively destroyed during this
operation (actually, one of them might be reused as the merged
segment). Segment subclasses may make use of the optional arguments;
the built-in classes do not.

:mps:tag:`merge.invariants` The client must ensure some invariants are met
before calling :c:func:`SegMerge()`:

- :mps:tag:`merge.inv.abut` The limit of ``segLo`` must be the same as the
  base of ``segHi``. Justification: the merged segment cannot be
  represented if this is not so.

- :mps:tag:`merge.inv.buffer` One or other of ``segLo`` and ``segHi`` may
  attached to a buffer, but not both. Justification: the segment
  module does not support attachment of a single seg to 2 buffers.

- :mps:tag:`merge.inv.similar` ``segLo`` and ``segHi`` must be sufficiently
  similar. Two segments are sufficiently similar if they have
  identical values for each of the following fields: ``class``,
  ``sm``, ``grey``, ``white``, ``nailed``, ``rankSet``. Justification:
  there is no single choice of behaviour for cases where these fields
  are not identical. The pool class must make it's own choices about
  this if it wishes to permit more flexible merging. If so, it should
  be a simple matter for the pool to arrange for the segments to look
  sufficiently similar before calling :c:func:`SegMerge()`.

:mps:tag:`merge.state` The merged segment will share the same state as
``segLo`` and ``segHi`` for those fields which are identical (see
:mps:ref:`.merge.inv.similar`). The summary will be the union of the summaries
of ``segLo`` and ``segHi``.


Extensibility
-------------

Splitting and merging
.....................

.. c:type:: Res (*SegSplitMethod)(Seg seg, Seg segHi, Addr base, Addr mid, Addr limit, Bool withReservoirPermit)

:mps:tag:`method.split` Segment subclasses may extend the support for
segment splitting by defining their own "split" method. On entry,
``seg`` is a segment with region ``[base,limit)``, ``segHi`` is
uninitialized, ``mid`` is the address at which the segment is to be
split. The method is responsible for destructively modifying ``seg``
and initializing ``segHi`` so that on exit ``seg`` is a segment with
region ``[base,mid)`` and ``segHi`` is a segment with region
``[mid,limit)``. Usually a method would only directly modify the
fields defined for the segment subclass. This might involve
allocation, which may use the reservoir if ``withReservoirPermit`` is
:c:macro:`TRUE`.

:mps:tag:`method.split.next` A split method should always call the next
method, either before or after any class-specific code (see
design.mps.protocol.overview.next-method).

.. c:type:: Res (*SegMergeMethod)(Seg seg, Seg segHi, Addr base, Addr mid, Addr limit, Bool withReservoirPermit)

:mps:tag:`method.merge` Segment subclasses may extend the support for
segment merging by defining their own ``merge`` method. On entry,
``seg`` is a segment with region ``[base,mid)``, ``segHi`` is a
segment with region ``[mid,limit)``, The method is responsible for
destructively modifying ``seg`` and finishing ``segHi`` so that on
exit ``seg`` is a segment with region ``[base,limit)`` and ``segHi``
is garbage. Usually a method would only modify the fields defined for
the segment subclass. This might involve allocation, which may use the
reservoir if ``withReservoirPermit`` is :c:macro:`TRUE`.

:mps:tag:`method.merge.next` A merge method should always call the next
method, either before or after any class-specific code (see
design.mps.protocol.overview.next-method).

:mps:tag:`split-merge.shield` Split and merge methods may assume that the
segments they are manipulating are not in the shield cache.

:mps:tag:`split-merge.shield.flush` The shield cache is flushed before any
split or merge methods are invoked.

:mps:tag:`split-merge.shield.re-flush` If a split or merge method performs
an operation on a segment which might cause the segment to be cached,
the method must flush the shield cache before returning or calling
another split or merge method.

:mps:tag:`split-merge.fail` Split and merge methods might fail, in which
case segments ``seg`` and ``segHi`` must be equivalently valid and
configured at exit as they were according to the entry conditions.
It's simplest if the failure can be detected before calling the next
method (for example, by allocating any objects early in the method).

:mps:tag:`split-merge.fail.anti` If it's not possible to detect failure
before calling the next method, the appropriate anti-method must be
used (see design.mps.protocol.guide.fail.after-next). Split methods
are anti-methods for merge methods, and vice-versa.

:mps:tag:`split-merge.fail.anti.constrain` In general, care should be taken
when writing split and merge methods to ensure that they really are
anti-methods for each other. The anti-method must not fail if the
initial method succeeded. The anti-method should reverse any side
effects of the initial method, except where it's known to be safe to
avoid this (see :mps:ref:`.split-merge.fail.summary` for an example of a safe
case).

:mps:tag:`split-merge.fail.anti.no` If this isn't possible (it might not be)
then the methods won't support after-next failure. This fact should be
documented, if the methods are intended to support further
specialization. Note that using va_arg with the ``args`` parameter is
sufficient to make it impossible to reverse all side effects.

:mps:tag:`split-merge.fail.summary` The segment summary might not be
restored exactly after a failed merge operation. Each segment would be
left with a summary which is the union of the original summaries (see
:mps:ref:`.merge.state`). This increases the conservatism in the summaries,
but is otherwise safe.

:mps:tag:`split-merge.unsupported` Segment classes need not support segment
merging at all. The function :c:func:`SegClassMixInNoSplitMerge()` is supplied
to set the split and merge methods to unsupporting methods that will
report an error in checking varieties.