1. Arena
1.2. History
.hist.0: Version 0 is a different document.
.hist.1: First draft written by Pekka P. Pirinen, 1997-08-11, based on design.mps.space(0) and mail.richard.1997-04-25.11-52(0).
.hist.2: Updated for separation of tracts and segments. Tony Mann, 1999-04-16.
.hist.3: Converted from MMInfo database design document. Richard Brooksby, 2002-06-07.
.hist.4: Converted to reStructuredText. Gareth Rees, 2013-03-11.
1.3. Overview
.overview: The arena serves two purposes. It is a structure that is the top-level state of the MPS, and as such contains a lot of fields which are considered “global”. And it provides raw memory to pools.
An arena belongs to a particular arena class. The class is selected when the arena is created. Classes encapsulate both policy (such as how pool placement preferences map into actual placement) and mechanism (such as where the memory originates: operating system virtual memory, client provided, or via malloc). Some behaviour (mostly serving the “top-level datastructure” purpose) is implemented by generic arena code, and some by arena class code.
1.4. Definitions
.def.tract: Pools request memory from the arena by calling ArenaAlloc(). This returns a block comprising a contiguous sequence of “tracts”. A tract has a specific size (also known as the “arena alignment”, which typically corresponds to the operating system page size) and all tracts are aligned to that size. “Tract” is also used for the data structure used to manage tracts.
1.5. Requirements
[copied from design.mps.arena.vm(1) and edited slightly – drj 1999-06-23]
[Where do these come from? Need to identify and document the sources of requirements so that they are traceable to client requirements. Most of these come from the architectural design (design.mps.architecture) or the fix function design (design.mps.fix). – richard 1995-08-28]
1.5.1. Block management
.req.fun.block.alloc: The arena must provide allocation of contiguous blocks of memory.
.req.fun.block.free: It must also provide freeing of contiguously allocated blocks owned by a pool - whether or not the block was allocated via a single request.
.req.attr.block.size.min: The arena must support management of blocks down to the size of the grain (page) provided by the virtual mapping interface if a virtual memory interface is being used, or a comparable size otherwise.
.req.attr.block.size.max: It must also support management of blocks up to the maximum size allowed by the combination of operating system and architecture. This is derived from req.dylan.attr.obj.max (at least).
.req.attr.block.align.min: The alignment of blocks shall not be less than MPS_PF_ALIGN for the architecture. This is so that pool classes can conveniently guarantee pool allocated blocks are aligned to MPS_PF_ALIGN. (A trivial requirement.)
.req.attr.block.grain.max: The granularity of allocation shall not be more than the grain size provided by the virtual mapping interface.
1.5.2. Address translation
.req.fun.trans: The arena must provide a translation from any address to either an indication that the address is not in any tract (if that is so) or the following data associated with the tract containing that address:
.req.fun.trans.pool: The pool that allocated the tract.
.req.fun.trans.arbitrary: An arbitrary pointer value that the pool can associate with the tract at any time.
.req.fun.trans.white: The tracer whiteness information. That is, a bit for each active trace that indicates whether this tract is white (contains white objects). This is required so that the “fix” protocol can run very quickly.
.req.attr.trans.time: The translation shall take no more than @@@@ [something not very large – drj 1999-06-23]
1.5.3. Iteration protocol
.req.iter: er, there’s a tract iteration protocol which is presumably required for some reason?
1.5.4. Arena partition
.req.fun.set: The arena must provide a method for approximating sets of addresses.
.req.fun.set.time: The determination of membership shall take no more than @@@@ [something very small indeed]. (the non-obvious solution is refsets)
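As a rough illustration of how a refset can approximate a set of addresses with a constant-time membership test, the sketch below keeps one bit per "zone". It assumes that the address space is striped into zones of 2^zoneShift bytes and that zone numbers wrap around at the width of the set; the names AddrZone(), RefSetAddAddr() and RefSetMayContain() are hypothetical and not taken from the MPS sources.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

typedef uintptr_t Addr;
typedef uintptr_t RefSet;                 /* one bit per zone */

#define REFSET_WIDTH ((unsigned)(sizeof(RefSet) * CHAR_BIT))
#define RefSetEMPTY  ((RefSet)0)

/* Zone of an address: zones are 2^zoneShift bytes wide and zone
 * numbers wrap around at REFSET_WIDTH (an assumption of this sketch). */
unsigned AddrZone(Addr addr, unsigned zoneShift)
{
    assert(zoneShift < REFSET_WIDTH);
    return (unsigned)((addr >> zoneShift) % REFSET_WIDTH);
}

/* Add the zone containing addr to the set. */
RefSet RefSetAddAddr(RefSet rs, Addr addr, unsigned zoneShift)
{
    return rs | ((RefSet)1 << AddrZone(addr, zoneShift));
}

/* Approximate membership: never a false negative, but any address
 * in the same zone (modulo REFSET_WIDTH) is a false positive. */
int RefSetMayContain(RefSet rs, Addr addr, unsigned zoneShift)
{
    return (int)((rs >> AddrZone(addr, zoneShift)) & 1u);
}
```

The test is a shift, a modulo and a mask, which is the kind of cost the requirement seems to have in mind; the price is that the answer is only "possibly in the set" or "definitely not".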
1.5.5. Constraints
.req.attr.space.overhead: req.dylan.attr.space.struct implies that the arena must limit the space overhead. The arena is not the only part that introduces an overhead (pool classes being the next most obvious), so multiple parts must cooperate in order to meet the ultimate requirements.
.req.attr.time.overhead: Time overhead constraint? [how can there be a time “overhead” on a necessary component? drj 1999-06-23]
1.6. Architecture
1.6.1. Statics
.static: There is no higher-level data structure than an arena, so in order to support several arenas, we have to have some static data in impl.c.arena. See impl.c.arena.static.
.static.init: All the static data items are initialized when the first arena is created.
.static.serial: arenaSerial is a static Serial, containing the serial number of the next arena to be created. The serial of any existing arena is less than this.
.static.ring: arenaRing is the sentinel of the ring of arenas.
.static.ring.init: arenaRingInit is a Bool showing whether the ring of arenas has been initialized.
.static.ring.lock: The ring of arenas has to be locked when traversing the ring, to prevent arenas being added or removed. This is achieved by using the (non-recursive) global lock facility, provided by the lock module.
.static.check: The statics are checked each time any arena is checked.
1.6.2. Arena classes
.class: The Arena data structure is designed to be subclassable (see design.mps.protocol(0)). Clients can select what arena class they’d like when instantiating one with mps_arena_create(). The arguments to mps_arena_create() are class dependent.
.class.init: However, the generic ArenaInit() is called from the class-specific method, rather than vice versa, because the method is responsible for allocating the memory for the arena descriptor and the arena lock in the first place. Likewise, ArenaFinish() is called from the finish method.
.class.fields: The alignment (for tract allocations) and zoneShift (for computing zone sizes and what zone an address is in) fields in the arena are the responsibility of each class, and are initialized by the init() method. The responsibility for maintaining the commitLimit, spareCommitted, and spareCommitLimit fields is shared between the (generic) arena and the arena class. commitLimit (see .commit-limit) is changed by the generic arena code, but arena classes are responsible for ensuring the semantics. For spareCommitted and spareCommitLimit see .spare-committed below.
.class.abstract: The basic arena class (AbstractArenaClass) is abstract and must not be instantiated. It provides little useful behaviour, and exists primarily as the root of the tree of arena classes. Each concrete class must specialize each of the class method fields, with the exception of the describe method (which has a trivial implementation) and the extend(), retract() and spareCommitExceeded() methods, which default to non-callable methods for the benefit of arena classes which don’t implement these features.
.class.abstract.null: The abstract class does not provide dummy implementations of those methods which must be overridden. Instead each abstract method is initialized to NULL.
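The class protocol described above might be sketched as follows. This is a simplified model, not the MPS implementation: the structures, the signatures, the "DEMO" class and the ResUNIMPL stub are assumptions made for illustration. It shows required methods left as NULL in the abstract class, an optional method defaulting to a stub, and the class-specific init method calling the generic ArenaInit() rather than the other way round.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ResOK = 0, ResUNIMPL } Res;   /* hypothetical subset */

typedef struct ArenaStruct *Arena;

typedef struct ArenaClassStruct {
    const char *name;
    Res (*init)(Arena arena);      /* abstract: must be overridden */
    void (*finish)(Arena arena);   /* abstract: must be overridden */
    Res (*extend)(Arena arena);    /* optional: defaults to a stub */
} ArenaClassStruct, *ArenaClass;

struct ArenaStruct {
    ArenaClass klass;
    int genericReady;              /* set by the generic ArenaInit() */
};

/* Default for an optional method.  The design calls these
 * "non-callable"; this stub returns an error code instead so that
 * the sketch can be exercised. */
Res ArenaNoExtend(Arena arena)
{
    (void)arena;
    return ResUNIMPL;
}

/* The abstract class: required methods are NULL, not dummies. */
void AbstractArenaClassInit(ArenaClass klass)
{
    klass->name = "ABSTRACT";
    klass->init = NULL;
    klass->finish = NULL;
    klass->extend = ArenaNoExtend;
}

int ArenaClassIsConcrete(ArenaClass klass)
{
    return klass->init != NULL && klass->finish != NULL;
}

/* Generic initialization, called from the class-specific method. */
Res ArenaInit(Arena arena, ArenaClass klass)
{
    assert(ArenaClassIsConcrete(klass));
    arena->klass = klass;
    arena->genericReady = 1;
    return ResOK;
}

void ArenaFinish(Arena arena)
{
    arena->genericReady = 0;
}

static ArenaClassStruct demoClass;  /* hypothetical concrete class */

Res DemoArenaInit(Arena arena)
{
    /* A real class would first obtain memory for the arena
     * descriptor and lock; only then can generic code run. */
    return ArenaInit(arena, &demoClass);
}

void DemoArenaFinish(Arena arena)
{
    ArenaFinish(arena);
    /* A real class would release the descriptor's memory here. */
}

ArenaClass DemoArenaClassGet(void)
{
    AbstractArenaClassInit(&demoClass);  /* inherit the defaults */
    demoClass.name = "DEMO";
    demoClass.init = DemoArenaInit;
    demoClass.finish = DemoArenaFinish;
    return &demoClass;
}
```

Leaving the required methods as NULL means that a class which forgets to override one is caught by a simple check, instead of silently inheriting a do-nothing implementation.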
1.6.3. Tracts
.tract: The arena allocation function ArenaAlloc() allocates a block of memory to pools, of a size which is aligned to the arena alignment. Each alignment unit (grain) of allocation is represented by a tract. Tracts are the hook on which the segment module is implemented. Pools which don’t use segments may use tracts for associating their own data with each allocation grain.
.tract.structure: The tract structure definition looks like this:
    typedef struct TractStruct { /* Tract structure */
      Pool pool;                  /* MUST BE FIRST (design.mps.arena.tract.field.pool) */
      void *p;                    /* pointer for use of owning pool */
      Addr base;                  /* Base address of the tract */
      TraceSet white : TRACE_MAX; /* traces for which tract is white */
      unsigned int hasSeg : 1;    /* does tract have a seg in p? */
    } TractStruct;
.tract.field.pool: The pool field indicates to which pool the tract has been allocated (.req.fun.trans.pool). Tracts are only valid when they are allocated to pools. When tracts are not allocated to pools, arena classes are free to reuse tract objects in undefined ways. A standard technique is for arena class implementations to internally describe the objects as a union type of TractStruct and some private representation, and to set the pool field to NULL when the tract is not allocated. The pool field must come first so that the private representation can share a common prefix with TractStruct. This permits arena classes to determine from their private representation whether such an object is allocated or not, without requiring an extra field.
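The union technique can be sketched like this. FreePageStruct, PageUnion and the Demo* functions are hypothetical names for an arena class's private representation, and the bit-fields of TractStruct are omitted for brevity. Because both structures begin with a member of the same type, C's common initial sequence rule for unions lets the code read the pool member through either view.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Addr;
typedef struct PoolStruct { int unused; } PoolStruct, *Pool;

typedef struct TractStruct {
    Pool pool;               /* MUST BE FIRST */
    void *p;                 /* pointer for use of owning pool */
    Addr base;               /* base address of the tract */
} TractStruct;

union PageUnion;

/* Hypothetical private representation of an unallocated page. */
typedef struct FreePageStruct {
    Pool pool;               /* always NULL; shares TractStruct's prefix */
    union PageUnion *next;   /* hypothetical free-list link */
} FreePageStruct;

typedef union PageUnion {
    TractStruct tract;       /* valid when allocated to a pool */
    FreePageStruct free;     /* valid otherwise */
} PageUnion;

/* Works whichever member was last written, thanks to the shared
 * initial pool member: no extra "allocated" flag is needed. */
int DemoPageIsAllocated(const PageUnion *page)
{
    return page->free.pool != NULL;
}

void DemoPageAlloc(PageUnion *page, Pool pool, Addr base)
{
    assert(pool != NULL);
    page->tract.pool = pool;
    page->tract.p = NULL;
    page->tract.base = base;
}

void DemoPageFree(PageUnion *page, PageUnion *next)
{
    page->free.pool = NULL;  /* marks the object as not a valid tract */
    page->free.next = next;
}
```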
.tract.field.p: The p field is used by pools to associate tracts with other data (.req.fun.trans.arbitrary). It’s used by the segment module to indicate which segment a tract belongs to. If a pool doesn’t use segments it may use the p field for its own purposes. This field has the non-specific type (void *) so that pools can use it for any purpose.
.tract.field.hasSeg: The hasSeg bit-field is a Boolean which indicates whether the p field is being used by the segment module. If this field is TRUE, then the value of p is a Seg. hasSeg is typed as an unsigned int, rather than a Bool. This ensures that there won’t be sign conversion problems when converting the bit-field value.
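A small sketch of why the unsigned type matters, assuming (as is common in C code bases, but not stated here) that Bool is a typedef for int. A signed one-bit field can only hold the values 0 and -1 on a two's complement machine, so a stored TRUE would not read back as 1; an unsigned one-bit field holds exactly 0 and 1. The Demo* names are hypothetical.

```c
#include <assert.h>

typedef int Bool;            /* assumption: Bool is a plain int */

typedef struct DemoTractStruct {
    unsigned int hasSeg : 1; /* holds exactly the values 0 and 1 */
} DemoTractStruct;

void DemoTractSetHasSeg(DemoTractStruct *tract, Bool hasSeg)
{
    assert(hasSeg == 0 || hasSeg == 1);
    tract->hasSeg = (unsigned int)hasSeg;
}

/* Converting the unsigned bit-field back to Bool yields 0 or 1,
 * so comparisons against TRUE (1) behave as expected. */
Bool DemoTractHasSeg(const DemoTractStruct *tract)
{
    return (Bool)tract->hasSeg;
}
```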
.tract.field.base: The base field contains the base address of the memory represented by the tract.
.tract.field.white: The white bit-field indicates for which traces the tract is white (.req.fun.trans.white). This information is also stored in the segment, but is duplicated here for efficiency during a call to TraceFix() (see design.mps.trace.fix).
.tract.limit: The limit of the tract’s memory may be determined by adding the arena alignment to the base address.
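The address arithmetic implied by fixed-size, aligned tracts can be sketched as below. The helper names are hypothetical, and the sketch assumes the arena alignment is a power of two (which holds when it is the operating system page size).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Addr;
typedef size_t Size;

static int DemoIsPowerOfTwo(Size n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Base of the tract containing addr: round down to the alignment. */
Addr DemoTractBaseOfAddr(Addr addr, Size alignment)
{
    assert(DemoIsPowerOfTwo(alignment));
    return addr & ~((Addr)alignment - 1);
}

/* Limit of a tract: its base plus the arena alignment. */
Addr DemoTractLimit(Addr base, Size alignment)
{
    return base + alignment;
}

/* Round a request up to a whole number of grains, as a pool would
 * have to do before asking the arena for a block. */
Size DemoSizeAlignUp(Size size, Size alignment)
{
    assert(DemoIsPowerOfTwo(alignment));
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Number of tracts representing a block of the given size. */
Size DemoBlockTractCount(Size size, Size alignment)
{
    return DemoSizeAlignUp(size, alignment) / alignment;
}
```

For example, with a 4096-byte alignment the address 0x12345 lies in the tract with base 0x12000 and limit 0x13000, and a 10000-byte request occupies three tracts.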
.tract.iteration: Iteration over tracts is described in design.mps.arena.tract-iter(0).
.tract.if.tractofaddr: The function TractOfAddr() finds the tract corresponding to an address in memory. (See .req.fun.trans):
    Bool TractOfAddr(Tract *tractReturn, Arena arena, Addr addr);
If addr is an address which has been allocated to some pool, then TractOfAddr() returns TRUE, and sets *tractReturn to the tract corresponding to that address. Otherwise, it returns FALSE. This function is similar to TractOfBaseAddr() (see design.mps.arena.tract-iter.if.contig-base) but serves a more general purpose and is less efficient.
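To make the contract concrete, here is a toy model of TractOfAddr() over an arena that manages a single contiguous range with a fixed table of tracts. The arena layout (base, alignment, an eight-entry table) is an assumption of the sketch; real arena classes find the tract through their own class-specific structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef TRUE
#define TRUE 1
#define FALSE 0
#endif

typedef int Bool;
typedef uintptr_t Addr;
typedef struct PoolStruct { int unused; } PoolStruct, *Pool;

typedef struct TractStruct {
    Pool pool;               /* NULL when not allocated to a pool */
    void *p;
    Addr base;
} TractStruct, *Tract;

#define DEMO_TRACTS 8        /* hypothetical fixed arena size */

typedef struct ArenaStruct {
    Addr base;               /* base of the managed address range */
    size_t alignment;        /* arena alignment (tract size) */
    TractStruct tracts[DEMO_TRACTS];
} ArenaStruct, *Arena;

Bool TractOfAddr(Tract *tractReturn, Arena arena, Addr addr)
{
    size_t i;

    assert(tractReturn != NULL);
    if (addr < arena->base)
        return FALSE;        /* below the managed range */
    i = (size_t)((addr - arena->base) / arena->alignment);
    if (i >= DEMO_TRACTS)
        return FALSE;        /* above the managed range */
    if (arena->tracts[i].pool == NULL)
        return FALSE;        /* not allocated to any pool */
    *tractReturn = &arena->tracts[i];
    return TRUE;
}
```

Note that *tractReturn is only written on success, so callers must check the Boolean result before using it.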
.tract.if.TRACT_OF_ADDR: TRACT_OF_ADDR() is a macro version of TractOfAddr(). It’s provided for efficiency during a call to TraceFix() (see design.mps.trace.fix.tractofaddr).
1.6.4. Control pool
.pool: Each arena has a “control pool”, arena->controlPoolStruct, which is used for allocating MPS control data structures by calling ControlAlloc().
1.6.5. Polling
.poll: ArenaPoll() is called “often” by other code (for instance, on buffer fill or allocation). It is the entry point for doing tracing work. If the polling clock exceeds a set threshold, and we’re not already doing some tracing work (that is, insidePoll is not set), it calls TracePoll() on all busy traces.
.poll.size: The actual clock is arena->fillMutatorSize. This is because internal allocation is only significant when copy segments are being allocated, and we don’t want the pause times to shrink because of that. There is no current requirement for the trace rate to guard against running out of memory. [Clearly it really ought to: we have a requirement to not run out of memory (see req.dylan.prot.fail-alloc, req.dylan.prot.consult), and emergency tracing should not be our only story. drj 1999-06-22] BufferEmpty is not taken into account, because the splinter will rarely be usable for allocation and we are wary of the clock running backward.
.poll.clamp: Polling is disabled when the arena is “clamped”, in which case arena->clamped is TRUE. Clamping the arena prevents background tracing work, and further new garbage collections from starting. Clamping and releasing are implemented by the ArenaClamp() and ArenaRelease() methods.
.poll.park: The arena is “parked” by clamping it, then polling until there are no active traces. This finishes all the active collections and prevents further collection. Parking is implemented by the ArenaPark() method.
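The interplay of the polling clock, the threshold, insidePoll and clamped might be modelled as below. Everything here is a simplification for illustration: the Demo* names, the value of ARENA_POLL_MAX, the representation of busy traces as a simple count, and the assumption that each unit of tracing work finishes one trace. Since ordinary polling is disabled while the arena is clamped, the park function in this sketch drives the traces directly.

```c
#include <assert.h>

#define ARENA_POLL_MAX 65536.0   /* hypothetical value */

typedef struct DemoArena {
    double fillMutatorSize;      /* the polling clock */
    double pollThreshold;        /* clock value at which to poll next */
    int insidePoll;              /* already doing tracing work? */
    int clamped;                 /* polling disabled? */
    unsigned busyTraces;         /* simplified: count of active traces */
    unsigned traceSteps;         /* work done so far, for the demo */
} DemoArena;

/* Stand-in for TracePoll(): here each call finishes one trace. */
void DemoTracePoll(DemoArena *arena)
{
    assert(arena->busyTraces > 0);
    --arena->busyTraces;
    ++arena->traceSteps;
}

void DemoArenaPoll(DemoArena *arena)
{
    if (arena->clamped || arena->insidePoll)
        return;
    if (arena->fillMutatorSize < arena->pollThreshold)
        return;                  /* clock has not reached the threshold */
    arena->insidePoll = 1;
    if (arena->busyTraces > 0)
        DemoTracePoll(arena);
    /* Set the threshold for the next poll at the end of this one. */
    arena->pollThreshold = arena->fillMutatorSize + ARENA_POLL_MAX;
    arena->insidePoll = 0;
}

void DemoArenaClamp(DemoArena *arena)   { arena->clamped = 1; }
void DemoArenaRelease(DemoArena *arena) { arena->clamped = 0; }

/* Park: clamp, then work until no traces remain active. */
void DemoArenaPark(DemoArena *arena)
{
    DemoArenaClamp(arena);
    while (arena->busyTraces > 0)
        DemoTracePoll(arena);
}
```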
1.6.6. Commit limit
.commit-limit: The arena supports a client configurable “commit limit” which is a limit on the total amount of committed memory. The generic arena structure contains a field to hold the value of the commit limit and the implementation provides two functions for manipulating it: ArenaCommitLimit() to read it, and ArenaSetCommitLimit() to set it. Actually abiding by the contract of not committing more memory than the commit limit is left up to the individual arena classes.
.commit-limit.err: When allocation from the arena would otherwise succeed but cause the MPS to use more committed memory than specified by the commit limit, ArenaAlloc() should refuse the request and return ResCOMMIT_LIMIT.
.commit-limit.err.multi: In the case where an ArenaAlloc() request cannot be fulfilled for more than one reason, including exceeding the commit limit, class implementations should strive to return a result code other than ResCOMMIT_LIMIT. That is, ResCOMMIT_LIMIT should only be returned if the only reason for failing the ArenaAlloc() request is that the commit limit would be exceeded. The client documentation, however, allows implementations to be ambiguous with respect to which result code is returned in such a situation.
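One way a class could honour this ordering is to test every other reason for failure before the commit limit, as in the sketch below. The DemoArena structure, its "reserved" field (standing for the address space the class has available) and the DemoArenaAlloc() function are hypothetical; only the result-code policy is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of result codes for this sketch. */
typedef enum { ResOK = 0, ResRESOURCE, ResCOMMIT_LIMIT } Res;

typedef struct DemoArena {
    size_t committed;    /* bytes currently committed */
    size_t commitLimit;  /* client-configured limit on committed */
    size_t reserved;     /* bytes the class could commit at all */
} DemoArena;

Res DemoArenaAlloc(DemoArena *arena, size_t size)
{
    assert(arena->committed <= arena->reserved);

    /* Check other reasons for failure first, so that they take
     * precedence over the commit limit. */
    if (size > arena->reserved - arena->committed)
        return ResRESOURCE;

    /* Only now can exceeding the limit be the sole reason. */
    if (arena->committed > arena->commitLimit
        || size > arena->commitLimit - arena->committed)
        return ResCOMMIT_LIMIT;

    arena->committed += size;
    return ResOK;
}
```

The comparisons are written as subtractions on the right-hand side so that a very large request cannot overflow the sum and slip past the checks.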
1.6.7. Spare committed (aka “hysteresis”)
.spare-committed: See mps_arena_spare_committed(). The generic arena structure contains two fields for the spare committed memory fund: spareCommitted records the total number of spare committed bytes; spareCommitLimit records the limit (set by the user) on the amount of spare committed memory. spareCommitted is modified by the arena class but its value is used by the generic arena code. There are two uses: by the getter function provided through the MPS interface (mps_arena_spare_committed()), and by the ArenaSetSpareCommitLimit() function to determine whether the amount of spare committed memory needs to be reduced. spareCommitLimit is manipulated by generic arena code; however, the associated semantics are the responsibility of the class. It is the class’s responsibility to ensure that it doesn’t use more spare committed bytes than the value in spareCommitLimit.
.spare-commit-limit: The function ArenaSetSpareCommitLimit() sets the spareCommitLimit field. If the limit is set to a value lower than the amount of spare committed memory (stored in spareCommitted) then the class-specific function spareCommitExceeded is called.
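The division of labour between generic code and the class can be sketched as follows. The structures and signatures are simplified assumptions, and the demo class method simply discards spare memory down to the limit; a real class would return the memory to the operating system.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ArenaStruct *Arena;

typedef struct ArenaClassStruct {
    void (*spareCommitExceeded)(Arena arena);  /* class-specific */
} ArenaClassStruct, *ArenaClass;

struct ArenaStruct {
    ArenaClass klass;
    size_t spareCommitted;    /* maintained by the class */
    size_t spareCommitLimit;  /* set by generic code */
};

/* Generic code: store the limit, and hand over to the class only
 * if the spare fund now exceeds it. */
void ArenaSetSpareCommitLimit(Arena arena, size_t limit)
{
    assert(arena->klass->spareCommitExceeded != NULL);
    arena->spareCommitLimit = limit;
    if (arena->spareCommitLimit < arena->spareCommitted)
        arena->klass->spareCommitExceeded(arena);
}

/* Hypothetical class method: shrink the spare fund to the limit. */
void DemoSpareCommitExceeded(Arena arena)
{
    arena->spareCommitted = arena->spareCommitLimit;
}
```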
1.6.8. Locks
.lock.ring: ArenaAccess() is called when we fault on a barrier. The first thing it does is claim the non-recursive global lock to protect the arena ring (see design.mps.lock(0)).
.lock.arena: After the arena ring lock is claimed, ArenaEnter() is called on one or more arenas. This claims the lock for that arena. When the correct arena is identified or we run out of arenas, the lock on the ring is released.
.lock.avoid: Deadlocking is avoided as described below:
.lock.avoid.mps: Firstly we require the MPS not to fault (that is, when any of these locks are held by a thread, that thread does not fault).
.lock.avoid.thread: Secondly, we require that in a multi-threaded system, memory fault handlers do not suspend threads (although the faulting thread will, of course, wait for the fault handler to finish).
.lock.avoid.conflict: Thirdly, we avoid conflicting deadlock between the arena and global locks by ensuring we never claim the arena lock when the recursive global lock is already held, and we never claim the binary global lock when the arena lock is held.
1.7. Implementation
1.7.1. Tract cache
.tract.cache: When tracts are allocated to pools by ArenaAlloc(), the first tract of the block and its base address are cached in arena fields lastTract and lastTractBase. The function TractOfBaseAddr() (see design.mps.arena.tract-iter.if.block-base(0)) checks against these cached values and only calls the class method on a cache miss. This optimizes for the common case where a pool allocates a block and then iterates over all its tracts (for example, to attach them to a segment).
.tract.uncache: When blocks of memory are freed by pools, ArenaFree() checks to see if the cached value for the most recently allocated tract (see .tract.cache) is being freed. If so, the cache is invalid, and must be reset. The lastTract and lastTractBase fields are set to NULL.
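A minimal model of the one-entry tract cache follows. The arena layout, the lookup counter, the TractOfBaseAddr() signature and the Demo* functions are assumptions made so that the cache behaviour can be observed; they stand in for the bookkeeping done inside ArenaAlloc(), ArenaFree() and the class method.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Addr;
typedef struct TractStruct { Addr base; } TractStruct, *Tract;

#define DEMO_TRACTS 4            /* hypothetical fixed arena size */

typedef struct ArenaStruct {
    Tract lastTract;             /* first tract of last allocated block */
    Addr lastTractBase;          /* its base address */
    Addr base;                   /* base of the managed range */
    size_t alignment;
    unsigned classLookups;       /* counts cache misses, for the demo */
    TractStruct tracts[DEMO_TRACTS];
} ArenaStruct, *Arena;

/* Stand-in for the (slower) class method. */
Tract DemoClassTractOfBaseAddr(Arena arena, Addr addr)
{
    size_t i = (size_t)((addr - arena->base) / arena->alignment);
    assert(i < DEMO_TRACTS);
    ++arena->classLookups;
    return &arena->tracts[i];
}

Tract TractOfBaseAddr(Arena arena, Addr addr)
{
    if (arena->lastTract != NULL && addr == arena->lastTractBase)
        return arena->lastTract;                 /* cache hit */
    return DemoClassTractOfBaseAddr(arena, addr); /* cache miss */
}

/* What the allocation path does: remember the block's first tract. */
void DemoNoteAlloc(Arena arena, Tract first)
{
    arena->lastTract = first;
    arena->lastTractBase = first->base;
}

/* What the free path does: reset the cache if it is being freed. */
void DemoNoteFree(Arena arena, Addr base, size_t size)
{
    if (arena->lastTract != NULL
        && arena->lastTractBase >= base
        && arena->lastTractBase - base < size) {
        arena->lastTract = NULL;
        arena->lastTractBase = (Addr)0;
    }
}
```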
1.7.2. Control pool
.pool.init: The control pool is initialized by a call to PoolInit() during ArenaCreate().
.pool.ready: All the other fields in the arena are made checkable before calling PoolInit(), so PoolInit() can call ArenaCheck(arena). The pool itself is, of course, not checkable, so we have a field arena->poolReady, which is false until after the return from PoolInit(). ArenaCheck() only checks the pool if poolReady.
1.7.3. Traces
.trace: arena->trace[ti] is valid if and only if TraceSetIsMember(arena->busyTraces, ti).
.trace.create: Since the arena created by ArenaCreate() has arena->busyTraces = TraceSetEMPTY, none of the traces are meaningful.
.trace.invalid: Invalid traces have signature SigInvalid, which can be checked.
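The invariant linking busyTraces to the trace signatures can be written as a check. In this sketch TRACE_MAX and the signature values are placeholders, TraceSetIsMember() is an ordinary function, and the Demo* names are hypothetical.

```c
#include <assert.h>

#define TRACE_MAX 4u                      /* hypothetical value */

typedef unsigned TraceSet;                /* one bit per trace id */
typedef unsigned TraceId;
typedef unsigned long Sig;

#define TraceSetEMPTY ((TraceSet)0)
#define SigInvalid    ((Sig)0xBADu)       /* placeholder value */
#define DemoTraceSig  ((Sig)0x7ACEu)      /* placeholder value */

typedef struct TraceStruct { Sig sig; } TraceStruct;

typedef struct DemoArena {
    TraceSet busyTraces;
    TraceStruct trace[TRACE_MAX];
} DemoArena;

int TraceSetIsMember(TraceSet ts, TraceId ti)
{
    assert(ti < TRACE_MAX);
    return (int)((ts >> ti) & 1u);
}

/* A new arena has no busy traces, and every trace slot is invalid. */
void DemoArenaInit(DemoArena *arena)
{
    TraceId ti;
    arena->busyTraces = TraceSetEMPTY;
    for (ti = 0; ti < TRACE_MAX; ++ti)
        arena->trace[ti].sig = SigInvalid;
}

void DemoTraceCreate(DemoArena *arena, TraceId ti)
{
    assert(!TraceSetIsMember(arena->busyTraces, ti));
    arena->trace[ti].sig = DemoTraceSig;
    arena->busyTraces |= 1u << ti;
}

/* trace[ti] is valid if and only if ti is in busyTraces. */
int DemoArenaCheckTraces(const DemoArena *arena)
{
    TraceId ti;
    for (ti = 0; ti < TRACE_MAX; ++ti) {
        int busy = TraceSetIsMember(arena->busyTraces, ti);
        int valid = arena->trace[ti].sig != SigInvalid;
        if (busy != valid)
            return 0;
    }
    return 1;
}
```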
1.7.4. Polling
.poll.fields: There are three fields of an arena used for polling: pollThreshold, insidePoll, and clamped (see above). pollThreshold is the threshold for the next poll: it is set at the end of ArenaPoll() to the current polling time plus ARENA_POLL_MAX.
1.7.5. Location dependencies
.ld.epoch: arena->epoch is the “current epoch”. This is the number of ‘flips’ of traces in the arena since the arena was created. From the mutator’s point of view locations change atomically at flip.
.ld.history: arena->history is an array of ARENA_LD_LENGTH elements of type RefSet. These are the summaries of objects moved over the last ARENA_LD_LENGTH epochs. If e is one of these recent epochs, then arena->history[e % ARENA_LD_LENGTH] is a summary of (the original locations of) objects moved since epoch e.
.ld.prehistory: arena->prehistory is a RefSet summarizing the original locations of all objects ever moved. When considering whether a really old location dependency is stale, it is compared with this summary.
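The way epoch, history and prehistory work together can be sketched as a small ring of summaries. This is an illustrative model under stated assumptions, not the MPS code: the value of ARENA_LD_LENGTH, the use of a plain bit mask for RefSet, and the DemoLD structure and function names are all hypothetical.

```c
#include <assert.h>

#define ARENA_LD_LENGTH 4u       /* hypothetical value */

typedef unsigned long RefSet;    /* summary of a set of locations */
typedef unsigned long Epoch;
#define RefSetEMPTY ((RefSet)0)

typedef struct DemoArena {
    Epoch epoch;                        /* number of flips so far */
    RefSet history[ARENA_LD_LENGTH];    /* moved since recent epochs */
    RefSet prehistory;                  /* everything ever moved */
} DemoArena;

/* A location dependency: the epoch it was made in, and a summary
 * of the locations it depends on. */
typedef struct DemoLD {
    Epoch epoch;
    RefSet rs;
} DemoLD;

/* Record a flip in which objects summarized by "moved" were moved. */
void DemoArenaFlip(DemoArena *arena, RefSet moved)
{
    unsigned i;

    /* Reuse the oldest slot for the current epoch: nothing has
     * moved since "now" until the line below adds this flip. */
    arena->history[arena->epoch % ARENA_LD_LENGTH] = RefSetEMPTY;
    for (i = 0; i < ARENA_LD_LENGTH; ++i)
        arena->history[i] |= moved;
    arena->prehistory |= moved;
    ++arena->epoch;
}

int DemoLDIsStale(const DemoLD *ld, const DemoArena *arena)
{
    RefSet rs;

    assert(ld->epoch <= arena->epoch);
    if (ld->epoch == arena->epoch)
        return 0;                       /* no flips since it was made */
    if (arena->epoch - ld->epoch > ARENA_LD_LENGTH)
        rs = arena->prehistory;         /* too old: be conservative */
    else
        rs = arena->history[ld->epoch % ARENA_LD_LENGTH];
    return (ld->rs & rs) != RefSetEMPTY;
}
```

A dependency that has fallen out of the history window is compared against prehistory, so it may be reported stale even though nothing it depends on has moved since it was made; the answer is conservative rather than exact.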
1.7.6. Roots
.root-ring: The arena holds a member of a ring of roots in the arena. It holds an incremental serial which is the serial of the next root.