3. Coalescing block structure¶
3.1. Introduction¶
.intro: This is the design for impl.c.cbs, which implements a data structure for the management of non-intersecting memory ranges, with eager coalescence.
.readership: This document is intended for any MM developer.
.source: design.mps.poolmv2, design.mps.poolmvff.
.overview: The “coalescing block structure” is a set of addresses (or a subset of address space), with provision for efficient management of contiguous ranges, including insertion and deletion, high level communication with the client about the size of contiguous ranges, and detection of protocol violations.
3.2. History¶
.hist.0: This document was derived from the outline in design.mps.poolmv2(2). Written by Gavin Matthews 1998-05-01.
.hist.1: Updated by Gavin Matthews 1998-07-22 in response to approval comments in change.epcore.anchovy.160040 There is too much fragmentation in trapping memory.
.hist.2: Updated by Gavin Matthews (as part of change.epcore.brisling.160158: MVFF cannot be instantiated with 4-byte alignment) to document new alignment restrictions.
.hist.3: Converted from MMInfo database design document. Richard Brooksby, 2002-06-07.
.hist.4: Converted to reStructuredText. Gareth Rees, 2013-04-14.
3.3. Definitions¶
.def.range: A (contiguous) range of addresses is a semi-open interval on address space.
.def.isolated: A contiguous range is isolated with respect to some property it has, if adjacent elements do not have that property.
.def.interesting: A block is interesting if it is of at least the minimum interesting size specified by the client.
3.4. Requirements¶
.req.set: Must maintain a set of addresses.
.req.fast: Common operations must have a low amortized cost.
.req.add: Must be able to add address ranges to the set.
.req.remove: Must be able to remove address ranges from the set.
.req.size: Must report concisely to the client when isolated contiguous ranges of at least a certain size appear and disappear.
.req.iterate: Must support the iteration of all isolated contiguous ranges. This will not be a common operation.
.req.protocol: Must detect protocol violations.
.req.debug: Must support debugging of client code.
.req.small: Must have a small space overhead for the storage of typical subsets of address space and not have abysmal overhead for the storage of any subset of address space.
.req.align: Must support an alignment (the alignment of all
addresses specifying ranges) of down to sizeof(void *)
without
losing memory.
3.5. Interface¶
.header: CBS is used through impl.h.cbs.
3.5.1. External types¶
-
typedef struct CBSStruct CBSStruct, *CBS;
.type.cbs: CBS
is the main datastructure for
manipulating a CBS. It is intended that a CBSStruct
be
embedded in another structure. No convenience functions are provided
for the allocation or deallocation of the CBS.
-
typedef struct CBSBlockStruct CBSBlockStruct, *CBSBlock;
.type.cbs.block: CBSBlock
is the data-structure
that represents an isolated contiguous range held by the CBS. It is
returned by the new and delete methods described below.
.type.cbs.method: The following methods are provided as callbacks to advise the client of certain events. The implementation of these functions should not cause any CBS function to be called on the same CBS. In this respect, the CBS module is not re-entrant.
-
typedef void (*CBSChangeSizeMethod)(CBS cbs, CBSBlock block, Size oldSize, SizeNewSize);
.type.cbs.change.size.method: CBSChangeSizeMethod
is the function pointer type, four instances of which are optionally
registered via CBSInit.
These callbacks are invoked under CBSInsert()
,
CBSDelete()
, or CBSSetMinSize()
in certain
circumstances. Unless otherwise stated, oldSize
and newSize
will both be non-zero, and different. The accessors
CBSBlockBase()
, CBSBlockLimit()
, and
CBSBlockSize()
may be called from within these callbacks,
except within the delete callback when newSize
is zero. See
.impl.callback for implementation details.
-
typedef Bool (*CBSIterateMethod)(CBS cbs, CBSBlock block, void *closureP, unsigned long closureS);
.type.cbs.iterate.method: CBSIterateMethod
is a
function pointer type for a client method invoked by the CBS module
for every isolated contiguous range in address order, when passed to
the CBSIterate()
or CBSIterateLarge()
functions. The
function returns a boolean indicating whether to continue with the
iteration.
3.5.2. External functions¶
-
Res
CBSInit
(Arena arena, CBS cbs, CBSChangeSizeMethod new, CBSChangeSizeMethod delete, CBSChangeSizeMethod grow, CBSChangeSizeMethod shrink, Size minSize, Align alignment, Bool mayUseInline)¶
.function.cbs.init: CBSInit()
is the function that
initialises the CBS structure. It performs allocation in the supplied
arena. Four methods are passed in as function pointers (see
.type.* above), any of which may be NULL
. It receives a
minimum size, which is used when determining whether to call the
optional methods. The mayUseInline
Boolean indicates whether the
CBS may use the memory in the ranges as a low-memory fallback (see
.impl.low-mem). The alignment indicates the alignment of
ranges to be maintained. An initialised CBS contains no ranges.
.function.cbs.init.may-use-inline: If mayUseInline
is
set, then alignment
must be at least sizeof(void *)
. In this
mode, the CBS will never fail to insert or delete ranges, even if
memory for control structures becomes short. Note that, in such cases,
the CBS may defer notification of new/grow events, but will report
available blocks in CBSFindFirst()
and CBSFindLast()
.
Such low memory conditions will be rare and transitory. See
.align for more details.
-
void
CBSFinish
(CBS cbs)¶
.function.cbs.finish: CBSFinish()
is the function
that finishes the CBS structure and discards any other resources
associated with the CBS.
.function.cbs.insert: CBSInsert()
is the function
used to add a contiguous range specified by [base,limit)
to the
CBS. If any part of the range is already in the CBS, then
ResFAIL
is returned, and the CBS is unchanged. This
function may cause allocation; if this allocation fails, and any
contingency mechanism fails, then ResMEMORY
is returned,
and the CBS is unchanged.
.function.cbs.insert.callback: CBSInsert()
will invoke callbacks as follows:
new
: when a new block is created that is interesting.oldSize == 0; newSize >= minSize
.new
: when an uninteresting block coalesces to become interesting.0 < oldSize < minSize <= newSize
.delete
: when two interesting blocks are coalesced.grow
will also be invoked in this case on the larger of the two blocks.newSize == 0; oldSize >= minSize
.grow
: when an interesting block grows in size.minSize <= oldSize < newSize
.
.function.cbs.delete: CBSDelete()
is the function
used to remove a contiguous range specified by [base,limit)
from
the CBS. If any part of the range is not in the CBS, then
ResFAIL
is returned, and the CBS is unchanged. This
function may cause allocation; if this allocation fails, and any
contingency mechanism fails, then ResMEMORY
is returned,
and the CBS is unchanged.
.function.cbs.delete.callback: CBSDelete()
will
invoke callbacks as follows:
delete
: when an interesting block is entirely removed.newSize == 0; oldSize >= minSize
.delete
: when an interesting block becomes uninteresting.0 < newSize < minSize <= oldSize
.new
: when a block is split into two blocks, both of which are interesting.shrink
will also be invoked in this case on the larger of the two blocks.oldSize == 0; newSize >= minSize
.shrink
: when an interesting block shrinks in size, but remains interesting.minSize <= newSize < oldSize
.
-
void
CBSIterate
(CBS cbs, CBSIterateMethod iterate, void *closureP, unsigned long closureS)¶
.function.cbs.iterate: CBSIterate()
is the function
used to iterate all isolated contiguous ranges in a CBS. It receives a
pointer, unsigned long closure pair to pass on to the iterator method,
and an iterator method to invoke on every range in address order. If
the iterator method returns FALSE
, then the iteration is
terminated.
-
void
CBSIterateLarge
(CBS cbs, CBSIterateMethod iterate, void *closureP, unsigned long closureS)¶
.function.cbs.iterate.large: CBSIterateLarge()
is the
function used to iterate all isolated contiguous ranges of size
greater than or equal to the client indicated minimum size in a CBS.
It receives a pointer, unsigned long closure pair to pass on to the
iterator method, and an iterator method to invoke on every large range
in address order. If the iterator method returns FALSE
, then the
iteration is terminated.
.function.cbs.set.min-size: CBSSetMinSize()
is the
function used to change the minimum size of interest in a CBS. This
minimum size is used to determine whether to invoke the client
callbacks from CBSInsert()
and CBSDelete()
. This
function will invoke either the new
or delete
callback for all
blocks that are (in the semi-open interval) between the old and new
values. oldSize
and newSize
will be the same in these cases.
-
Res
CBSDescribe
(CBS cbs, mps_lib_FILE *stream)¶
.function.cbs.describe: CBSDescribe()
is a function
that prints a textual representation of the CBS to the given stream,
indicating the contiguous ranges in order, as well as the structure of
the underlying splay tree implementation. It is provided for debugging
purposes only.
.function.cbs.block.base: The CBSBlockBase()
function
returns the base of the range represented by the CBSBlock
.
This function may not be called from the delete callback when the
block is being deleted entirely.
Note
The value of the base of a particular CBSBlock
is not
guaranteed to remain constant across calls to CBSDelete()
and CBSInsert()
, regardless of whether a callback is
invoked.
.function.cbs.block.limit: The CBSBlockLimit()
function returns the limit of the range represented by the
CBSBlock
. This function may not be called from the delete
callback when the block is being deleted entirely.
Note
The value of the limit of a particular CBSBlock
is not
guaranteed to remain constant across calls to CBSDelete()
and CBSInsert()
, regardless of whether a callback is
invoked.
.function.cbs.block.size: The CBSBlockSize()
function
returns the size of the range represented by the CBSBlock
.
This function may not be called from the delete
callback when the
block is being deleted entirely.
Note
The value of the size of a particular CBSBlock
is not
guaranteed to remain constant across calls to CBSDelete()
and CBSInsert()
, regardless of whether a callback is
invoked.
-
Res
CBSBlockDescribe
(CBSBlock block, mps_lib_FILE *stream)¶
.function.cbs.block.describe: The CBSBlockDescribe()
function prints a textual representation of the CBSBlock
to
the given stream. It is provided for debugging purposes only.
-
Bool
CBSFindFirst
(Addr *baseReturn, Addr *limitReturn, CBS cbs, Size size, CBSFindDelete findDelete)¶
.function.cbs.find.first: The CBSFindFirst()
function
locates the first block (in address order) within the CBS of at least
the specified size, and returns its range. If there are no such
blocks, it returns FALSE
. It optionally deletes the top, bottom,
or all of the found range, depending on the findDelete
argument
(this saves a separate call to CBSDelete()
, and uses the
knowledge of exactly where we found the range), which must come from
this enumeration:
enum {
CBSFindDeleteNONE, /* don't delete after finding */
CBSFindDeleteLOW, /* delete precise size from low end */
CBSFindDeleteHIGH, /* delete precise size from high end */
CBSFindDeleteENTIRE /* delete entire range */
};
-
Bool
CBSFindLast
(Addr *baseReturn, Addr *limitReturn, CBS cbs, Size size, CBSFindDelete findDelete)¶
.function.cbs.find.last: The CBSFindLast()
function
locates the last block (in address order) within the CBS of at least
the specified size, and returns its range. If there are no such
blocks, it returns FALSE
. Like CBSFindFirst()
, it
optionally deletes the range.
.function.cbs.find.largest: The CBSFindLargest()
function locates the largest block within the CBS, and returns its
range. If there are no blocks, it returns FALSE
. Like
CBSFindFirst()
, it optionally deletes the range (specifying
CBSFindDeleteLOW
or CBSFindDeleteHIGH
has the same effect as
CBSFindDeleteENTIRE
).
3.6. Alignment¶
.align: When mayUseInline
is specified to permit inline
data structures and hence avoid losing memory in low memory
situations, the alignments that the CBS supports are constrained by
three requirements:
The smallest possible range (namely one that is the alignment in size) must be large enough to contain a single
void *
pointer (see .impl.low-mem.inline.grain);Any larger range (namely one that is at least twice the alignment in size) must be large enough to contain two
void *
pointers (see .impl.low-mem.inline.block);It must be valid on all platforms to access a
void *
pointer stored at the start of an aligned range.
All alignments that meet these requirements are aligned to
sizeof(void *)
, so we take that as the minimum alignment.
3.7. Implementation¶
.impl: Note that this section is concerned with describing various aspects of the implementation. It does not form part of the interface definition.
3.7.1. Size change callback protocol¶
.impl.callback: The size change callback protocol concerns
the mechanism for informing the client of the appearance and
disappearance of interesting ranges. The intention is that each range
has an identity (represented by the CBSBlock
). When blocks
are split, the larger fragment retains the identity. When blocks are
merged, the new block has the identity of the larger fragment.
.impl.callback.delete: Consider the case when the minimum
size is minSize
, and CBSDelete()
is called to remove a
range of size middle
. The two (possibly non-existant) neighbouring
ranges have (possibly zero) sizes left
and right
. middle
is part
of the CBSBlock
middleBlock
.
.impl.callback.delete.delete: The delete
callback will be
called in this case if and only if:
left + middle + right >= minSize && left < minSize && right < minSize
That is, the combined range is interesting, but neither remaining fragment is. It will be called with the following parameters:
block
:middleBlock
oldSize
:left + middle + right
newSize
:left >= right ? left : right
.impl.callback.delete.new: The new
callback will be
called in this case if and only if:
left >= minSize && right >= minSize
That is, both remaining fragments are interesting. It will be called with the following parameters:
block
: a new blockoldSize
:0
newSize
:left >= right ? right : left
.impl.callback.delete.shrink: The shrink callback will be called in this case if and only if:
left + middle + right >= minSize && (left >= minSize || right >= minSize)
That is, at least one of the remaining fragments is still interesting. It will be called with the following parameters:
block
:middleBlock
oldSize
:left + middle + right
newSize
:left >= right ? left : right
.impl.callback.insert: Consider the case when the minimum
size is minSize
, and CBSInsert()
is called to add a range
of size middle
. The two (possibly non-existant) neighbouring
blocks are leftBlock
and rightBlock
, and have (possibly zero)
sizes left
and right
.
.impl.callback.insert.delete: The delete
callback will be
called in this case if and only if:
left >= minSize && right >= minSize
That is, both neighbours were interesting. It will be called with the following parameters:
block
:left >= right ? rightBlock : leftBlock
oldSize
:left >= right ? right : left
newSize
:0
.impl.callback.insert.new: The new
callback will be
called in this case if and only if:
left + middle + right >= minSize && left < minSize && right < minSize
That is, the combined block is interesting, but neither neighbour was. It will be called with the following parameters:
block
:left >= right ? leftBlock : rightBlock
oldSize
:left >= right ? left : right
newSize
:left + middle + right
.impl.callback.insert.grow: The grow
callback will be
called in this case if and only if:
left + middle + right >= minSize && (left >= minSize || right >= minSize)
That is, at least one of the neighbours was interesting. It will be called with the following parameters:
block
:left >= right ? leftBlock : rightBlock
oldSize
:left >= right ? left : right
newSize
:left + middle + right
3.7.2. Splay tree¶
.impl.splay: The CBS is principally implemented using a splay tree (see design.mps.splay). Each splay tree node is embedded in a CBSBlock that represents a semi-open address range. The key passed for comparison is the base of another range.
.impl.splay.fast-find: CBSFindFirst()
and
CBSFindLast()
use the update/refresh facility of splay trees
to store, in each CBSBlock
, an accurate summary of the
maximum block size in the tree rooted at the corresponding splay node.
This allows rapid location of the first or last suitable block, and
very rapid failure if there is no suitable block.
.impl.find-largest: CBSFindLargest()
simply finds out
the size of the largest block in the CBS from the root of the tree
(using SplayRoot()
), and does SplayFindFirst()
for a
block of that size. This is O(log(n)) in the size of the free list,
so it’s about the best you can do without maintaining a separate
priority queue, just to do CBSFindLargest()
. Except when the
emergency lists (see .impl.low-mem) are in use, they are
also searched.
3.7.3. Low memory behaviour¶
.impl.low-mem: Low memory situations cause problems when the
CBS tries to allocate a new CBSBlock
structure for a new
isolated range as a result of either CBSInsert()
or
CBSDelete()
, and there is insufficient memory to allocation
the CBSBlock
structure:
.impl.low-mem.no-inline: If mayUseInline
is FALSE
,
then the range is not added to the CBS, and the call to
CBSInsert()
or CBSDelete()
returns ResMEMORY
.
.impl.low-mem.inline: If mayUseInline
is TRUE
:
.impl.low-mem.inline.block: If the range is large enough to
contain an inline block descriptor consisting of two pointers, then it
is kept on an emergency block list. The CBS will eagerly attempt to
add this block back into the splay tree during subsequent calls to
CBSInsert()
and CBSDelete()
. The CBS will also keep
its emergency block list in address order, and will coalesce this list
eagerly. Some performance degradation will be seen when the emergency
block list is in use. Ranges on this emergency block list will not be
made available to the CBS’s client via callbacks. CBSIterate()
and CBSIterateLarge()
will not iterate over ranges on this
list.
.impl.low-mem.inline.block.structure: The two pointers stored
are to the next such block (or NULL
), and to the limit of the
block, in that order.
.impl.low-mem.inline.grain: Otherwise, the range must be
large enough to contain an inline grain descriptor consisting of one
pointer, then it is kept on an emergency grain list. The CBS will
eagerly attempt to add this grain back into either the splay tree or
the emergency block list during subsequent calls to
CBSInsert()
and CBSDelete()
. The CBS will also keep
its emergency grain list in address order. Some performance
degradation will be seen when the emergency grain list is in use.
Ranges on this emergency grain list will not be made available to the
CBS’s client via callbacks. CBSIterate()
and
CBSIterateLarge()
will not iterate over ranges on this list.
.impl.low-mem.inline.grain.structure: The pointer stored is
to the next such grain, or NULL
.
3.7.4. The CBS block¶
.impl.cbs.block: The block contains a base-limit pair and a splay tree node.
.impl.cbs.block.special: The base and limit may be equal if the block is halfway through being deleted.
.impl.cbs.block.special.just: This conflates values and status, but is justified because block size is very important.
3.8. Testing¶
.test: The following testing will be performed on this module:
.test.cbstest: There is a stress test for this module in
impl.c.cbstest. This allocates a large block of memory and
then simulates the allocation and deallocation of ranges within this
block using both a CBS
and a BT
. It makes both
valid and invalid requests, and compares the CBS
response to
the correct behaviour as determined by the BT
. It also
iterates the ranges in the CBS
, comparing them to the
BT
. It also invokes the CBSDescribe()
method, but
makes no automatic test of the resulting output. It does not currently
test the callbacks.
.test.pool: Several pools (currently MVT (Manual Variable Temporal) and MVFF (Manual Variable First Fit)) are implemented on top of a CBS. These pool are subject to testing in development, QA, and are/will be heavily exercised by customers.
3.9. Notes for future development¶
.future.not-splay: The initial implementation of CBSs is based on splay trees. It could be revised to use any other data structure that meets the requirements (especially .req.fast).
.future.hybrid: It would be possible to attenuate the problem
of .risk.overhead (below) by using a single word bit set to
represent the membership in a (possibly aligned) word-width of grains.
This might be used for block sizes less than a word-width of grains,
converting them when they reach all free in the bit set. Note that
this would make coalescence slightly less eager, by up to
(word-width - 1)
.
3.10. Risks¶
.risk.overhead: Clients should note that the current implementation of CBSs has a space overhead proportional to the number of isolated contiguous ranges. [Four words per range.] If the CBS contains every other grain in an area, then the overhead will be large compared to the size of that area. [Four words per two grains.] See .future.hybrid for a suggestion to solve this problem. An alternative solution is to use CBSs only for managing long ranges.
3.11. Proposed hybrid implementation¶
Note
The following relates to a pending re-design and does not yet relate to any working source version. GavinM 1998-09-25
The CBS system provides its services by combining the services provided by three subsidiary CBS modules:
CBSST
– Splay Tree: Based on out-of-line splay trees; must allocate to insert isolated, which may therefore fail.CBSBL
– Block List: Based on a singly-linked list of variable sized ranges with inline descriptors; ranges must be at least large enough to store the inline descriptor.CBSGL
– Grain List: Based on a singly-linked list of fixed size ranges with inline descriptors; the ranges must be the alignment of the CBS.
The three sub-modules have a lot in common. Although their methods are not invoked via a dispatcher, they have been given consistent interfaces, and consistent internal appearance, to aid maintenance.
Methods supported by sub-modules (not all sub-modules support all methods):
MergeRange
– Finds any ranges in the specific CBS adjacent to the supplied one. If there are any, it extends the ranges, possibly deleting one of them. This cannot fail, but should returnFALSE
if there is an intersection between the supplied range and a range in the specific CBS.InsertIsolatedRange
– Adds a range to the specific CBS that is not adjacent to any range already in there. Depending on the specific CBS, this may be able to fail for allocation reasons, in which case it should returnFALSE
. It shouldAVER()
if the range is adjacent to or intersects with a range already there.RemoveAdjacentRanges
– Finds and removes from the specific CBS any ranges that are adjacent to the supplied range. Should returnFALSE
if the supplied range intersects with any ranges already there.DeleteRange
– Finds and deletes the supplied range from the specific CBS. Returns a tri-state result:Success
– The range was successfully deleted. This may have involved the creation of a new range, which should be done viaCBSInsertIsolatedRange
.ProtocolError
– Either some non-trivial strict subset of the supplied range was in the specific CBS, or a range adjacent to the supplied range was in the specific CBS. Either of these indicates a protocol error.NoIntersection
– The supplied range was not found in the CBS. This may or not be a protocol error, depending on the invocation context.
FindFirst
– Returns the first (in address order) range in the specific CBS that is at least as large as the supplied size, orFALSE
if there is no such range.FindFirstBefore
– AsFindFirst
, but only finds ranges prior to the supplied address.FindLast
– AsFindFirst
, but finds the last such range in address order.FindLastAfter
–FindLast
equivalent ofFindFirstBefore
.Init
– Initialise the control structure embedded in the CBS.Finish
– Finish the control structure embedded in the CBS.InlineDescriptorSize
– Returns the aligned size of the inline descriptor.Check
– Checks the control structure embedded in the CBS.
The CBS supplies the following utilities:
CBSAlignment
– Returns the alignment of the CBS.CBSMayUseInline
– Returns whether the CBS may use the memory in the ranges stored.CBSInsertIsolatedRange
– Wrapper forCBS*InsertIsolatedRange
.
Internally, the CBS*
sub-modules each have an internal structure
CBS*Block
that represents an isolated range within the module. It
supports the following methods (for sub-module internal use):
BlockBase
– Returns the base of the associated range;BlockLimit
BlockRange
BlockSize