.readership: Any MPS developer.
.intro: This is the design of the Bit Tables module. A Bit Table is a linear array of bits. A Bit Table of length n is indexed using an integer from 0 up to (but not including) n. Each bit in a Bit Table can hold either the value 0 (aka FALSE) or 1 (aka TRUE). A variety of operations are provided including: set, reset, and retrieve, individual bits; set and reset a contiguous range of bits; search for a contiguous range of reset bits; making a “negative image” copy of a range.
.hist.0-3: The history for versions 0-3 is lost pending possible reconstruction. David Jones, 1997-03-04.
.hist.4: Prepared for review. Added full requirements section. Made notation more consistent throughout. Documented all functions. David Jones, 1999-04-29.
.hist.5: Converted from MMInfo database design document. Richard Brooksby, 2002-06-07.
.hist.6: Converted to reStructuredText. Gareth Rees, 2013-03-12.
.def.set: Set
Used as a verb meaning to assign the value 1 or TRUE to a bit. Used descriptively to denote a bit containing the value 1. Note 1 and TRUE are synonyms in MPS C code (see design.mps.type(0).bool.value).
.def.reset: Reset
Used as a verb meaning to assign the value 0 or FALSE to a bit. Used descriptively to denote a bit containing the value 0. Note 0 and FALSE are synonyms in MPS C code (see design.mps.type(0).bool.value).
Note
Consider using “fill/empty” or “mark/clear” instead of “set/reset”, set/reset is probably a hangover from drj’s z80 hacking days – drj 1999-04-26
.def.bt: Bit Table
A Bit Table is a mapping from [0, n) to {0,1} for some n, represented as a linear array of bits.
..def.bt.justify: They are called bit tables because a single bit is used to encode whether the image of a particular integer under the map is 0 or 1.
.def.range: Range
A contiguous sequence of bits in a Bit Table. Ranges are typically specified as a base–limit pair where the range includes the position specified by the base, but excludes that specified by the limit. The mathematical interval notation for half-open intervals, [base, limit), is used.
.req.bit: The storage for a Bit Table of n bits shall take no more than a small constant addition to the storage required for n bits. ..req.bit.why: This is so that clients can make some predictions about how much storage their algorithms use. A small constant is allowed over the minimal for two reasons: inevitable implementation overheads (such as only being able to allocate storage in multiples of 32 bits), extra storage for robustness or speed (such as signature and length fields).
.req.create: A means to create Bit Tables. .req.create.why: Obvious.
.req.destroy: A means to destroy Bit Tables. .req.destroy.why: Obvious.
.req.ops: The following operations shall be supported:
.req.ops.minimal.why: Get, Set, Reset, are the minimal operations. All possible mappings can be created and inspected using these operations.
.req.speed: Operations shall take no more than a few memory operations per bit manipulated. .req.speed.why: Any slower would be gratuitous.
.req.speed.fast: The following operations shall be very fast:
.req.speed.fast.find.short: FindShortResRange (the operation used to meet .req.ops.find.short.low) FindShortResRangeHigh (the operation used to meet .req.ops.find.short.high).
.req.speed.fast.find.short.why: These two are used by the client arena (design.mps.arena.client) and the VM arena (design.mps.arena.vm) for finding segments in page tables. The operation will be used sufficiently often that its speed will noticeably affect the overall speed of the MPS. They will be called with a length equal to the number of pages in a segment. Typical values of this length depend on the pool classes used and their configuration, but we can expect length to be small (1 to 16) usually. We can expect the Bit Table to be populated densely where it is populated at all, that is set bits will tend to be clustered together in subranges.
.req.speed.fast.find.long: FindLongResRange (the operation used to meet .req.ops.find.long.low)
.req.speed.fast.find.long.why: Used in the allocator for PoolClassAWL (design.mps.poolawl(1)), PoolClassAMS (design.mps.poolams(2)), PoolClassEPVM (design.mps.poolepvm(0)). Of these AWL and EPVM have speed requirements. For AWL the length of range to be found will be the length of a Dylan table in words. According to mail.tony.1999-05-05.11-36(0), only <entry-vector> objects are allocated in AWL (though not all <entry-vector> objects are allocated in AWL), and the mean length of an <entry-vector> object is 486 Words. No data for EPVM alas.
.req.speed.fast.other.why: We might expect mark and sweep pools to make use of Bit Tables, the MPS has general requirements to support efficient mark and sweep pools, so that imposes general speed requirements on Bit Tables.
The following are not requirements but the current design could support them with little modification or does support them. Often they used to be requirements, but are no longer, or were added speculatively or experimentally but aren’t currently used.
.background: Originally Bit Tables were used and implemented by PoolClassLO (design.mps.poollo). It was decided to lift them out into a separate module when designing the Pool to manage Dylan Weak Tables which is also a mark and sweep pool and will make use of Bit Tables (see design.mps.poolawl). .background.analysis: analysis.mps.bt(0) contains some of the analysis of the design decisions that were and were not made in this document.
.clients: Bit Tables are used throughout the MPS but the important uses are in the client and VM arenas (design.mps.arena.client(0) and design.mps.arena.vm(1)) a bit table is used to record whether each page is free or not; several pool classes (PoolClassLO, PoolClassEPVM, PoolClassAMS) use bit tables to record which locations are free and also to store colour.
.over: Mostly, the design is as simple as possible. The significant complications are iteration (see .iteration below) and searching (see .fun.find-res-range below) because both of these are required to be fast.
.if.representation.abstract: A Bit Table is represented by the type BT.
.if.declare: The module declares a type BT and a prototype for each of the functions below. The type is declared in impl.h.mpmtypes, the prototypes are declared in impl.h.mpm. Some of the functions are in fact implemented as macros in the usual way (doc.mps.ref-man.if-conv(0).macro.std).
.if.general.index: Many of the functions specified below take indexes. If otherwise unspecified an index must be in the interval [0, n) (note, up to, but not including, n) where n is the number of bits in the relevant Bit Table (as passed to the BTCreate() function).
.if.general.range: Where a range is specified by two indexes (base and limit), the index base, which specifies the beginning of the range, must be in the interval [0, n), and the index limit, which specifies the end of the range, must be in the interval [1, n] (note can be n), and base must be strictly less than limit (empty ranges are not allowed). Sometimes i and j are used instead of base and limit.
.if.create: Attempts to create a table of length n in the arena control pool, putting the table in *btReturn. Returns ResOK if and only if the table is created OK. The initial values of the bits in the table are undefined (so the client should probably call BTResRange() on the entire range before using the BT). Meets .req.create.
.if.destroy: Destroys the table t, which must have been created with BTCreate(). The value of argument n must be same as the value of the argument passed to BTCreate(). Meets mps:ref:.req.destroy.
.if.size: BTSize(n) returns the number of bytes needed for a Bit Table of n bits. BTSize() is a macro, but (BTSize)(n) will assert if n exceeds COUNT_MAX - MPS_WORD_WIDTH + 1`. This is used by clients that allocate storage for the BT themselves. Before BTCreate() and BTDestroy() were implemented that was the only way to allocate a Bit Table, but is now deprecated.
.if.get: BTGet(t, i) returns the i-th bit of the table t (that is, the image of i under the mapping). Meets .req.ops.get.
.if.set: BTSet(t, i) sets the i-th bit of the table t (to 1). BTGet(t, i) will now return 1. Meets .req.ops.set.
.if.res: BTRes(t, i) resets the i-th bit of the table t (to 0). BTGet(t, i) will now return 0. Meets .req.ops.res.
.if.set-range: BTSetRange(t, base, limit) sets the range of bits [base, limit) in the table t. BTGet(t, x) will now return 1 for base ≤ x < limit. Meets .req.ops.range.set.
.if.res-range: BTResRange(t, base, limit) resets the range of bits [base, limit) in the table t. BTGet(t, x) will now return 0 for base ≤ x < limit. Meets .req.ops.range.res.
.if.test.range.set: Returns TRUE if all the bits in the range [base, limit) are set, FALSE otherwise. Meets .req.ops.test.range.set.
.if.test.range.reset: Returns TRUE if all the bits in the range [base, limit) are reset, FALSE otherwise. Meets .req.ops.test.range.reset.
.if.test.range.same: returns TRUE if BTGet(BTx,i) equals BTGet(BTy,i) for i in [base, limit), and FALSE otherwise. Meets .non-req.ops.test.range.same.
.if.find.general: There are four functions (below) to find reset ranges. All the functions have the same prototype (for symmetry):
Bool find(Index *baseReturn, Index *limitReturn,
BT bt,
Index searchBase, Index searchLimit,
Count length);
where bt is the Bit Table in which to search. searchBase and searchLimit specify a subset of the Bit Table to use, the functions will only find ranges that are subsets of [searchBase, searchLimit) (when set, *baseReturn will never be less than searchBase and *limitReturn will never be greater than searchLimit). searchBase and searchLimit specify a range that must conform to the general range requirements for a range [i, j), as per .if.general.range modified appropriately. length is the number of contiguous reset bits to find; it must not be bigger than searchLimit - searchBase (that would be silly). If a suitable range cannot be found the function returns FALSE (0) and leaves *baseReturn and *limitReturn untouched. If a suitable range is found then the function returns the range’s base in *baseReturn and its limit in *limitReturn and returns TRUE (1).
.if.find-short-res-range: Finds a range of reset bits in the table, starting at searchBase and working upwards. This function is intended to meet .req.ops.find.short.low so it will find the leftmost range that will do, and never finds a range longer than the requested length (the intention is that it will not waste time looking).
.if.find-short-res-range-high: Finds a range of reset bits in the table, starting at searchLimit and working downwards. This function is intended to meet .req.ops.find.short.high so it will find the rightmost range that will do, and never finds a range longer than the requested length.
.if.find-long-res-range: Finds a range of reset bits in the table, starting at searchBase and working upwards. This function is intended to meet .req.ops.find.long.low so it will find the leftmost range that will do and returns all of that range (which can be longer than the requested length).
.if.find-long-res-range-high: Finds a range of reset bits in the table, starting at searchLimit and working downwards. This function is intended to meet .req.ops.find.long.high so it will find the rightmost range that will do and returns all that range (which can be longer than the requested length).
.if.copy-range: Overwrites the i-th bit of toBT with the i-th bit of fromBT, for all i in [base, limit). Meets .req.ops.copy.simple.
.if.copy-offset-range: Overwrites the i-th bit of toBT with the j-th bit of fromBT, for all i in [toBase, toLimit) and corresponding j in [fromBase, fromLimit). Each of these ranges must be the same size. This might be significantly less efficient than BTCopyRange(). Meets .req.ops.copy.offset.
.if.copy-invert-range: Overwrites the i-th bit of toBT with the inverse of the i-th bit of fromBT, for all i in [base, limit). Meets .req.ops.copy.invert.
.datastructure: Bit Tables will be represented as (a pointer to) an array of Word. A plain array is used instead of the more usual design convention of implementing an abstract data type as a structure with a signature (see guide.impl.c.adt(0)). .datastructure.words.justify: The type Word is used as it will probably map to the object that can be most efficiently accessed on any particular platform. .datastructure.non-adt.justify: The usual abstract data type convention was not followed because (i) The initial design (drj) was lazy, (ii) Bit Tables are more likely to come in convenient powers of two with the extra one or two words overhead. However, the loss of checking is severe. Perhaps it would be better to use the usual abstract data type style.
.fun.size: BTSize(). Since a Bit Table is an array of Word, the size of a Bit Table of n bits is simply the number of words that it takes to store n bits times the number of bytes in a word. This is ceiling(n/MPS_WORD_WIDTH)*sizeof(Word). .fun.size.justify: Since there can be at most MPS_WORD_WIDTH - 1 unused bits in the entire table, this satisfies .req.bit.
.index: The designs for the following functions use a decomposition of a bit-index, i, into two parts, iw, ib.
.index.justify.dubious: The above justifications are dubious; gcc 2.7.2 (with -O2) running on a sparc (zaphod) produces identical code for the following two functions:
unsigned long f(unsigned long i) {
return i/32 + i%32;
}
unsigned long g(unsigned long i) {
return (i>>5) + (i&31);
}
.iteration: Many of the following functions involve iteration over ranges in a Bit Table. This is performed on whole words rather than individual bits, whenever possible (to improve speed). This is implemented internally by the macros ACT_ON_RANGE() and ACT_ON_RANGE_HIGH() for iterating over the range forwards and backwards respectively. These macros do not form part of the interface of the module, but are used extensively in the implementation. The macros are often used even when speed is not an issue because it simplifies the implementation and makes it more uniform. The iteration macros take the parameters base, limit, single_action, bits_action, and word_action:
.iteration.exit: The expansion of the single_action, bits_action, and word_action macros is allowed to contain return or goto to terminate the iteration early. This is used by the test (.fun.test.range.set) and find (.fun.find) operations.
.iteration.small: If the range is sufficiently small only the single_action macro will be used, as this is more efficient in practice. The choice of what constitutes a small range is made entirely on the basis of experimental performance results (and currently, 1999-04-27, a “small range” is 6 bits or fewer. See change.mps.epcore.brisling.160181 for some justification). Otherwise (for a bigger range) bits_action is used on the part words at either end of the range (or the whole of the range it if it fits in a single word), and word_action is used on the words that comprise the inner portion of the range.
The implementation of ACT_ON_RANGE() (and ACT_ON_RANGE_HIGH()) is simple enough. It decides which macros it should invoke and invokes them. single_action and word_action are invoked inside loops.
.fun.get: BTGet(). The bit-index will be converted in the usual way, see .index. The relevant Word will be read out of the Bit Table and shifted right by the sub-Word index (this brings the relevant bit down to the least significant bit of the Word), the Word will then be masked with 1, producing the answer.
In both BTSet() and BTRes() a mask is constructed by shifting 1 left by the sub-word-index (see .index). For BTSet() the mask is or-ed into the relevant word (thereby setting a single bit). For BTRes() the mask is inverted and and-ed into the relevant word (thereby resetting a single bit).
.fun.set-range: BTSetRange(). ACT_ON_RANGE() (see .iteration above) is used with macros that set a single bit (using BTSet()), set a range of bits in a word, and set a whole word.
.fun.res-range: BTResRange() This is implemented similarly to BTSetRange() (.fun.set-range) except using BTRes() and reverse bit-masking logic.
.fun.test.range.set: BTIsSetRange(). ACT_ON_RANGE() (see .iteration above) is used with macros that test whether all the relevant bits are set; if some of the relevant bits are not set then return FALSE is used to terminate the iteration early and return from the BTIsSetRange() function. If the iteration completes then TRUE is returned.
.fun.test.range.reset: BTIsResRange(). As for BTIsSetRange() (.fun.test.range.set above) but testing whether the bits are reset.
.fun.test.range.same: BTRangesSame(). As for BTIsSetRange() (.fun.test.range.set above) but testing whether corresponding ranges in the two Bit Tables are the same. Note there are no speed requirements, but ACT_ON_RANGE() is used for simplicity and uniformity.
.fun.find: The four external find functions (BTFindShortResRange(), BTFindShortResRangeHigh(), BTFindLongResRange(), BTFindLongResRangeHigh()) simply call through to one of the two internal functions: BTFindResRange() and BTFindResRangeHigh().
There are two length parameters, one specifying the minimum length of the range to be found, the other the maximum length. For BTFindShort() and BTFindShortHigh(), maxLength is equal to minLength when passed; for BTFindLong() and BTFindLongHigh(), maxLength` is equal to the maximum possible range, namely ``searchLimit - searchBase.
.fun.find-res-range: BTFindResRange(). Iterate within the search boundaries, identifying candidate ranges by searching for a reset bit. The Boyer–Moore algorithm is used (it’s particularly easy to implement when there are only two symbols, 0 and 1, in the alphabet). For each candidate range, iterate backwards over the bits from the end of the range towards the beginning. If a set bit is found, this candidate has failed and a new candidate range is selected. If when scanning for the set bit a range of reset bits was found before finding the set bit, then this (small) range of reset bits is used as the start of the next candidate. Additionally the end of this small range of reset bits (the end of the failed candidate range) is remembered so that we don’t have to iterate over this range again. But if no reset bits were found in the candidate range, then iterate again (starting from the end of the failed candidate) to look for one. If during the backwards search no set bit is found, then we have found a sufficiently large range of reset bits; now extend the valid range as far as possible up to the maximum length by iterating forwards up to the maximum limit looking for a set bit. The iterations make use of the ACT_ON_RANGE() and ACT_ON_RANGE_HIGH() macros, which can use goto to effect an early termination of the iteration when a set/reset (as appropriate) bit is found. The macro ACTION_FIND_SET_BIT() is used in the iterations. It efficiently finds the first (that is, with lowest index or weight) set bit in a word or subword.
.fun.find-res-range.improve: Various other performance improvements have been suggested in the past, including some from request.epcore.170534. Here is a list of potential improvements which all sound plausible, but which have not led to performance improvements in practice:
.fun.find-res-range-high: BTFindResRangeHigh(). Exactly the same algorithm as in BTFindResRange() (see .fun.find-res-range above), but moving over the table in the opposite direction.
.fun.copy-simple-range: BTCopyRange(). Uses ACT_ON_RANGE() (see .iteration above) with the obvious implementation. Should be fast.
.fun.copy-offset-range: BTCopyOffsetRange(). Uses a simple iteration loop, reading bits with BTGet() and setting them with BTSet(). Doesn’t use ACT_ON_RANGE() because the two ranges will not, in general, be similarly word-aligned.
.fun.copy-invert-range: BTCopyInvertRange(). Uses ACT_ON_RANGE() (see .iteration above) with the obvious implementation. Should be fast—although there are no speed requirements.
.test: The following tests are available or have been used during development.
.test.btcv: btcv.c. This is supposed to be a coverage test, intended to execute all of the module’s code in at least some minimal way.
.test.cbstest: cbstest.c. This was written as a test of the CBS module (design.mps.cbs(2)). It compares the functional operation of a CBS with that of a BT so is a good functional test of either module.
.test.mmqa.120: MMQA_test_function!210.c. This is used because it has a fair amount of segment allocation and freeing so exercises the arena code that uses Bit Tables.
.test.bttest: bttest.c. This is an interactive test that can be used to exercise some of the BT functionality by hand.
.test.dylan: It is possible to modify Dylan so that it uses Bit Tables more extensively. See change.mps.epcore.brisling.160181 TEST1 and TEST2.