7. AWL (Automatic Weak Linked)

AWL is an automatically managed non-moving pool class that may contain weak references (1).

The purpose of this pool class is to allow the client to implement weak-key, weak-value, and doubly weak hash tables.

In a weak-key hash table, the keys are weakly referenced, so their presence in the table will not prevent the key object from being garbage collected. Once the key is no longer reachable, weak references to it may get splatted (that is, replaced with null pointers). Once that has happened, the client program can’t get at the value corresponding to the key any more, so the implementation is free to splat the value slot as well.

AWL allows the implementation to splat the value slot at the same time that the weak key slot is splatted. (Or the other way around for weak-value tables.) See Dependent objects.

See Weak hash tables in the Advanced topics section of the user guide for a detailed example of using this pool class.

Note

AWL is the only pool in the open source MPS that allows its formatted objects to contain weak references. It was designed to support the weak hash tables in Open Dylan, and may be awkward to use for other use cases. If you need more general handling of weak references, contact us.

7.1. AWL properties

7.2. Dependent objects

In order to support prompt deletion of values in a weak-key hash table when the key is splatted (and prompt deletion of keys in a weak-value hash table), an AWL pool allows each object to have a dependent object. (This is where the “Linked” in the name of the pool class comes from.)

The dependent object is specified by the MPS_KEY_AWL_FIND_DEPENDENT keyword argument to mps_pool_create_k() when creating an AWL pool. This is a function of type mps_awl_find_dependent_t that takes the address of an object in the pool and returns the address of its dependent object (or a null pointer if there is no corresponding dependent object).

When scanning an object in an AWL pool, the MPS ensures that the dependent object is not protected. This means that the scan method in the pool’s object format can read or write the dependent object.

If an object contains a reference to its dependent object, you should fix that reference, and be aware that if it is a weak reference then it may be splatted when the dependent object dies.

The way you would normally use this feature in a weak hash table would be to put the table’s keys in one object, and its values in another. (This would be necessary in any case, because the MPS does not support a mixture of exact references and weak references (1) in the same object.) The dependent object for the keys objects is the values object, and vice versa (if necessary). The scan method looks out for the splatting of a reference, and when this is detected, it splats the corresponding reference in the dependent object.

For example:

obj_t obj_deleted;              /* deleted entry in hash table */

typedef struct weak_array_s {
    struct weak_array_s *dependent;
    size_t length;              /* tagged as "length * 2 + 1" */
    obj_t slot[1];
} weak_array_s, *weak_array_t;

typedef weak_table_s {
    type_s type;                /* TYPE_WEAK_TABLE */
    weak_array_t keys, values;
} weak_table_s, *weak_table_t;

mps_addr_t weak_array_find_dependent(mps_addr_t addr)
{
    weak_array_t a = addr;
    return a->dependent;
}

mps_res_t weak_array_scan(mps_ss_t ss, mps_addr_t base, mps_addr_t limit)
{
    MPS_SCAN_BEGIN(ss) {
        while (base < limit) {
            mps_addr_t p;
            weak_array_t a = base;
            size_t i, length = a->length >> 1; /* untag */
            p = a->dependent;
            MPS_FIX12(ss, &p);
            a->dependent = p;
            for (i = 0; i < length; ++i) {
                p = a->slot[i];
                if (MPS_FIX1(ss, p)) {
                    mps_res_t res = MPS_FIX2(ss, &p);
                    if (res != MPS_RES_OK) return res;
                    if (p == NULL && a->dependent) {
                        /* key/value was splatted: splat value/key too */
                        a->dependent->slot[i] = obj_deleted;
                        a->slot[i] = obj_deleted;
                    } else {
                        a->slot[i] = p;
                    }
                }
            }
            base += offsetof(weak_array_s, slot) + a->length * sizeof a->slot[0];
        }
    } MPS_SCAN_END(ss);
    return MPS_RES_OK;
}

Note

The length field of the weak_array_s structure contains the value length * 2 + 1 so that it cannot be mistaken for a pointer. See Caution below.

7.3. Protection faults

AWL has another special power: it enables better handing of protection faults on weak objects (objects containing weak references (1)).

To explain the benefit we first need to describe the problem. The MPS uses a read barrier to perform incremental garbage collection. When the client program tries to read an object containing weak references (1), the MPS may have protected it so that the MPS can process the object before the client gets to see it.

The problem is that the client program may try to access a weak object at a point in the collection cycle when the MPS cannot yet determine the status of the objects that the weak object refers to. What the MPS does in this situation is assume that all the referenced objects are going to live. This assumption is correct but conservative; it may result in objects that are weakly referenced staying alive for longer than they need to. In the worst case this can result in a very large amount of memory being used by objects that are no longer needed.

In order to combat this problem the MPS sometimes does the following: Instead of processing the entire weak object and unprotecting it, so that the client program can access the object, the MPS may emulate the processor instruction. When this happens, the MPS doesn’t process the entire weak object; it only processes the exact location that was being accessed (typically a single word). It emulates the processor instruction, and it keeps the object protected. This happens invisibly from the client program’s perspective: it’s exactly as if the instruction executed as normal.

Naturally this emulation business is delicate and involves staring at the most badly written parts of low-level processor architecture manuals for days.

Emulation of accesses to protected objects happens when all of the following are true:

  1. The object is a weak object allocated in an AWL pool.

  2. The MPS is running on Linux/IA-32 or Windows/IA-32. Extending this list to new (reasonable) operating systems should be tolerable (for example, macOS/IA-32). Extending this to new processor architectures requires more work.

  3. The processor instruction that is accessing the object is of a suitable simple form. The MPS doesn’t contain an emulator for all possible instructions that might access memory, so currently it only recognizes and emulates a simple MOV from memory to a register or vice-versa.

Contact us if you need emulation of access to weak references for new operating systems, processor architectures, or memory access instructions.

7.4. Caution

Because of the instruction emulation described in Protection faults above, AWL places the following restriction on the format of objects allocated in it:

  • Each slot in an object must either be a valid word-aligned reference, or else the bottom bits of the word must be non-zero so that it does not look like an aligned pointer.

    “Aligned pointer” means a word whose numeric value (that is, its value when treated as an unsigned integer) is a multiple of the size of a pointer. If you’re using a 64-bit architecture, that means that an aligned pointer is a multiple of 8 and its bottom three bits are zero.

    The bottom line is that references from an object in an AWL pool must be untagged and aligned, and integers must be tagged with a non-zero tag.

Normally one would cope with this restriction by allocating the table metadata in a pool belonging to another pool class, and only allocating the arrays of keys and values in an AWL pool. See the example above.

7.5. AWL interface

#include "mpscawl.h"
mps_pool_class_t mps_class_awl(void)

Return the pool class for an AWL (Automatic Weak Linked) pool.

When creating an AWL pool, mps_pool_create_k() requires one keyword argument:

It accepts three optional keyword arguments:

  • MPS_KEY_AWL_FIND_DEPENDENT (type mps_awl_find_dependent_t) is a function that specifies how to find the dependent object for an object in the pool. This defaults to a function that always returns NULL (meaning that there is no dependent object).

  • MPS_KEY_CHAIN (type mps_chain_t) specifies the generation chain for the pool. If not specified, the pool will use the arena’s default chain.

  • MPS_KEY_GEN (type unsigned) specifies the generation in the chain into which new objects will be allocated. If you pass your own chain, then this defaults to 0, but if you didn’t (and so use the arena’s default chain), then an appropriate generation is used.

    Note that AWL does not use generational garbage collection, so blocks remain in this generation and are not promoted.

For example:

MPS_ARGS_BEGIN(args) {
    MPS_ARGS_ADD(args, MPS_KEY_FORMAT, fmt);
    MPS_ARGS_ADD(args, MPS_KEY_AWL_FIND_DEPENDENT, find_dependent);
    res = mps_pool_create_k(&pool, arena, mps_class_awl(), args);
} MPS_ARGS_END(args);

When creating an allocation point on an AWL pool, mps_ap_create_k() accepts one optional keyword argument:

For example:

MPS_ARGS_BEGIN(args) {
    MPS_ARGS_ADD(args, MPS_KEY_RANK, mps_rank_weak());
    res = mps_ap_create_k(&ap, awl_pool, args);
} MPS_ARGS_END(args);
mps_addr_t (*mps_awl_find_dependent_t)(mps_addr_t addr)

The type of functions that find the dependent object for an object in an AWL pool.

addr is the address of an object in an AWL pool.

Returns the address of the corresponding dependent object, or a null pointer if there is none.

The dependent object need not be in memory managed by the MPS, but if it is, then it must be in a non-moving pool in the same arena as addr.