1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
.. Sources:

     `<https://info.ravenbrook.com/project/mps/doc/2002-06-18/obsolete-mminfo/mmdoc/protocol/mps/scanning/>`_

.. index::
   single: scanning; introduction
   single: object format; scan method

.. _topic-scanning:

Scanning
========

:term:`Scanning <scan>` is the process of identifying the
:term:`references` in a block of memory and
:term:`"fixing" <fix>` them. It's the process at the heart of the
Memory Pool System, and the most critical of the memory management
functions that have to be implemented by the :term:`client program`.

Scanning is performed for two tasks: during :term:`tracing <trace>`,
blocks are scanned in order to follow references, and so determine
which blocks are :term:`reachable` and which are not. After objects
have been moved in memory, blocks are scanned in order to identify
references that need to be updated to point to the new locations of
these objects. Both tasks use the same scanning protocol, described
here.


.. index::
   single: scanning; protocol

.. _topic-scanning-protocol:

Scanning protocol
-----------------

There are several types of scanning functions (the :term:`scan method`
in an :term:`object format`, of type :c:type:`mps_fmt_scan_t`, and
root scanning functions of various types) but all take a :term:`scan
state` argument of type :c:type:`mps_ss_t`, and a description of a
region to be scanned. They must carry out the following steps:

1. Call the macro :c:func:`MPS_SCAN_BEGIN` on the scan state.

2. For each reference in the region:

   1. Call :c:func:`MPS_FIX1`, passing the scan state and the
      reference.

   2. If :c:func:`MPS_FIX1` returns false, the reference is not of
      interest to the MPS. Proceed to the next reference in the
      region.

   3. If :c:func:`MPS_FIX1` returns true, the reference is of interest
      to the MPS. Call :c:func:`MPS_FIX2`, passing the scan state and
      a pointer to a location containing the reference.

   4. If :c:func:`MPS_FIX2` returns a :term:`result code` other than
      :c:func:`MPS_RES_OK`, return this result code from the scanning
      function as soon as practicable.

   5. If :c:func:`MPS_FIX2` returns :c:macro:`MPS_RES_OK`, it may have
      updated the reference. If necessary, make sure that the updated
      reference is stored back to the region being scanned.

3. Call the macro :c:func:`MPS_SCAN_END` on the scan state.

4. Return :c:macro:`MPS_RES_OK`.

This description of the protocol simplifies a number of important
details, which are covered in the following sections.


.. index::
   pair: scanning; tagged reference

.. _topic-scanning-tag:

Tagged references
-----------------

If your references are :term:`tagged <tagged reference>` (or otherwise
"encrypted"), then you must remove the tag (or decrypt them) before
passing them to :c:func:`MPS_FIX1` and :c:func:`MPS_FIX2`.

The reference passed to :c:func:`MPS_FIX2` must be the address of the
base of the block referred to (unless the referent belongs to an
:term:`object format` with :term:`in-band headers`, in which case it
must be a reference to the address just after the header).

However, :c:func:`MPS_FIX1` allows some leeway: if you pass it a
reference to the interior of an allocated block, then
:c:func:`MPS_FIX1` correctly determines whether a reference to the
block is of interest to the MPS.

This means that if your tag is in the low bits of the reference, you
may not have to remove it before calling :c:func:`MPS_FIX1`. For
example, if you use three tag bits, then your reference is at most
*base* + 7, and if your objects are at least 8 bytes long, then the
reference is within the object and need not be stripped. So your code
might look like this::

    if (MPS_FIX1(ss, obj->ref)) {
        /* strip the tag */
        mps_addr_t p = obj->ref & ~0x7;
        mps_res_t res = MPS_FIX2(ss, &p);
        if (res != MPS_RES_OK) return res;
        /* restore the tag and update reference */
        mps_word_t tag = obj->ref & 0x7;
        obj->ref = (obj_t)((char *)p + tag);
    }

This saves the cost of stripping the tag in the case that ``obj->ref``
is not of interest to the MPS.

Similarly, if you use interior pointers, you do not need to convert
them to base pointers before calling :c:func:`MPS_FIX1` (or, indeed,
before calling :c:func:`MPS_FIX2`, if the target of the referent
belongs to an :term:`object format` with :term:`in-band headers`).


.. index::
   pair: scanning; critical path

Critical path
-------------

Scanning is an operation on the critical path of the MPS and so it is
vital that it runs fast. The scanning protocol is designed to ensure
that as much of the scanning code can be run inline in the client
program as possible. In particular, the macro :c:func:`MPS_FIX1` does
not need to call into the MPS.

The purpose of :c:func:`MPS_FIX1` is to provide a fast check as to
whether a reference is "of interest" to the MPS. It is legitimate to
call this on any word: it does not even have to be an address. So if
you have a mixture of references and non-references, it might turn out
to be faster to call :c:func:`MPS_FIX1` on each word before you even
determine whether or not the word is a reference.

Whether this is in fact an optimization depends on the proportion of
references to non-references, on how often genuine references turn out
to be "of interest", and what kind of code the compiler has
generated. There is no substitute for measurement.

See :ref:`topic-critical`.

.. note::

    In one application with a high proportion of :term:`unboxed`
    values, it turned out to be fastest to check the tag and reject
    non-references before calling :c:func:`MPS_FIX1`.

.. warning::

    If you passed a word that might not be a reference to
    :c:func:`MPS_FIX1`, and it returned true, this might be a false
    positive. You must be certain that the alleged reference is
    genuine as well as "of interest" before passing it to
    :c:func:`MPS_FIX2`.

Another technique that can speed up scanning is to segregate objects
into pools whose object formats contain different scan methods. In
particular, if you can segregate objects that do not contain any
references into :term:`leaf object` pools like :ref:`pool-amcz`, these
objects do not need to be scanned at all.


.. index::
   pair: scanning; ambiguous reference

Ambiguous references
--------------------

If the references in the object being scanned are :term:`ambiguous
<ambiguous reference>` then :c:func:`MPS_FIX2` does not update the
reference (because it can't know if it's a genuine reference). The MPS
handles an ambiguous reference by :term:`pinning` the block pointed to
so that it cannot move.

You could use this fact to optimize the scan by avoiding the need to
reassemble and store the updated reference after calling
:c:func:`MPS_FIX2`

.. note::

    The MPS currently has no pools that support ambiguous references,
    so this cannot arise for the :term:`scan method` in an
    :term:`object format`, but :term:`root` scanning functions may
    encounter this case.


.. index::
   pair: scanning; unfixed reference

Unfixed references
------------------

The MPS does not require you to :term:`fix` all your :term:`references`. But if a reference is not fixed:

1. it does not keep its target alive (this might be acceptable if you
   know that the target is being kept alive for another reason, for
   example if it is in a :term:`manually managed <manual memory
   management>` pool, or if there is always another reference to the
   target that *is* fixed);

2. it does not get updated if the target moves (this might be
   acceptable if you know that the target cannot move, for example if
   it is in a :term:`non-moving <non-moving memory manager>` pool, or
   if it is :term:`pinned <pinning>` by an :term:`ambiguous
   reference`).

These optimizations can be tricky to make correct, and can make the
system fragile (for example, it may break if you start using a
different :term:`pool class`), so it usually safest to fix all
references.


.. index::
   single: scanning; example
   single: Scheme; scanning

Example: Scheme objects
-----------------------

Scanning tends to be a repetitive procedure and so you'll find it is
usually helpful to define macros to reduce the size of the source
code. The MPS provides a convenience macro :c:func:`MPS_FIX12` for the
common case of calling :c:func:`MPS_FIX1` and then immediately calling
:c:func:`MPS_FIX2` if the reference is "of interest".

.. note::

    Some compilers generate better code if you use
    :c:func:`MPS_FIX12`, and some if you use :c:func:`MPS_FIX1` and
    :c:func:`MPS_FIX2`. There's no substitute for measurement.

Here's the macro ``FIX`` defined by the toy Scheme interpreter::

    #define FIX(ref)                                                        \
        do {                                                                \
            mps_addr_t _addr = (ref); /* copy to local to avoid type pun */ \
            mps_res_t res = MPS_FIX12(ss, &_addr);                          \
            if (res != MPS_RES_OK) return res;                              \
            (ref) = _addr;                                                  \
        } while(0)

.. note::

    The comment refers to a temptation to write non-portable code that
    presents itself here. :c:func:`MPS_FIX2` takes a pointer to a
    location containing the reference (an argument of type
    ``mps_addr_t *``). It is tempting to take the address of the
    reference and cast it to this type. The behaviour of such a cast
    is not defined by the C standard. See :ref:`topic-interface-pun`.

Here's the Scheme scanner::

    static mps_res_t obj_scan(mps_ss_t ss, mps_addr_t base, mps_addr_t limit)
    {
        MPS_SCAN_BEGIN(ss) {
            while (base < limit) {
                obj_t obj = base;
                switch (obj->type.type) {
                    case TYPE_PAIR:
                        FIX(obj->pair.car);
                        FIX(obj->pair.cdr);
                        base = (char *)base + ALIGN(sizeof(pair_s));
                        break;
                    case TYPE_VECTOR: {
                        size_t i;
                        for (i = 0; i < obj->vector.length; ++i)
                            FIX(obj->vector.vector[i]);
                        base = (char *)base +
                            ALIGN(offsetof(vector_s, vector) +
                                  obj->vector.length * sizeof(obj->vector.vector[0]));
                        break;
                    }
                    /* ... and so on for the other types ... */
                    default:
                        assert(0);
                        fprintf(stderr, "Unexpected object on the heap\n");
                        abort();
                        return MPS_RES_FAIL;
                }
            }
        } MPS_SCAN_END(ss);
        return MPS_RES_OK;
    }

.. note::

    This scanner is a simple example intended to make the process
    clear to the reader. The scanning code and the object layout are
    not at all optimized.


.. index::
   single: scanning; interface

Scanning interface
------------------

.. c:type:: mps_ss_t

    The type of :term:`scan states`.

    A scan state represents the state of the current :term:`scan`. The
    MPS passes a scan state to the :term:`scan method` of an
    :term:`object format` when it needs to :term:`scan` for
    :term:`references` within a region of memory. The scan
    method must pass the scan state to :c:func:`MPS_SCAN_BEGIN` and
    :c:func:`MPS_SCAN_END` to delimit a sequence of fix operations,
    and to the functions :c:func:`MPS_FIX1`, :c:func:`MPS_FIX2` and
    :c:func:`MPS_FIX12` when fixing a :term:`reference`.


.. c:function:: MPS_SCAN_BEGIN(mps_ss_t ss)

    Within a :term:`scan method`, set up local information required
    by :c:func:`MPS_FIX1`, :c:func:`MPS_FIX2` and
    :c:func:`MPS_FIX12`. The local information persists until
    :c:func:`MPS_SCAN_END`.

    ``ss`` is the :term:`scan state` that was passed to the scan method.

    .. note::

        Between :c:func:`MPS_SCAN_BEGIN` and :c:func:`MPS_SCAN_END`,
        the scan state is in a special state, and must not be passed
        to a function. If you really need to do so, for example
        because you have an embedded structure shared between two scan
        methods, you must wrap the call with :c:func:`MPS_FIX_CALL` to
        ensure that the scan state is passed correctly.


.. c:function:: MPS_SCAN_END(mps_ss_t ss)

    Within a :term:`scan method`, terminate a block started by
    :c:func:`MPS_SCAN_BEGIN`.

    ``ss`` is the :term:`scan state` that was passed to the scan
    method.

    .. note::

        :c:func:`MPS_SCAN_END` ensures that the scan is completed, so
        successful termination of a scan must invoke it. However, in
        case of an error it is allowed to return from the scan
        method without invoking :c:func:`MPS_SCAN_END`.

    .. note::

        Between :c:func:`MPS_SCAN_BEGIN` and :c:func:`MPS_SCAN_END`, the
        scan state is in a special state, and must not be passed to a
        function. If you really need to do so, for example because you
        have an embedded structure shared between two scan methods, you
        must wrap the call with :c:func:`MPS_FIX_CALL` to ensure that the
        scan state is passed correctly.


.. c:function:: MPS_FIX_CALL(ss, call)

    Call a function from within a :term:`scan method`, between
    :c:func:`MPS_SCAN_BEGIN` and :c:func:`MPS_SCAN_END`, passing
    the :term:`scan state` correctly.

    ``ss`` is the scan state that was passed to the scan method.

    ``call`` is an expression containing a function call where ``ss``
    is one of the arguments.

    Returns the result of evaluating the expression ``call``.

    Between :c:func:`MPS_SCAN_BEGIN` and :c:func:`MPS_SCAN_END`, the
    scan state is in a special state, and must not be passed to a
    function. If you really need to do so, for example because you
    have a structure shared between two :term:`object formats`, you
    must wrap the call with :c:func:`MPS_FIX_CALL` to ensure that the
    scan state is passed correctly.

    In example below, the scan method ``obj_scan`` fixes the object's
    ``left`` and ``right`` references, but delegates the scanning of
    references inside the object's ``data`` member to the function
    ``data_scan``. In order to ensure that the scan state is passed
    correctly to ``data_scan``, the call must be wrapped in
    :c:func:`MPS_FIX_CALL`. ::

        mps_res_t obj_scan(mps_ss_t ss, mps_addr_t base, mps_addr_t limit)
        {
            obj_t obj;
            mps_res_t res;
            MPS_SCAN_BEGIN(ss) {
                for (obj = base; obj < limit; obj++) {
                    if (MPS_FIX12(ss, &obj->left) != MPS_RES_OK)
                        return res;
                    MPS_FIX_CALL(ss, res = data_scan(ss, &obj->data));
                    if (res != MPS_RES_OK)
                        return res;
                    if (MPS_FIX12(ss, &obj->right) != MPS_RES_OK)
                        return res;
                }
            } MPS_SCAN_END(ss);
            return MPS_RES_OK;
        }

    .. warning::

         Use of :c:func:`MPS_FIX_CALL` is best avoided, as it forces
         values out of registers. The gains in simplicity of the code
         need to be measured against the loss in performance.


.. index::
   single: scanning; fixing
   single: fixing; interface

Fixing interface
----------------

.. c:function:: mps_bool_t MPS_FIX1(mps_ss_t ss, mps_addr_t ref)

    Determine whether a :term:`reference` needs to be passed to
    :c:func:`MPS_FIX2`.

    ``ss`` is the :term:`scan state` that was passed to the
    :term:`scan method`.

    ``ref`` is the reference.

    Returns a truth value (:c:type:`mps_bool_t`) indicating whether
    ``ref`` is "interesting" to the MPS. If it returns true, the scan
    method must invoke :c:func:`MPS_FIX2` to :term:`fix` ``ref``.

    This macro must only be used within a :term:`scan method`, between
    :c:func:`MPS_SCAN_BEGIN` and :c:func:`MPS_SCAN_END`.

    .. note::

        If your reference is :term:`tagged <tagged reference>` or
        otherwise "encrypted", you must ensure that it points to a
        location within the target block before calling
        :c:func:`MPS_FIX1`. (Therefore, a small tag in the low bits
        need not be stripped.)

    .. note::

        In the case where the scan method does not need to do anything
        between :c:func:`MPS_FIX1` and :c:func:`MPS_FIX2`, you can use
        the convenience macro :c:func:`MPS_FIX12`.


.. c:function:: mps_res_t MPS_FIX12(mps_ss_t ss, mps_addr_t *ref_io)

    :term:`Fix` a :term:`reference`.

    This macro is a convenience for the case where :c:func:`MPS_FIX1`
    is immediately followed by :c:func:`MPS_FIX2`. The interface is
    the same as :c:func:`MPS_FIX2`.


.. c:function:: mps_res_t MPS_FIX2(mps_ss_t ss, mps_addr_t *ref_io)

    :term:`Fix` a :term:`reference`.

    ``ss`` is the :term:`scan state` that was passed to the scan method.

    ``ref_io`` points to the reference.

    Returns :c:macro:`MPS_RES_OK` if successful: in this case the
    reference may have been updated, and the scan method must continue
    to scan the :term:`block`. If it returns any other result, the
    :term:`scan method` must return that result as soon as possible,
    without fixing any further references.

    This macro must only be used within a :term:`scan method`, between
    :c:func:`MPS_SCAN_BEGIN` and :c:func:`MPS_SCAN_END`.

    .. note::

        If your reference is :term:`tagged <tagged reference>` (or
        otherwise "encrypted"), you must remove the tag (or otherwise
        decrypt the reference) before calling :c:func:`MPS_FIX2`, and
        restore the tag to the (possibly updated) reference
        afterwards.

        The only exception is for references to objects belonging to a
        format with :term:`in-band headers`: the header size must not
        be subtracted from these references.

    .. note::

        In the case where the scan method does not need to do anything
        between :c:func:`MPS_FIX1` and :c:func:`MPS_FIX2`, you can use
        the convenience macro :c:func:`MPS_FIX12`.


.. c:function:: mps_res_t mps_fix(mps_ss_t ss, mps_addr_t *ref_io)

    .. deprecated:: starting with version 1.111.

        Use :c:func:`MPS_FIX1` and :c:func:`MPS_FIX2` instead.

    :term:`Fix` a :term:`reference`.

    This is a function equivalent to::

        MPS_SCAN_BEGIN(ss);
        MPS_FIX12(ss, ref_io);
        MPS_SCAN_END(ss);

    Because :term:`scanning <scan>` is an operation on the
    :term:`critical path`, we recommend that you use
    :c:func:`MPS_FIX12` (or :c:func:`MPS_FIX1` and :c:func:`MPS_FIX2`)
    to ensure that the "stage 1 fix" is inlined.

    .. note::

        If you call this between :c:func:`MPS_SCAN_BEGIN` and
        :c:func:`MPS_SCAN_END`, you must use :c:func:`MPS_FIX_CALL` to
        ensure that the scan state is passed correctly.