
.. index::
   pair: garbage collection messages; design

.. _design-message-gc:


GC messages
===========

.. mps:prefix:: design.mps.message-gc


Introduction
------------

:mps:tag:`intro` This document describes the design of the MPS garbage
collection messages. For a guide to the MPS message system in general,
see design.mps.message.

:mps:tag:`readership` Any MPS developer.


Overview
--------

The MPS provides two types of GC messages:

- :c:func:`mps_message_type_gc_start()`;
- :c:func:`mps_message_type_gc()`.

They are called "trace start" and "trace end" messages in this
document and in most MPS source code.


Introduction
------------

The MPS posts a trace start message (:c:func:`mps_message_type_gc_start()`)
near the start of every trace (but after calculating the condemned
set, so we can report how large it is).

The MPS posts a trace end message (:c:func:`mps_message_type_gc()`) near the
end of every trace.

These messages are extremely flexible: they can carry arbitrary
additional data, exposed simply by writing new accessor functions. If
there is more data to report at either of these two events, then there
is a good argument for adding it to these existing messages.

.. note::

    In previous versions of this design document, there was a partial
    unimplemented design for an :c:func:`mps_message_type_gc_generation()`
    message. This would not have been a good design, because managing
    and collating multiple messages is much more complex for both MPS
    and client than using a single message. Richard Kistruck,
    2008-12-19.


Purpose
-------

:mps:tag:`purpose` The purpose of these messages is to allow the client
program to be aware of GC activity, in order to:

- adjust its own behaviour programmatically;
- show or report GC activity in a custom way, such as an in-client
  display, in a log file, etc.

The main message content should be intelligible and helpful to
client-developers (with help from MPS staff if necessary). There may
be extra content that is only meaningful to MPS staff, to help us
diagnose client problems.

While there is some overlap with the Diagnostic Feedback system
(design.mps.diag) and the Telemetry system (design.mps.telemetry), the
main contrasts are that these GC messages are present in release
builds, are stable from release to release, and are designed to be
parsed by the client program.


Names and parts
---------------

Implementation is mostly in the source file ``traceanc.c`` (trace
ancillary).

Here's a helpful list of the names used in the GC message system:

============================= ============================== =======================
Internal name                 "trace start"                  "trace end"
Internal type                 ``TraceStartMessage``          ``TraceMessage``
:c:type:`ArenaStruct` member  ``tsMessage[]``                ``tMessage[]``
Message type                  ``MessageTypeGCSTART``         ``MessageTypeGC``
External name                 ``mps_message_type_gc_start``  ``mps_message_type_gc``
============================= ============================== =======================

.. note::

    The names of these messages are unconventional; they should
    properly be called "gc (or trace) *begin*" and "gc (or trace)
    *end*". But it's much too late to change them now. Richard
    Kistruck, 2008-12-15.

Collectively, the trace-start and trace-end messages are called the
"trace id messages", and they are managed by the functions
:c:func:`TraceIdMessagesCheck()`, :c:func:`TraceIdMessagesCreate()`,
and :c:func:`TraceIdMessagesDestroy()`.

The currently supported message-field accessor methods are:
:c:func:`mps_message_gc_start_why()`, :c:func:`mps_message_gc_live_size()`,
:c:func:`mps_message_gc_condemned_size()`, and
:c:func:`mps_message_gc_not_condemned_size()`. These are documented in the
Reference Manual.


Lifecycle
---------

:mps:tag:`lifecycle` For each trace id, pre-allocate a pair of start/end
messages by calling :c:func:`ControlAlloc()`. Then, when a trace runs using
that trace id, fill in and post these messages. As soon as the trace
has posted both messages, immediately pre-allocate a new pair of
messages, which wait in readiness for the next trace to use that
trace id.


Requirements
............

:mps:tag:`req.no-start-alloc` Should avoid attempting to allocate memory at
trace start time. :mps:tag:`req.no-start-alloc.why` There may be no free
memory at trace start time. Client would still like to hear about
collections in those circumstances.

:mps:tag:`req.queue` Must support a client that enables, but does not
promptly retrieve, GC messages. Messages that have not yet been
retrieved must remain queued, and the client must be able to retrieve
them later without loss. It is not acceptable to stop issuing GC
messages for subsequent collections merely because messages from
previous collections have not yet been retrieved. :mps:tag:`req.queue.why`
This is because there is simply no reasonable way for a client to
guarantee that it always promptly collects GC messages.

:mps:tag:`req.match` Start and end messages should always match up: never
post one of the messages but fail to post the matching one.

:mps:tag:`req.match.why` This makes client code much simpler -- it does not
have to handle mismatching messages.

:mps:tag:`req.errors-not-direct` Errors (such as a :c:func:`ControlAlloc()`
failure) cannot be reported directly to the client, because
collections often happen automatically, without an explicit client
call to the MPS interface.

:mps:tag:`req.multi-trace` Up to ``TraceLIMIT`` traces may be running, and
emitting start/end messages, simultaneously.

:mps:tag:`req.early` Nice to tell client as much as possible about the
collection in the start message, if we can.

:mps:tag:`req.similar` Start and end messages are conceptually similar -- it
is quite okay, and may be helpful to the client, for the same datum
(for example: the reason why the collection occurred) to be present in
both the start and end message.


Storage
.......

For each trace-id (:mps:ref:`.req.multi-trace`) a pair (:mps:ref:`.req.match`) of
start/end messages is dynamically allocated (:mps:ref:`.req.queue`) in advance
(:mps:ref:`.req.no-start-alloc`). Messages are allocated in the control pool
using :c:func:`ControlAlloc()`.

.. note::

    Previous implementations of the trace start message used static
    allocation. This does not satisfy :mps:ref:`.req.queue`. See also
    job001570_. Richard Kistruck, 2008-12-15.

    .. _job001570: http://www.ravenbrook.com/project/mps/issue/job001570/

Pointers to these messages are stored in ``tsMessage[ti]`` and
``tMessage[ti]`` arrays in the :c:type:`ArenaStruct`.

.. note::

    We must not keep the pre-allocated messages, or pointers to them,
    in :c:type:`TraceStruct`: the memory for these structures is statically
    allocated, but the values in them are re-initialised by
    :c:func:`TraceCreate()` each time the trace id is used, so the
    :c:type:`TraceStruct` is invalid (that is: to be treated as random
    uninitialised memory) when not being used by a trace. See also
    job001989_. Richard Kistruck, 2008-12-15.

    .. _job001989: http://www.ravenbrook.com/project/mps/issue/job001989/


Creating and Posting
....................

In :c:func:`ArenaCreate()` we use :c:macro:`TRACE_SET_ITER` to initialise the
``tsMessage[ti]`` and ``tMessage[ti]`` pointers to :c:macro:`NULL`, and then
(when the control pool is ready) use :c:macro:`TRACE_SET_ITER` again to call
:c:func:`TraceIdMessagesCreate()`. This performs the initial pre-allocation
of the trace start/end messages for each trace id. Allocation failure
is not tolerated here: it makes :c:func:`ArenaCreate()` fail with an error
code, because the arena is deemed to be unreasonably small.

When a trace is running using trace id ``ti``, it finds a
pre-allocated message via ``tsMessage[ti]`` or ``tMessage[ti]`` in the
:c:type:`ArenaStruct`, fills in and posts the message, and nulls-out the
pointer. (If the pointer was null, no message is sent; see below.) The
message is now reachable only from the arena message queue (but the
control pool also knows about it).

When the trace completes, it calls :c:func:`TraceIdMessagesCreate()` for its
trace id. This performs the ongoing pre-allocation of the trace
start/end messages for the next use of this trace id. The expectation
is that, after a trace has completed, some memory will have been
reclaimed, and the :c:func:`ControlAlloc()` will succeed.

But allocation failure here is permitted: if it happens, both the
start and the end messages are freed (if present). This means that,
for the next collection using this trace id, neither a start nor an
end message will be sent (:mps:ref:`.req.match`). There is no direct way to
report this failure to the client (:mps:ref:`.req.errors-not-direct`), so we
just increment the ``droppedMessages`` counter in the :c:type:`ArenaStruct`.
This counter is available via the ``MessagesDropped`` telemetry event.
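
The pre-allocation scheme described above can be sketched as a small
self-contained toy model. This is not the MPS implementation: every
name beginning with ``toy`` or ``Toy`` is hypothetical, ``malloc()``
stands in for :c:func:`ControlAlloc()`, the "queue" merely counts
posted messages, and the ``allocFails`` flag is an assumption added to
simulate an exhausted control pool. The sketch assumes the counter is
incremented once per failed pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: hypothetical names, not the MPS implementation. */

enum { TOY_TRACE_LIMIT = 2 };

typedef struct ToyMessage { int ti; bool isStart; } ToyMessage;

typedef struct ToyArena {
  ToyMessage *tsMessage[TOY_TRACE_LIMIT]; /* pre-allocated trace start */
  ToyMessage *tMessage[TOY_TRACE_LIMIT];  /* pre-allocated trace end */
  unsigned droppedMessages;               /* failed pre-allocations */
  unsigned posted;                        /* messages handed to the queue */
  bool allocFails;                        /* simulate an exhausted pool */
} ToyArena;

/* Stand-in for ControlAlloc(): may fail. */
static ToyMessage *toyAlloc(ToyArena *arena, int ti, bool isStart)
{
  ToyMessage *m;
  if (arena->allocFails) return NULL;
  m = malloc(sizeof *m);
  if (m != NULL) { m->ti = ti; m->isStart = isStart; }
  return m;
}

/* Pre-allocate both messages for trace id ti, or neither. */
static bool toyTraceIdMessagesCreate(ToyArena *arena, int ti)
{
  ToyMessage *start, *end;
  /* Precondition: both slots are empty (posted, or never filled). */
  assert(arena->tsMessage[ti] == NULL && arena->tMessage[ti] == NULL);
  start = toyAlloc(arena, ti, true);
  end = toyAlloc(arena, ti, false);
  if (start == NULL || end == NULL) {
    free(start);               /* free(NULL) is a no-op */
    free(end);
    arena->droppedMessages++;  /* the only record of the failure */
    return false;
  }
  arena->tsMessage[ti] = start;
  arena->tMessage[ti] = end;
  return true;
}

/* Post the pre-allocated message in *slot, if any, and null the slot.
 * The toy queue just counts and frees the message. */
static bool toyPost(ToyArena *arena, ToyMessage **slot)
{
  ToyMessage *m = *slot;
  if (m == NULL) return false; /* nothing pre-allocated: send nothing */
  *slot = NULL;
  arena->posted++;
  free(m);
  return true;
}

/* One whole trace using trace id ti; returns messages posted (0 or 2). */
static unsigned toyRunTrace(ToyArena *arena, int ti)
{
  unsigned n = 0;
  n += toyPost(arena, &arena->tsMessage[ti]);        /* trace start */
  n += toyPost(arena, &arena->tMessage[ti]);         /* trace end */
  (void)toyTraceIdMessagesCreate(arena, ti);         /* ready for next use */
  return n;
}
```

In this model a trace posts either both messages or neither, because
the slots are only ever filled or emptied as a pair. A failed
pre-allocation silences the *next* trace on that trace id, not the
current one.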


Getting and discarding
......................

If the client has not enabled that message type, the message is
discarded immediately when posted, calling :c:func:`ControlFree()` and
reclaiming the memory.

If the client has enabled the message type but never gets the
message, it remains on the message queue until
:c:func:`ArenaDestroy()`. Theoretically these
messages could accumulate forever until they exhaust memory. This is
intentional: the client should not enable a message type and then
never get it!

Otherwise, when the client gets a message, it is dropped from the
arena message queue: now only the client (and the control pool) hold
references to it. The client must call :c:func:`mps_message_discard()` once
it has finished using the message. This calls :c:func:`ControlFree()` and
reclaims the memory.

If the client simply drops its reference, the memory will not be
reclaimed until :c:func:`ArenaDestroy()`. This is intentional: the control
pool is not garbage-collected.
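
The post, get, and discard rules above can be illustrated with a toy
queue. This is a sketch under stated assumptions, not MPS code: all
``toy``/``Toy`` names are hypothetical, ``malloc()`` and ``free()``
stand in for :c:func:`ControlAlloc()` and :c:func:`ControlFree()`, and
the ``live`` counter is an assumption added so that the effect of each
operation on memory can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: hypothetical names, not the MPS implementation. */

enum { TOY_TYPE_GC, TOY_TYPE_GC_START, TOY_TYPE_LIMIT };

typedef struct ToyMessage {
  struct ToyMessage *next;  /* link in the arena message queue */
  int type;
} ToyMessage;

typedef struct ToyArena {
  ToyMessage *queue;             /* posted, not yet retrieved (FIFO) */
  bool enabled[TOY_TYPE_LIMIT];  /* which types the client enabled */
  size_t live;                   /* allocated and not yet freed */
} ToyArena;

/* Stand-in for ControlAlloc(); the toy gives up on out-of-memory. */
static ToyMessage *toyMessageAlloc(ToyArena *arena, int type)
{
  ToyMessage *m = malloc(sizeof *m);
  if (m == NULL) abort();
  m->next = NULL;
  m->type = type;
  arena->live++;
  return m;
}

/* Stand-in for ControlFree(). */
static void toyMessageFree(ToyArena *arena, ToyMessage *m)
{
  arena->live--;
  free(m);
}

/* Post: discard at once if the type is not enabled, else append. */
static void toyPost(ToyArena *arena, ToyMessage *m)
{
  ToyMessage **link;
  if (!arena->enabled[m->type]) {
    toyMessageFree(arena, m);
    return;
  }
  for (link = &arena->queue; *link != NULL; link = &(*link)->next)
    ;  /* walk to the end of the queue */
  m->next = NULL;
  *link = m;
}

/* Get: unlink the oldest queued message of the given type, if any.
 * The message stays allocated; the caller now holds the reference. */
static ToyMessage *toyGet(ToyArena *arena, int type)
{
  ToyMessage **link;
  for (link = &arena->queue; *link != NULL; link = &(*link)->next) {
    if ((*link)->type == type) {
      ToyMessage *m = *link;
      *link = m->next;
      m->next = NULL;
      return m;
    }
  }
  return NULL;
}

/* Discard: the client has finished with the message; free it. */
static void toyDiscard(ToyArena *arena, ToyMessage *m)
{
  toyMessageFree(arena, m);
}
```

Note that ``toyGet()`` does not change ``live``: only ``toyDiscard()``
(or a post to a disabled type) releases memory, which is why a client
that drops its reference without discarding leaks the message until
the arena is destroyed.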


Final clearup
.............

Final clearup is performed at :c:func:`ArenaDestroy()`, as follows:

- Unused and unsent pre-allocated messages (one per trace id) are
  freed with :c:macro:`TRACE_SET_ITER` calling :c:func:`TraceIdMessagesDestroy()`
  which calls the message Delete functions (and thereby
  :c:func:`ControlFree()`) on anything left in ``tsMessage[ti]`` and
  ``tMessage[ti]``.

- Unretrieved messages are freed by emptying the arena message queue
  with :c:func:`MessageEmpty()`.

- Retrieved but undiscarded messages are freed by destroying the
  control pool.
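
The three clear-up steps can be sketched with a toy model in which the
"control pool" remembers every block it has handed out. This is an
illustration only: all ``toy``/``Toy`` names are hypothetical, a single
trace id is modelled, and the ``freed[]`` counts are an assumption
added to show which step reclaims which message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: hypothetical names, not the MPS implementation. */

typedef struct ToyMessage {
  struct ToyMessage *poolNext;   /* every live block is on the pool list */
  struct ToyMessage *queueNext;  /* link in the arena message queue */
} ToyMessage;

typedef struct ToyArena {
  ToyMessage *pool;     /* all blocks allocated from the toy control pool */
  ToyMessage *queue;    /* posted but unretrieved messages */
  ToyMessage *slot[2];  /* pre-allocated start/end for one trace id */
} ToyArena;

/* Stand-in for ControlAlloc(); the toy gives up on out-of-memory. */
static ToyMessage *toyControlAlloc(ToyArena *arena)
{
  ToyMessage *m = malloc(sizeof *m);
  if (m == NULL) abort();
  m->queueNext = NULL;
  m->poolNext = arena->pool;
  arena->pool = m;
  return m;
}

/* Stand-in for ControlFree(); m must have come from this pool. */
static void toyControlFree(ToyArena *arena, ToyMessage *m)
{
  ToyMessage **link = &arena->pool;
  while (*link != m)
    link = &(*link)->poolNext;
  *link = m->poolNext;
  free(m);
}

/* Clear up in the three steps listed above; freed[i] counts the
 * blocks reclaimed by step i+1. */
static void toyArenaDestroy(ToyArena *arena, size_t freed[3])
{
  size_t i;
  freed[0] = freed[1] = freed[2] = 0;

  /* 1. Unused, unsent pre-allocated messages. */
  for (i = 0; i < 2; ++i) {
    if (arena->slot[i] != NULL) {
      toyControlFree(arena, arena->slot[i]);
      arena->slot[i] = NULL;
      ++freed[0];
    }
  }

  /* 2. Unretrieved messages: empty the message queue. */
  while (arena->queue != NULL) {
    ToyMessage *m = arena->queue;
    arena->queue = m->queueNext;
    toyControlFree(arena, m);
    ++freed[1];
  }

  /* 3. Retrieved but undiscarded messages: destroy the pool itself,
   *    which frees whatever is still allocated from it. */
  while (arena->pool != NULL) {
    ToyMessage *m = arena->pool;
    arena->pool = m->poolNext;
    free(m);
    ++freed[2];
  }
}
```

The third step needs no reference to the leaked message: in this model
the pool alone knows about it, which mirrors the remark above that the
control pool "also knows about" every message.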


Testing
-------

The main test is "``zmess.c``". See notes there.

Various other tests, including ``amcss.c``, also collect and report
:c:func:`mps_message_type_gc()` and :c:func:`mps_message_type_gc_start()`.


Coverage
........

Current tests do not check:

- The less common why-codes (reasons why a trace starts). These should
  be added to ``zmess.c``.