GC messages

author Richard Kistruck
date 2008-12-19
index terms pair: garbage collection messages; design
revision //info.ravenbrook.com/project/mps/custom/cet/branch/2014-10-26/sc/design/message-gc.txt#1
status incomplete design
tag design.mps.message-gc

Introduction

.intro: This document describes the design of the MPS garbage collection messages. For a guide to the MPS message system in general, see design.mps.message.

.readership: Any MPS developer.

Overview

The MPS provides two types of GC messages:

  • mps_message_type_gc_start();
  • mps_message_type_gc().

They are called "trace start" and "trace end" messages in this document and in most MPS source code.

Introduction

The MPS posts a trace start message (mps_message_type_gc_start()) near the start of every trace (but after calculating the condemned set, so we can report how large it is).

The MPS posts a trace end message (mps_message_type_gc()) near the end of every trace.

These messages are extremely flexible: new data can be exposed to the client simply by writing new accessor functions. If there is more data to report at either of these two events, then there is a good argument for adding it to these existing messages.

Note

In previous versions of this design document, there was a partial unimplemented design for an mps_message_type_gc_generation() message. This would not have been a good design, because managing and collating multiple messages is much more complex for both MPS and client than using a single message. Richard Kistruck, 2008-12-19.

Purpose

.purpose: The purpose of these messages is to allow the client program to be aware of GC activity, in order to:

  • adjust its own behaviour programmatically;
  • show or report GC activity in a custom way, such as an in-client display, in a log file, etc.

The main message content should be intelligible and helpful to client-developers (with help from MPS staff if necessary). There may be extra content that is only meaningful to MPS staff, to help us diagnose client problems.

While there is some overlap with the Diagnostic Feedback system (design.mps.diag) and the Telemetry system (design.mps.telemetry), the main contrasts are that these GC messages are present in release builds, are stable from release to release, and are designed to be parsed by the client program.

Names and parts

Implementation is mostly in the source file traceanc.c (trace ancillary).

Here's a helpful list of the names used in the GC message system:

Internal name        "trace start"                "trace end"
Internal type        TraceStartMessage            TraceMessage
ArenaStruct member   tsMessage[]                  tMessage[]
Message type         MessageTypeGCSTART           MessageTypeGC
External name        mps_message_type_gc_start    mps_message_type_gc

Note

The names of these messages are unconventional; they should properly be called "gc (or trace) begin" and "gc (or trace) end". But it's much too late to change them now. Richard Kistruck, 2008-12-15.

Collectively, the trace-start and trace-end messages are called the "trace id messages", and they are managed by the functions TraceIdMessagesCheck(), TraceIdMessagesCreate(), and TraceIdMessagesDestroy().

The currently supported message-field accessor methods are: mps_message_gc_start_why(), mps_message_gc_live_size(), mps_message_gc_condemned_size(), and mps_message_gc_not_condemned_size(). These are documented in the Reference Manual.

Lifecycle

.lifecycle: for each trace id, pre-allocate a pair of start/end messages by calling ControlAlloc(). Then, when a trace runs using that trace id, fill in and post these messages. As soon as the trace has posted both messages, immediately pre-allocate a new pair of messages, which wait in readiness for the next trace to use that trace id.
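The lifecycle can be sketched as a toy model. This is not the code in traceanc.c: ToyArena, ToyMessage, TRACE_LIMIT, toy_messages_create(), toy_post() and toy_trace_run() are hypothetical names, calloc() stands in for ControlAlloc(), and there is no real message queue (a posted message is just counted and released).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are the real MPS definitions. */
enum { TRACE_LIMIT = 2 };                /* plays the role of TraceLIMIT */

typedef struct { int posted; } ToyMessage;

typedef struct {
    ToyMessage *tsMessage[TRACE_LIMIT];  /* pre-allocated trace start messages */
    ToyMessage *tMessage[TRACE_LIMIT];   /* pre-allocated trace end messages */
    size_t postedCount;                  /* how many messages have been posted */
} ToyArena;

/* Pre-allocate the start/end pair for trace id ti.
   calloc() stands in for ControlAlloc(). Returns 1 on success, 0 on failure;
   on failure the slots are left untouched. */
static int toy_messages_create(ToyArena *arena, int ti)
{
    ToyMessage *start = calloc(1, sizeof *start);
    ToyMessage *end = calloc(1, sizeof *end);
    if (start == NULL || end == NULL) {
        free(start);
        free(end);
        return 0;
    }
    arena->tsMessage[ti] = start;
    arena->tMessage[ti] = end;
    return 1;
}

/* Post the message held in *slot (if any) and null out the slot. */
static void toy_post(ToyArena *arena, ToyMessage **slot)
{
    ToyMessage *m = *slot;
    if (m == NULL)
        return;                /* nothing pre-allocated: send nothing */
    *slot = NULL;
    m->posted = 1;
    arena->postedCount++;
    free(m);                   /* the toy has no queue, so release it right away */
}

/* One trace using trace id ti: post start, post end, then refill the pair. */
static void toy_trace_run(ToyArena *arena, int ti)
{
    toy_post(arena, &arena->tsMessage[ti]);   /* trace start */
    /* ... collection work would happen here ... */
    toy_post(arena, &arena->tMessage[ti]);    /* trace end */
    (void)toy_messages_create(arena, ti);     /* ready for the next trace */
}
```

The point of the sketch is that no allocation happens between the two posts: everything a trace needs in order to report itself already exists before the trace begins.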

Requirements

.req.no-start-alloc: Should avoid attempting to allocate memory at trace start time. .req.no-start-alloc.why: There may be no free memory at trace start time. Client would still like to hear about collections in those circumstances.

.req.queue: Must support a client that enables, but does not promptly retrieve, GC messages. Messages that have not yet been retrieved must remain queued, and the client must be able to retrieve them later without loss. It is not acceptable to stop issuing GC messages for subsequent collections merely because messages from previous collections have not yet been retrieved. .req.queue.why: This is because there is simply no reasonable way for a client to guarantee that it always promptly collects GC messages.

.req.match: Start and end messages should always match up: never post one of the messages but fail to post the matching one.

.req.match.why: This makes client code much simpler -- it does not have to handle mismatching messages.

.req.errors-not-direct: Errors (such as a ControlAlloc() failure) cannot be reported directly to the client, because collections often happen automatically, without an explicit client call to the MPS interface.

.req.multi-trace: Up to TraceLIMIT traces may be running, and emitting start/end messages, simultaneously.

.req.early: Nice to tell client as much as possible about the collection in the start message, if we can.

.req.similar: Start and end messages are conceptually similar -- it is quite okay, and may be helpful to the client, for the same datum (for example: the reason why the collection occurred) to be present in both the start and end message.

Storage

For each trace-id (.req.multi-trace) a pair (.req.match) of start/end messages is dynamically allocated (.req.queue) in advance (.req.no-start-alloc). Messages are allocated in the control pool using ControlAlloc().

Note

Previous implementations of the trace start message used static allocation. This does not satisfy .req.queue. See also job001570. Richard Kistruck, 2008-12-15.

Pointers to these messages are stored in the tsMessage[ti] and tMessage[ti] arrays in the ArenaStruct.

Note

We must not keep the pre-allocated messages, or pointers to them, in TraceStruct: the memory for these structures is statically allocated, but the values in them are re-initialised by TraceCreate() each time the trace id is used, so the TraceStruct is invalid (that is: to be treated as random uninitialised memory) when not being used by a trace. See also job001989. Richard Kistruck, 2008-12-15.

Creating and Posting

In ArenaCreate() we use TRACE_SET_ITER to initialise the tsMessage[ti] and tMessage[ti] pointers to NULL. Then, once the control pool is ready, a second TRACE_SET_ITER loop calls TraceIdMessagesCreate() for each trace id. This performs the initial pre-allocation of the trace start/end messages for each trace id. Allocation failure is not tolerated here: it makes ArenaCreate() fail with an error code, because the arena is deemed to be unreasonably small.

When a trace is running using trace id ti, it finds a pre-allocated message via tsMessage[ti] or tMessage[ti] in the ArenaStruct, fills in and posts the message, and nulls out the pointer. (If the pointer was null, no message is sent; see below.) The message is now reachable only from the arena message queue (but the control pool also knows about it).

When the trace completes, it calls TraceIdMessagesCreate() for its trace id. This performs the ongoing pre-allocation of the trace start/end messages for the next use of this trace id. The expectation is that, after a trace has completed, some memory will have been reclaimed, and the ControlAlloc() will succeed.

But allocation failure here is permitted: if it happens, both the start and the end messages are freed (if present). This means that, for the next collection using this trace id, neither a start nor an end message will be sent (.req.match). There is no direct way to report this failure to the client (.req.errors-not-direct), so we just increment the droppedMessages counter in the ArenaStruct. This counter is available via the MessagesDropped telemetry event.
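The both-or-neither rule can be illustrated with a toy model for a single trace id. Everything here is hypothetical: ToyArena, the allocsLeft fault-injection field, toy_alloc(), toy_messages_create() and toy_trace_post_both() are invented for the sketch, calloc() stands in for ControlAlloc(), and counting one dropped event per failed refill is an assumption rather than a statement about the real counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { int unused; } ToyMessage;

typedef struct {
    ToyMessage *tsMessage;          /* pre-allocated start message, or NULL */
    ToyMessage *tMessage;           /* pre-allocated end message, or NULL */
    unsigned long droppedMessages;  /* failed refills (assumed granularity) */
    int allocsLeft;                 /* hypothetical fault injection: number of
                                       allocations that will still succeed */
} ToyArena;

/* Stand-in for ControlAlloc(): fails once allocsLeft reaches zero. */
static ToyMessage *toy_alloc(ToyArena *arena)
{
    if (arena->allocsLeft <= 0)
        return NULL;
    arena->allocsLeft--;
    return calloc(1, sizeof(ToyMessage));
}

/* Refill the pair. If either message is missing afterwards, free both and
   record the failure, so that a start is never sent without its end. */
static void toy_messages_create(ToyArena *arena)
{
    if (arena->tsMessage == NULL)
        arena->tsMessage = toy_alloc(arena);
    if (arena->tMessage == NULL)
        arena->tMessage = toy_alloc(arena);
    if (arena->tsMessage == NULL || arena->tMessage == NULL) {
        free(arena->tsMessage);     /* free(NULL) is a no-op */
        free(arena->tMessage);
        arena->tsMessage = NULL;
        arena->tMessage = NULL;
        arena->droppedMessages++;
    }
}

/* Post whatever is pre-allocated; returns the number of messages posted.
   The toy simply releases a posted message instead of queueing it. */
static int toy_trace_post_both(ToyArena *arena)
{
    int posted = 0;
    if (arena->tsMessage != NULL) {
        free(arena->tsMessage);
        arena->tsMessage = NULL;
        posted++;
    }
    if (arena->tMessage != NULL) {
        free(arena->tMessage);
        arena->tMessage = NULL;
        posted++;
    }
    return posted;
}
```

With allocsLeft set to 1, the start message is allocated but the end message is not, so toy_messages_create() frees the start message again and the next trace posts nothing. Once memory is available the pair is restored and both messages are posted.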

Getting and discarding

If the client has not enabled that message type, the message is discarded immediately when posted, calling ControlFree() and reclaiming the memory.

If the client has enabled the message type but never gets the message, it remains on the message queue until ArenaDestroy(). Theoretically these messages could accumulate forever until they exhaust memory. This is intentional: the client should not enable a message type and then never get it!

Otherwise, when the client gets a message, it is dropped from the arena message queue: now only the client (and the control pool) hold references to it. The client must call mps_message_discard() once it has finished using the message. This calls ControlFree() and reclaims the memory.

If the client simply drops its reference, the memory will not be reclaimed until ArenaDestroy(). This is intentional: the control pool is not garbage-collected.
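The cases above can be modelled with a small FIFO queue. This is a sketch only: ToyArena, ToyMessage and the toy_* functions are hypothetical, the single enabled flag stands in for the per-type enable state, and the live counter stands in for memory held by the control pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct ToyMessage {
    struct ToyMessage *next;
    int id;
} ToyMessage;

typedef struct {
    int enabled;         /* has the client enabled this message type? */
    ToyMessage *head;    /* oldest queued message */
    ToyMessage *tail;    /* newest queued message */
    int live;            /* messages currently allocated (toy control pool) */
} ToyArena;

static ToyMessage *toy_message_new(ToyArena *arena, int id)
{
    ToyMessage *m = calloc(1, sizeof *m);
    if (m != NULL) {
        m->id = id;
        arena->live++;
    }
    return m;
}

static void toy_message_free(ToyArena *arena, ToyMessage *m)
{
    free(m);
    arena->live--;
}

/* Post: discard at once if the type is not enabled, otherwise enqueue. */
static void toy_post(ToyArena *arena, ToyMessage *m)
{
    if (m == NULL)
        return;
    if (!arena->enabled) {
        toy_message_free(arena, m);
        return;
    }
    m->next = NULL;
    if (arena->tail != NULL)
        arena->tail->next = m;
    else
        arena->head = m;
    arena->tail = m;
}

/* Get: remove the oldest message from the queue and hand it to the client.
   The memory stays allocated until the client discards it. */
static int toy_get(ToyArena *arena, ToyMessage **out)
{
    ToyMessage *m = arena->head;
    if (m == NULL)
        return 0;
    arena->head = m->next;
    if (arena->head == NULL)
        arena->tail = NULL;
    m->next = NULL;
    *out = m;
    return 1;
}

/* Discard: the client has finished with the message; reclaim its memory. */
static void toy_discard(ToyArena *arena, ToyMessage *m)
{
    toy_message_free(arena, m);
}
```

In this model a client that calls toy_get() but never toy_discard() leaves live above zero, which corresponds to memory that is only recovered when the control pool is destroyed.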

Final clearup

Final clearup is performed at ArenaDestroy(), as follows:

  • Unused and unsent pre-allocated messages (one per trace id) are freed by using TRACE_SET_ITER to call TraceIdMessagesDestroy(), which calls the message Delete functions (and thereby ControlFree()) on anything left in tsMessage[ti] and tMessage[ti].
  • Unretrieved messages are freed by emptying the arena message queue with MessageEmpty().
  • Retrieved but undiscarded messages are freed by destroying the control pool.

Testing

The main test is "zmess.c". See notes there.

Various other tests, including amcss.c, also collect and report mps_message_type_gc() and mps_message_type_gc_start().

Coverage

Current tests do not check:

  • The less common why-codes (reasons why a trace starts). These should be added to zmess.c.

Document History

  • 2003-02-17 David Jones. Created.
  • 2006-12-07 Richard Kistruck. Remove mps_message_type_gc_generation() (not implemented).
  • 2007-02-08 Richard Kistruck. Move historically interesting requirements for mps_message_type_gc_generation() (not implemented) to the end of the doc.
  • 2008-12-19 Richard Kistruck. Complete re-write, for new GC message lifecycle. See job001989.
  • 2013-05-23 GDR Converted to reStructuredText.