www.ravenbrook.com - thread-safety.rst

.. highlight:: none


.. index::
   pair: thread safety; design

.. _design-thread-safety:


Thread safety in the MPS
========================

.. mps:prefix:: design.mps.thread-safety


Introduction
------------

:mps:tag:`intro` This describes how thread safety is achieved in the MPS.

:mps:tag:`overview` The MPS is expected to run in an environment with
multiple threads calling into the MPS. The initial approach is very
simple. Some of the code is known to operate with exclusive access to
the data it manipulates, so this code is safe. For the rest of the
code, shared data structures are locked by the use of a single binary
lock (design.mps.lock_) per arena. This lock is claimed on entry to
the MPS and released on exit from it. So there is at most a single
thread (per arena) running "inside" the MPS at a time.

.. _design.mps.lock: lock.html


Requirements
------------

:mps:tag:`req.threads` Code must work correctly in presence of multiple
threads all calling into the MPS.

:mps:tag:`req.arena` The MPS must safely manage per-arena non-shared data.

:mps:tag:`req.global.mutable` The MPS must safely manage global data that
may be updated many times (that is, the arena ring).

:mps:tag:`req.global.once` The MPS must safely manage global data that is
updated at most once (that is, the protocol classes).

:mps:tag:`req.deadlock` The MPS must not deadlock.

:mps:tag:`req.fork` On Unix platforms, the MPS should be able to continue in
the child process after a :c:func:`fork()`. (Source: job004062_.)

.. _job004062: https://www.ravenbrook.com/project/mps/issue/job004062/

:mps:tag:`req.perf` Performance should not be unreasonably hindered.


Analysis
--------

:mps:tag:`analysis.simple` To have the code functioning correctly it should be
easy to change correctly. So a simple approach is desirable. We have
to also ensure that performance is not unreasonably downgraded.


Performance cost of locking
...........................

:mps:tag:`lock-cost` The cost of locking in performance terms are:

- :mps:tag:`lock-cost.overhead` the overhead of claiming and releasing locks;

- :mps:tag:`lock-cost.pause` the pauses caused by one thread being blocked
  on another thread.

- :mps:tag:`lock-cost.wait` the time wasted by one thread being blocked on
  another thread.

:mps:tag:`analysis.perf.signif` :mps:ref:`.lock-cost.pause` is significant if there are
MPS functions that take a long time. Using more locks, e.g. having a
lock per pool as well as a lock per arena, is a way of decreasing the
locking conflict between threads (:mps:ref:`.lock-cost.pause` and
:mps:ref:`.lock-cost.wait`). However this could increase
:mps:ref:`.lock-cost.overhead` significantly.

:mps:tag:`analysis.perf.work` But all MPS functions imply a small work-load
unless a collection is taking place. In the case of a collection, in
practice and certainly in the near future, all threads will most
likely be suspended while the collection work is going on. (The pages
being scanned will need to be unprotected which implies the mutator
will have to be stopped.) We also have to remember that unless we are
running on genuine multiprocessor :mps:ref:`.lock-cost.wait` is irrelevant.

:mps:tag:`analysis.perf.alloc` During typical use we expect that it is
allocation that is the most frequent activity. Allocation buffers
(design.mps.buffer_) are designed to allow allocation in concurrent
threads without needing a lock. So the most significant time a thread
spends in the MPS will be on a buffer-fill or during a collection. The
next most significant use is likely to be buffer create and deletion,
as a separate buffer will be required for each thread.

.. _design.mps.buffer: buffer.html

:mps:tag:`analysis.perf.lock` So overall the performance cost of locking is, I
estimate, most significantly the overhead of calling the locking
functions. Hence it would be undesirable from a performance point of
view to have more than one lock.


Recursive vs binary locks
.........................

:mps:tag:`analysis.reentrance` The simplest way to lock the code safely is to
define which code runs inside or outside the lock. Calling from the
outside to the inside implies a lock has to be claimed. Returning
means the lock has to be released. Control flow from outside to
outside and from inside to inside needs no locking action. To
implement this a function defined on the external interface needs to
claim the lock on entry and release it on exit. Our code currently
uses some external functions with the lock already held. There are two
ways to implement this:

#. :mps:tag:`recursive` Each external function claims a recursive lock.

   - simple;
   - have to worry about locking depth;
   - extra locking overhead on internal calls of external functions;

#. :mps:tag:`binary` Each external function claims a binary lock. Replace
   each internal call of an external function with a call to a newly
   defined internal one.

   - more code
   - slightly easier to reason about

:mps:tag:`analysis.strategy` It seems that the :mps:ref:`.recursive` strategy is the
easiest to implement first, but could be evolved into a :mps:ref:`.binary`
strategy. (That evolution has now happened. tony 1999-08-31).


Fork safety
...........

In order to support :c:func:`fork()`, we need to solve the following problems:

:mps:tag:`analysis.fork.lock` Any MPS lock might be held by another thread at
the point where :c:func:`fork()` is called. The lock would be protecting the
integrity of some data structure. But in the child the thread holding
the lock no longer exists, and so there is no way to restore the
integrity.

:mps:tag:`analysis.fork.threads` In the child process after a :c:func:`fork()`, there
is only one thread, which is a copy of the thread that called
:c:func:`fork()` in the parent process. All other threads no longer exist.
But the MPS maintains references to these threads, via the
:c:type:`ThreadStruct` object` created by calls to :c:func:`mps_thread_reg()`. If
we try to communicate with these threads it will fail or crash.

:mps:tag:`analysis.fork.exc-thread` On macOS, the MPS handles protection faults
using a dedicated thread. But in the child process after a :c:func:`fork()`,
this dedicated thread no longer exists. Also, the Mach port on which
the dedicated thread receives its messages does not exist in the child
either.

:mps:tag:`analysis.fork.mach-port` On macOS, the MPS identifies threads via
their Mach port numbers, which are stashed in the :c:type:`ThreadStruct` and
used to identify the current thread, for example in
:c:func:`ThreadSuspend()`. But in the child process after :c:func:`fork()` the
running thread has a different Mach port number than it did in the
parent.


Design
------

:mps:tag:`sol.locks` Use MPS locks (design.mps.lock_) to implement the
locking.

.. _design.mps.lock: lock.html

:mps:tag:`sol.arena` Each arena has a binary lock that protects the
non-shared data for that arena. Functions in the public interface fall
into the following categories:

- :mps:tag:`sol.arena.entry` Must be called with the arena lock not held
  (thus, these functions are not callable from format methods and
  other callbacks). Claims arena binary lock on entry, releases it on
  exit. The usual case. For example, :c:func:`mps_arena_park()`.

- :mps:tag:`sol.arena.recursive` May be called with the arena lock held (for
  example, from format methods and other callbacks). Claim arena lock
  recursively on entry, release it on exit. For example,
  :c:func:`mps_addr_fmt()`.

- :mps:tag:`sol.arena.lock-free` May be called at any time and does not
  claim or release any locks, because it is documented as being up to
  the client program to ensure thread safety (for example,
  :c:func:`mps_ld_add()`).

- :mps:tag:`sol.arena.maybe-entry` Must be called with the arena lock not
  held. In the common case, does not claim or release any locks
  (because it is documented as being up to the client program to
  ensure thread safety, as for :mps:ref:`.sol.arena.lock-free`), but may need
  to claim and release the arena binary lock (as for
  :mps:ref:`.sol.arena.entry`). For example, :c:func:`mps_reserve()`,
  :c:func:`mps_commit()`, :c:func:`mps_ap_frame_push()`, and
  :c:func:`mps_ap_frame_pop()`.

:mps:tag:`sol.global.mutable` There is a global binary lock (see
design.mps.lock.req.global.binary_) that protects mutable data shared
between all arenas (that is, the arena ring lock: see
design.mps.arena.static.ring.lock_).

.. _design.mps.lock.req.global.binary: lock.html#design.mps.lock.req.global.binary
.. _design.mps.arena.static.ring.lock: arena.html#design.mps.arena.static.ring.lock

:mps:tag:`sol.global.once` There is a global recursive lock (see
design.mps.lock.req.global.recursive_) that protects static data which
must be initialized at most once (that is, the protocol classes). Each
static data structure is accessed only via an "ensure" function that
claims the global recursive lock, checks to see if the data structure
has been initialized yet, and does so if necessary (see
design.mps.protocol.impl.define-class.lock_).

.. _design.mps.lock.req.global.recursive: lock.html#design.mps.lock.req.global.recursive
.. _design.mps.protocol.impl.define-class.lock: protocol.html#design.mps.protocol.impl.define-class.lock

:mps:tag:`sol.deadlock` A strict ordering is required between the global and
arena locks to prevent deadlock. The binary global lock may not be
claimed while either the arena or recursive global lock is held; the
arena lock may not be claimed while the recursive global lock is held.
Each arena lock is independent of all other arena locks; that is, a
thread may not attempt to claim more than one arena lock at a time.
See design.mps.arena.lock.avoid_.

.. _design.mps.arena.lock.avoid: arena.html#design.mps.arena.lock.avoid

:mps:tag:`sol.check` The MPS interface design requires that a function must
check the signatures on the data structures pointed to by its
parameters (see design.mps.sig.check.arg_). In particular, for
functions in the class :mps:ref:`.sol.arena.entry` it is necessary to check
some data structure signatures before taking the arena lock. The
checking interface provides a :c:func:`TESTT()` macro that checks the
signature in a thread-safe way (see
design.mps.sig.check.arg.unlocked_).

.. _design.mps.sig.check.arg: sig.html#design.mps.sig.check.arg
.. _design.mps.sig.check.arg.unlocked: sig.html#design.mps.sig.check.arg.unlocked


Fork safety
-----------

:mps:tag:`sol.fork.atfork` The MPS solves the fork-safety problems by
calling |pthread_atfork|_ to install handler functions that are
called in the parent process just before fork (the "prepare" handler),
and in the parent and child processes just after fork (the "parent"
and "child" handlers respectively).

.. |pthread_atfork| replace:: :c:func:`pthread_atfork()`
.. _pthread_atfork: https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_atfork.html

:mps:tag:`sol.fork.lock` In the prepare handler, the MPS takes all the
locks: that is, the global locks, and then the arena lock for every
arena. Note that a side-effect of this is that the shield is entered
for each arena. In the parent handler, the MPS releases all the locks.
In the child handler, the MPS would like to release the locks but this
does not work on any supported platform, so instead it reinitializes
them, by calling :c:func:`LockInitGlobal()`.

:mps:tag:`sol.fork.thread` On macOS, in the prepare handler, the MPS
identifies for each arena the current thread, that is, the one calling
:c:func:`fork()` which will survive into the child process, and marks this
thread by setting a flag in the appropriate :c:type:`ThreadStruct`. In the
parent handler, this flag is cleared. On all Unix platforms, in the
child handler, all threads (except for the current thread) are marked
as dead and transferred to the ring of dead threads. (The MPS can't
destroy the thread structures at this point because they are owned by
the client program.)

:mps:tag:`sol.fork.exc-thread` On macOS, in the child handler, the exception
port and dedicated thread are re-created, and the current thread
re-registered with the exception port.

:mps:tag:`sol.fork.mach-port` On macOS, in the child handler, the thread
flagged as forking gets its port number updated.