.. mode: -*- rst -*-
POSIX thread extensions
=======================
:Tag: design.mps.pthreadext
:Author: Tony Mann
:Date: 2000-02-01
:Status: Draft document
:Revision: $Id: //info.ravenbrook.com/project/mps/save-errno-win32/design/pthreadext.txt#1 $
:Copyright: See `Copyright and License`_.
:Index terms: pair: POSIX thread extensions; design
Introduction
------------
_`.readership`: Any MPS developer.
_`.intro`: This is the design of the Pthreads extension module, which
provides some low-level threads support for use by MPS (notably
suspend and resume).
Definitions
-----------
_`.pthreads`: The term "Pthreads" means an implementation of the POSIX
1003.1c-1995 thread standard. (Or the Single UNIX Specification,
Version 2, aka USV2 or UNIX98.)
_`.context`: The "context" of a thread is a (platform-specific)
OS-defined structure which describes the current state of the
registers for that thread.
Requirements
------------
_`.req.suspend`: A means to suspend threads, so that they don't make
any progress.
_`.req.suspend.why`: Needed by the thread manager so that other
threads registered with an arena can be suspended (see
design.mps.thread-manager_). Not directly provided by Pthreads.
.. _design.mps.thread-manager: thread-manager
_`.req.resume`: A means to resume suspended threads, so that they are
able to make progress again. _`.req.resume.why`: Needed by the thread
manager. Not directly provided by Pthreads.
_`.req.suspend.multiple`: Allow a thread to be suspended on behalf of
one arena when it has already been suspended on behalf of one or more
other arenas. _`.req.suspend.multiple.why`: The thread manager
contains no design for cooperation between arenas to prevent this.
_`.req.resume.multiple`: Allow requests to resume a thread on behalf
of each arena which had previously suspended the thread. The thread
must only be resumed when requests from all such arenas have been
received. _`.req.resume.multiple.why`: A thread manager for an arena
must not permit a thread to make progress before it explicitly resumes
the thread.
_`.req.suspend.context`: Must be able to access the context for a
thread when it is suspended.
_`.req.suspend.protection`: Must be able to suspend a thread which is
currently handling a protection fault (i.e., an arena access). Such a
thread might even own an arena lock.
_`.req.legal`: Must use the Pthreads and other POSIX APIs in a legal
manner.
Analysis
--------
_`.analysis.suspend`: Thread suspension is inherently asynchronous. MPS
needs to be able to suspend another thread without prior knowledge of
the code that thread is running. (That is, we can't rely on
cooperation between threads.) The only asynchronous communication
available on POSIX is via signals -- so the suspend and resume
mechanism must ultimately be built from signals.
_`.analysis.signal.safety`: POSIX imposes some restrictions on what a
signal handler function might do when invoked asynchronously (see the
sigaction_ documentation, and search for the string "reentrant"). In
summary, a small number of POSIX functions are defined to be
"async-signal safe", which means they may be invoked without
restriction in signal handlers. All other POSIX functions are
considered to be unsafe. Behaviour is undefined if an unsafe function
is interrupted by a signal and the signal handler then proceeds to
call another unsafe function. See `mail.tony.1999-08-24.15-40`_ and
followups for some further analysis.
.. _mail.tony.1999-08-24.15-40: https://info.ravenbrook.com/project/mps/mail/1999/08/24/15-40/0.txt
.. _sigaction: https://pubs.opengroup.org/onlinepubs/007908799/xsh/sigaction.html
_`.analysis.signal.safety.implication`: Since we can't assume that we
won't attempt to suspend a thread while it is running an unsafe
function, we must limit the use of POSIX functions in the suspend
signal handler to those which are designed to be "async-signal safe".
One of the few such functions related to synchronization is
``sem_post()``.
_`.analysis.signal.example`: An example of how to suspend threads in POSIX
was posted to newsgroup comp.programming.threads in August 1999
[Lau_1999-08-16]_. The code in the post was written by David Butenhof, who
contributed some comments on his implementation [Butenhof_1999-08-16]_
_`.analysis.signal.linux-hack`: In the current implementation of Linux
Pthreads, it would be possible to implement suspend/resume using
``SIGSTOP`` and ``SIGCONT``. This is, however, nonportable and will
probably stop working on Linux at some point.
_`.analysis.component`: There is no known way to meet the requirements
above in a way which cooperates with another component in the system
which also provides its own mechanism to suspend and resume threads.
The best bet for achieving this is to provide the functionality in
shared low-level component which may be used by MPS and other clients.
This will require some discussion with other potential clients and/or
standards bodies.
_`.analysis.component.dylan`: Note that such cooperation is actually a
requirement for Dylan (req.dylan.dc.env.self), though this is not a
problem, since all the Dylan components share the MPS mechanism.
Interface
---------
``typedef PThreadextStruct *PThreadext``
_`.if.pthreadext.abstract`: A thread is represented by the abstract
type ``PThreadext``. A ``PThreadext`` object corresponds directly with
a thread (of type ``pthread_t``). There may be more than one
``PThreadext`` object for the same thread.
_`.if.pthreadext.structure`: The structure definition of
``PThreadext`` (``PThreadextStruct``) is exposed by the interface so
that it may be embedded in a client datastructure (for example,
``ThreadStruct``). This means that all storage management can be left
to the client (which is important because there might be multiple
arenas involved). Clients may not access the fields of a
``PThreadextStruct`` directly.
``void PThreadextInit(PThreadext pthreadext, pthread_t id)``
_`.if.init`: Initializes a ``PThreadext`` object for a thread with the
given ``id``.
``Bool PThreadextCheck(PThreadext pthreadext)``
_`.if.check`: Checks a ``PThreadext`` object for consistency. Note
that this function takes the mutex, so it must not be called with the
mutex held (doing so will probably deadlock the thread).
``Res PThreadextSuspend(PThreadext pthreadext, struct sigcontext **contextReturn)``
_`.if.suspend`: Suspends a ``PThreadext`` object (puts it into a
suspended state). Meets `.req.suspend`_. The object must not already
be in a suspended state. If the function returns ``ResOK``, the
context of the thread is returned in contextReturn, and the
corresponding thread will not make any progress until it is resumed.
``Res PThreadextResume(PThreadext pthreadext)``
_`.if.resume`: Resumes a ``PThreadext`` object. Meets `.req.resume`_.
The object must already be in a suspended state. Puts the object into
a non-suspended state. Permits the corresponding thread to make
progress again, although that might not happen immediately if there is
another suspended ``PThreadext`` object corresponding to the same
thread.
``void PThreadextFinish(PThreadext pthreadext)``
_`.if.finish`: Finishes a PThreadext object.
Implementation
--------------
``typedef struct PThreadextStruct PThreadextStruct``
_`.impl.pthreadext`: The structure definition for a ``PThreadext``
object is::
struct PThreadextStruct {
Sig sig; /* */
pthread_t id; /* Thread ID */
MutatorContext context; /* context if suspended */
RingStruct threadRing; /* ring of suspended threads */
RingStruct idRing; /* duplicate suspensions for id */
};
_`.impl.field.id`: The ``id`` field shows which PThread the object
corresponds to.
_`.impl.field.context`: The ``context`` field contains the context
when in a suspended state. Otherwise it is ``NULL``.
_`.impl.field.threadring`: The ``threadRing`` field is used to chain
the object onto the suspend ring when it is in the suspended state
(see `.impl.global.suspend-ring`_). When not in a suspended state,
this ring is single.
_`.impl.field.idring`: The ``idRing`` field is used to group the
object with other objects corresponding to the same thread (same
``id`` field) when they are in the suspended state. When not in a
suspended state, or when this is the only ``PThreadext`` object with
this ``id`` in the suspended state, this ring is single.
_`.impl.global.suspend-ring`: The module maintains a global varaible
``suspendedRing``, a ring of ``PThreadext`` objects which are in a
suspended state. This is primarily so that it's possible to determine
whether a thread is curently suspended anyway because of another
``PThreadext`` object, when a suspend attempt is made.
_`.impl.global.victim`: The module maintains a global variable
``suspendingVictim`` which is used to indicate which ``PThreadext`` is
the current victim during suspend operations. This is used to
communicate information between the controlling thread and the thread
being suspended (the victim). The variable has value ``NULL`` at other
times.
_`.impl.static.mutex`: We use a lock (mutex) around the suspend and
resume operations. This protects the state data (the suspend-ring and
the victim: see `.impl.global.suspend-ring`_ and
`.impl.global.victim`_ respectively). Since only one thread can be
suspended at a time, there's no possibility of two arenas suspending
each other by concurrently suspending each other's threads.
_`.impl.static.semaphore`: We use a semaphore to synchronize between
the controlling and victim threads during the suspend operation. See
`.impl.suspend`_ and `.impl.suspend-handler`_).
_`.impl.static.init`: The static data and global variables of the
module are initialized on the first call to ``PThreadextSuspend()``,
using ``pthread_once()`` to avoid concurrency problems. We also enable
the signal handlers at the same time (see `.impl.suspend-handler`_ and
`.impl.resume-handler`_).
_`.impl.suspend`: ``PThreadextSuspend()`` first ensures the module is
initialized (see `.impl.static.init`_). After this, it claims the
mutex (see `.impl.static.mutex`_). It then checks to see whether
thread of the target ``PThreadext`` object has already been suspended
on behalf of another ``PThreadext`` object. It does this by iterating
over the suspend ring.
_`.impl.suspend.already-suspended`: If another object with the same id
is found on the suspend ring, then the thread is already suspended.
The context of the target object is updated from the other object, and
the other object is linked into the ``idRing`` of the target.
_`.impl.suspend.not-suspended`: If the thread is not already
suspended, then we forcibly suspend it using a technique similar to
Butenhof's (see `.analysis.signal.example`_): First we set the victim
variable (see `.impl.global.victim`_) to indicate the target object.
Then we send the signal ``PTHREADEXT_SIGSUSPEND`` to the thread (see
`.impl.signals`_), and wait on the semaphore for it to indicate that
it has received the signal and updated the victim variable with the
context. If either of these operations fail (for example, because of
thread termination) we unlock the mutex and return ``ResFAIL``.
_`.impl.suspend.update`: Once we have ensured that the thread is
definitely suspended, we add the target ``PThreadext`` object to the
suspend ring, unlock the mutex, and return the context to the caller.
_`.impl.suspend-handler`: The suspend signal handler is invoked in the
target thread during a suspend operation, when a
``PTHREADEXT_SIGSUSPEND`` signal is sent by the controlling thread
(see `.impl.suspend.not-suspended`_). The handler determines the
context (received as a parameter, although this may be
platform-specific) and stores this in the victim object (see
`.impl.global.victim`_). The handler then masks out all signals except
the one that will be received on a resume operation
(``PTHREADEXT_SIGRESUME``) and synchronizes with the controlling
thread by posting the semaphore. Finally the handler suspends until
the resume signal is received, using ``sigsuspend()``.
_`.impl.resume`: ``PThreadextResume()`` first claims the mutex (see
`.impl.static.mutex`_). It then checks to see whether thread of the
target ``PThreadext`` object has also been suspended on behalf of
another ``PThreadext`` object (in which case the id ring of the target
object will not be single).
_`.impl.resume.also-suspended`: If the thread is also suspended on
behalf of another ``PThreadext``, then the target object is removed from
the id ring.
_`.impl.resume.not-also`: If the thread is not also suspended on
behalf of another ``PThreadext``, then the thread is resumed using the
technique proposed by Butenhof (see `.analysis.signal.example`_). I.e. we
send it the signal ``PTHREADEXT_SIGRESUME`` (see `.impl.signals`_) and
expect it to wake up. If this operation fails (for example, because of
thread termination) we unlock the mutex and return ``ResFAIL``.
_`.impl.resume.update`: Once the target thread is in the appropriate
state, we remove the target ``PThreadext`` object from the suspend
ring, set its context to ``NULL`` and unlock the mutex.
_`.impl.resume-handler`: The resume signal handler is invoked in the
target thread during a resume operation, when a
``PTHREADEXT_SIGRESUME`` signal is sent by the controlling thread (see
`.impl.resume.not-also`_). The resume signal handler simply returns.
This is sufficient to unblock the suspend handler, which will have
been blocking the thread at the time of the signal. The Pthreads
implementation ensures that the signal mask is restored to the value
it had before the signal handler was invoked.
_`.impl.finish`: ``PThreadextFinish()`` supports the finishing of
objects in the suspended state, and removes them from the suspend ring
and id ring as necessary. It must claim the mutex for the removal
operation (to ensure atomicity of the operation). Finishing of
suspended objects is supported so that clients can dispose of
resources if a resume operation fails (which probably means that the
PThread has terminated).
_`.impl.signals`: The choice of which signals to use for suspend and
restore operations may need to be platform-specific. Some signals are
likely to be generated and/or handled by other parts of the
application and so should not be used (for example, ``SIGSEGV``). Some
implementations of PThreads use some signals for themselves, so they
may not be used; for example, LinuxThreads uses ``SIGUSR1`` and
``SIGUSR2`` for its own purposes, and so do popular tools like
Valgrind that we would like to be compatible with the MPS. The design
therefore abstractly names the signals ``PTHREADEXT_SIGSUSPEND`` and
``PTHREAD_SIGRESUME``, so that they may be easily mapped to
appropriate real signal values. Candidate choices are ``SIGXFSZ`` and
``SIGXCPU``.
_`.impl.signals.config`: The identity of the signals used to suspend
and resume threads can be configured at compilation time using the
preprocessor constants ``CONFIG_PTHREADEXT_SIGSUSPEND`` and
``CONFIG_PTHREADEXT_SIGRESUME`` respectively.
Attachments
-----------
[missing attachment "posix.txt"]
[missing attachment "susp.c"]
References
----------
.. [Butenhof_1999-08-16]
"Re: Problem with Suspend & Resume Thread Example";
Dave Butenhof; comp.programming.threads; 1999-08-16;
.
.. [Lau_1999-08-16]
"Problem with Suspend & Resume Thread Example";
Raymond Lau; comp.programming.threads; 1999-08-16;
.
Document History
----------------
- 2000-02-01 Tony Mann. Draft document.
- 2002-06-07 RB_ Converted from MMInfo database design document.
- 2013-05-23 GDR_ Converted to reStructuredText.
.. _RB: https://www.ravenbrook.com/consultants/rb/
.. _GDR: https://www.ravenbrook.com/consultants/gdr/
Copyright and License
---------------------
Copyright © 2013–2020 `Ravenbrook Limited `_.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.