.. mode: -*- rst -*- Thread manager ============== :Tag: design.mps.thread-manager :Author: Richard Brooksby :Date: 1995-11-20 :Status: incomplete design :Revision: $Id: //info.ravenbrook.com/project/mps/save-errno-win32/design/thread-manager.txt#2 $ :Copyright: See `Copyright and License`_. :Index terms: pair: thread manager; design Introduction ------------ _`.intro`: This is the design of the thread manager module. _`.readership`: Any MPS developer; anyone porting the MPS to a new platform. _`.overview`: The thread manager implements two features that allow the MPS to work in a multi-threaded environment: exclusive access to memory, and scanning of roots in a thread's registers and control stack. Requirements ------------ _`.req.exclusive`: The thread manager must provide the MPS with exclusive access to the memory it manages in critical sections of the code. (This is necessary to avoid for the MPS to be able to flip atomically from the point of view of the mutator.) _`.req.scan`: The thread manager must be able to locate references in the registers and control stack of the current thread, or of a suspended thread. (This is necessary in order to implement conservative collection, in environments where the registers and control stack contain ambiguous roots. Scanning of roots is carried out during the flip, hence while other threads are suspended.) _`.req.register.multi`: It must be possible to register the same thread multiple times. (This is needed to support the situation where a program that does not use the MPS is calling into MPS-using code from multiple threads. On entry to the MPS-using code, the thread can be registered, but it may not be possible to ensure that the thread is deregistered on exit, because control may be transferred by some non-local mechanism such as an exception or ``longjmp()``. We don't want to insist that the client program keep a table of threads it has registered, because maintaining the table might require allocation, which might provoke a collection. See request.dylan.160252_.) .. _request.dylan.160252: https://info.ravenbrook.com/project/mps/import/2001-11-05/mmprevol/request/dylan/160252/ _`.req.thread.die`: It would be nice if the MPS coped with threads that die while registered. (This makes it easier for a client program to interface with foreign code that terminates threads without the client program being given an opportunity to deregister them. See request.dylan.160022_ and request.mps.160093_.) .. _request.dylan.160022: https://info.ravenbrook.com/project/mps/import/2001-11-05/mmprevol/request/dylan/160022 .. _request.mps.160093: https://info.ravenbrook.com/project/mps/import/2001-11-05/mmprevol/request/mps/160093/ _`.req.thread.intr`: It would be nice if on POSIX systems the MPS does not cause system calls in the mutator to fail with EINTR due to the MPS thread-management signals being delivered while the mutator is blocked in a system call. (See `GitHub issue #9`_.) .. _GitHub issue #9: https://github.com/ravenbrook/mps/issues/9 _`.req.thread.errno`: It would be nice if on POSIX systems the MPS does not cause system calls in the mutator to update ``errno`` due to the MPS thread-management signals being delivered while the mutator is blocked in a system call, and the MPS signal handlers updating ``errno``. (See `GitHub issue #10`_.) .. _GitHub issue #10: https://github.com/ravenbrook/mps/issues/10 _`.req.thread.lasterror`: It would be nice if on Windows systems the MPS does not cause system calls in the mutator to update the value returned from ``GetLastError()`` when the exception handler is called due to a fault. This may cause the MPS to destroy the previous value there. (See `GitHub issue #61`_.) .. _GitHub issue #61: https://github.com/Ravenbrook/mps/issues/61 Design ------ _`.sol.exclusive`: In order to meet `.req.exclusive`_, the arena maintains a ring of threads (in ``arena->threadRing``) that have been registered by the client program. When the MPS needs exclusive access to memory, it suspends all the threads in the ring except for the currently running thread. When the MPS no longer needs exclusive access to memory, it resumes all threads in the ring. _`.sol.exclusive.assumption`: This relies on the assumption that any thread that might refer to, read from, or write to memory in automatically managed pool classes is registered with the MPS. This is documented in the manual under ``mps_thread_reg()``. _`.sol.thread.term`: The thread manager cannot reliably detect that a thread has terminated. The reason is that threading systems do not guarantee behaviour in this case. For example, POSIX_ says, "A conforming implementation is free to reuse a thread ID after its lifetime has ended. If an application attempts to use a thread ID whose lifetime has ended, the behavior is undefined." For this reason, the documentation for ``mps_thread_reg()`` specifies that it is an error if a thread dies while registered. .. _POSIX: https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_02 _`.sol.thread.term.attempt`: Nonetheless, the thread manager makes a "best effort" to continue running after detecting a terminated thread, by moving the thread to a ring of dead threads, and avoiding scanning it. This might allow a malfunctioning client program to limp along. _`.sol.thread.intr`: The POSIX specification for sigaction_ says that if the ``SA_RESTART`` flag is set, and if "a function specified as interruptible is interrupted by this signal, the function shall restart and shall not fail with ``EINTR`` unless otherwise specified." .. |sigaction| replace:: ``sigaction()`` .. _sigaction: https://pubs.opengroup.org/onlinepubs/9699919799/functions/sigaction.html _`.sol.thread.intr.linux`: Linux does not fully implement the POSIX specification, so that some system calls are "never restarted after being interrupted by a signal handler, regardless of the use of SA_RESTART; they always fail with the error EINTR when interrupted by a signal handler". The exceptional calls are listed in the |signal|_ manual. There is nothing that the MPS can do about this except to warn users in the reference manual. .. |signal| replace:: signal(7) .. _signal: https://man7.org/linux/man-pages/man7/signal.7.html _`.sol.thread.errno`: The POSIX specification for sigaction_ says, "Note in particular that even the "safe" functions may modify ``errno``; the signal-catching function, if not executing as an independent thread, should save and restore its value." All MPS signals handlers therefore save and restore ``errno`` using the macros ``ERRNO_SAVE`` and ``ERRNO_RESTORE``. _`.sol.thread.lasterror`: The documentation for ``AddVectoredExceptionHandler`` does not mention ``GetLastError()`` at all, but testing_ the behaviour reveals that any value in ``GetLastError()`` is not preserved. Therefore, this value is saved using ``LAST_ERROR_SAVE`` and ``LAST_ERROR_RESTORE``. .. _testing: https://github.com/Ravenbrook/mps/issues/61 Interface --------- ``typedef struct mps_thr_s *Thread`` _`.if.thread`: The type of threads. It is a pointer to an opaque structure, which must be defined by the implementation. ``Bool ThreadCheck(Thread thread)`` _`.if.check`: The check function for threads. See design.mps.check_. .. _design.mps.check: check ``Bool ThreadCheckSimple(Thread thread)`` _`.if.check.simple`: A thread-safe check function for threads, for use by ``mps_thread_dereg()``. It can't use ``AVER(TESTT(Thread, thread))``, as recommended by design.mps.sig.check.arg.unlocked_, since ``Thread`` is an opaque type. .. _design.mps.sig.check.arg.unlocked: sig#.check.arg.unlocked ``Arena ThreadArena(Thread thread)`` _`.if.arena`: Return the arena that the thread is registered with. Must be thread-safe as it needs to be called by ``mps_thread_dereg()`` before taking the arena lock. ``Res ThreadRegister(Thread *threadReturn, Arena arena)`` _`.if.register`: Register the current thread with the arena, allocating a new ``Thread`` object. If successful, update ``*threadReturn`` to point to the new thread and return ``ResOK``. Otherwise, return a result code indicating the cause of the error. ``void ThreadDeregister(Thread thread, Arena arena)`` _`.if.deregister`: Remove ``thread`` from the list of threads managed by the arena and free it. ``void ThreadRingSuspend(Ring threadRing, Ring deadRing)`` _`.if.ring.suspend`: Suspend all the threads on ``threadRing``, except for the current thread. If any threads are discovered to have terminated, move them to ``deadRing``. ``void ThreadRingResume(Ring threadRing, Ring deadRing)`` _`.if.ring.resume`: Resume all the threads on ``threadRing``. If any threads are discovered to have terminated, move them to ``deadRing``. ``Thread ThreadRingThread(Ring threadRing)`` _`.if.ring.thread`: Return the thread that owns the given element of the thread ring. ``Res ThreadScan(ScanState ss, Thread thread, Word *stackCold, mps_area_scan_t scan_area, void *closure)`` _`.if.scan`: Scan the stacks and root registers of ``thread``, using ``ss`` and ``scan_area``. ``stackCold`` points to the cold end of the thread's stack---this is the value that was supplied by the client program when it called ``mps_root_create_thread()``. In the common case, where the stack grows downwards, ``stackCold`` is the highest stack address. Return ``ResOK`` if successful, another result code otherwise. Implementations --------------- Generic implementation ...................... _`.impl.an`: In ``than.c``. _`.impl.an.single`: Supports a single thread. (This cannot be enforced because of `.req.register.multi`_.) _`.impl.an.register.multi`: There is no need for any special treatment of multiple threads, because ``ThreadRingSuspend()`` and ``ThreadRingResume()`` do nothing. _`.impl.an.suspend`: ``ThreadRingSuspend()`` does nothing because there are no other threads. _`.impl.an.resume`: ``ThreadRingResume()`` does nothing because no threads are ever suspended. _`.impl.an.scan`: Just calls ``StackScan()`` since there are no suspended threads. POSIX threads implementation ............................ _`.impl.ix`: In ``thix.c`` and ``pthrdext.c``. See design.mps.pthreadext_. .. _design.mps.pthreadext: pthreadext _`.impl.ix.multi`: Supports multiple threads. _`.impl.ix.register`: ``ThreadRegister()`` records the thread id the current thread by calling |pthread_self|_. .. |pthread_self| replace:: ``pthread_self()`` .. _pthread_self: https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_self.html _`.impl.ix.register.multi`: Multiply-registered threads are handled specially by the POSIX thread extensions. See design.mps.pthreadext.req.suspend.multiple_ and design.mps.pthreadext.req.resume.multiple_. .. _design.mps.pthreadext.req.suspend.multiple: pthreadext#.req.suspend.multiple .. _design.mps.pthreadext.req.resume.multiple: pthreadext#.req.resume.multiple _`.impl.ix.suspend`: ``ThreadRingSuspend()`` calls ``PThreadextSuspend()``. See design.mps.pthreadext.if.suspend_. .. _design.mps.pthreadext.if.suspend: pthreadext#.if.suspend _`.impl.ix.resume`: ``ThreadRingResume()`` calls ``PThreadextResume()``. See design.mps.pthreadext.if.resume_. .. _design.mps.pthreadext.if.resume: pthreadext#.if.resume _`.impl.ix.scan.current`: ``ThreadScan()`` calls ``StackScan()`` if the thread is current. _`.impl.ix.scan.suspended`: ``PThreadextSuspend()`` records the context of each suspended thread, and ``ThreadRingSuspend()`` stores this in the ``Thread`` structure, so that is available by the time ``ThreadScan()`` is called. Windows implementation ...................... _`.impl.w3`: In ``thw3.c``. _`.impl.w3.multi`: Supports multiple threads. _`.impl.w3.register`: ``ThreadRegister()`` records the following information for the current thread: - A ``HANDLE`` to the process, with access flags ``THREAD_SUSPEND_RESUME`` and ``THREAD_GET_CONTEXT``. This handle is needed as parameter to |SuspendThread|_ and |ResumeThread|_. - The result of |GetCurrentThreadId|_, so that the current thread may be identified in the ring of threads. .. |SuspendThread| replace:: ``SuspendThread()`` .. _SuspendThread: https://docs.microsoft.com/en-gb/windows/desktop/api/processthreadsapi/nf-processthreadsapi-suspendthread .. |ResumeThread| replace:: ``ResumeThread()`` .. _ResumeThread: https://docs.microsoft.com/en-gb/windows/desktop/api/processthreadsapi/nf-processthreadsapi-resumethread .. |GetCurrentThreadId| replace:: ``GetCurrentThreadId()`` .. _GetCurrentThreadId: https://docs.microsoft.com/en-gb/windows/desktop/api/processthreadsapi/nf-processthreadsapi-getcurrentthreadid _`.impl.w3.register.multi`: There is no need for any special treatment of multiple threads, because Windows maintains a suspend count that is incremented on |SuspendThread|_ and decremented on |ResumeThread|_. _`.impl.w3.suspend`: ``ThreadRingSuspend()`` calls |SuspendThread|_. _`.impl.w3.resume`: ``ThreadRingResume()`` calls |ResumeThread|_. _`.impl.w3.scan.current`: ``ThreadScan()`` calls ``StackScan()`` if the thread is current. This is because |GetThreadContext|_ doesn't work on the current thread: the context would not necessarily have the values which were in the saved registers on entry to the MPS. .. |GetThreadContext| replace:: ``GetThreadContext()`` .. _GetThreadContext: https://docs.microsoft.com/en-us/windows/desktop/api/processthreadsapi/nf-processthreadsapi-getthreadcontext _`.impl.w3.scan.suspended`: Otherwise, ``ThreadScan()`` calls |GetThreadContext|_ to get the root registers and the stack pointer. macOS implementation .................... _`.impl.xc`: In ``thxc.c``. _`.impl.xc.multi`: Supports multiple threads. _`.impl.xc.register`: ``ThreadRegister()`` records the Mach port of the current thread by calling |mach_thread_self|_. .. |mach_thread_self| replace:: ``mach_thread_self()`` .. _mach_thread_self: https://www.gnu.org/software/hurd/gnumach-doc/Thread-Information.html _`.impl.xc.register.multi`: There is no need for any special treatment of multiple threads, because Mach maintains a suspend count that is incremented on |thread_suspend|_ and decremented on |thread_resume|_. .. |thread_suspend| replace:: ``thread_suspend()`` .. _thread_suspend: https://www.gnu.org/software/hurd/gnumach-doc/Thread-Execution.html .. |thread_resume| replace:: ``thread_resume()`` .. _thread_resume: https://www.gnu.org/software/hurd/gnumach-doc/Thread-Execution.html _`.impl.xc.suspend`: ``ThreadRingSuspend()`` calls |thread_suspend|_. _`.impl.xc.resume`: ``ThreadRingResume()`` calls |thread_resume|_. _`.impl.xc.scan.current`: ``ThreadScan()`` calls ``StackScan()`` if the thread is current. _`.impl.xc.scan.suspended`: Otherwise, ``ThreadScan()`` calls |thread_get_state|_ to get the root registers and the stack pointer. .. |thread_get_state| replace:: ``thread_get_state()`` .. _thread_get_state: https://www.gnu.org/software/hurd/gnumach-doc/Thread-Execution.html Document History ---------------- - 1995-11-20 RB_ Incomplete design. - 2002-06-21 RB_ Converted from MMInfo database design document. - 2013-05-26 GDR_ Converted to reStructuredText. - 2014-10-22 GDR_ Complete design. .. _RB: https://www.ravenbrook.com/consultants/rb/ .. _GDR: https://www.ravenbrook.com/consultants/gdr/ Copyright and License --------------------- Copyright © 2013–2020 `Ravenbrook Limited `_. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.