26. Stack probe¶
26.1. Introduction¶
.intro: This is the design of the stack probe module.
.readership: Any MPS developer; anyone porting the MPS to a new platform.
.overview: This module ensures that the stack cannot overflow while the MPS is holding a lock, so that a mutator can handle stack overflow faults and call into the MPS from the handler.
26.2. Requirements¶
.req.overflow: The mutator should be able to call into the MPS from a stack overflow fault handler. (This is a convenient way to handle stack overflows in dynamic language implementations: if the stack overflow exception and associated backtrace are to be represented as objects, this may require allocation, and hence a call into the MPS.)
.req.complete: In an application where the mutator might call into the MPS from a stack overflow fault handler, then whenever the MPS takes a lock, it must complete the operation and release the lock without running out of stack. (This is because running out of stack would cause a stack overflow fault, causing the mutator would enter the MPS recursively, which would fail because the lock is held.)
26.3. Design¶
.sol.probe: Before taking the arena lock in ArenaEnterLock()
,
the MPS probes the stack: that is, it checks whether there are at
least StackProbeDEPTH
words available, and provokes a stack
overflow fault if there are not. (This ensures that the fault occurs
outside of the arena lock where it can be handled safely.)
.sol.depth: The configuration parameter StackProbeDEPTH
specifies the maximum number of words of stack that the MPS might use.
(It is simpler, faster, and more reliable, to determine this globally
than to try to figure it out dynamically.)
.sol.depth.constraint: Operating systems typically use a single
“guard page” to detect stack overflow and grow the stack. (See for
example the documentation for Windows.) This means that the probe
will be ineffective if it skips over the guard page into the memory
beyond. If StackProbeDEPTH
is greater than or equal to the number
of words per page, the implementation might need to carry out multiple
probes. (This constraint is checked in MPMCheck()
.)
.sol.depth.no-recursion: In order to implement this design, the MPS must have constant bounded stack depth, and therefore, no recursion.
.sol.depth.analysis: Here’s a table showing a deep call into the
MPS (in the master sources at changelevel 187378), starting in
ArenaAccess()
at the point where the arena ring lock is taken. The
access forces a scan of a segment in an AMC pool, which fixes a
reference to an object in an AMC pool’s oldspace, which has to be
forwarded, and this overflows the forwarding buffer, which requires
the arena to allocate a new buffer in an appropriate zone, by
searching the splay tree representing free memory.
The “Args” column gives the number of arguments to the function (all arguments to functions in the MPS are word-sized or smaller, since we prohibit passing structures by value), and the “Locals” column gives the number of words in local variables. The value “≤64” for the stack usage of the object format’s scan method is the limit that’s documented in the manual.
Args |
Locals |
Function |
---|---|---|
5 |
0 |
|
5 |
0 |
|
3 |
5 |
|
4 |
1 |
|
4 |
8 |
|
4 |
0 |
|
4 |
5 |
|
3 |
≤64 |
|
4 |
15 |
|
4 |
5 |
|
6 |
10 |
|
6 |
9 |
|
7 |
5 |
|
5 |
5 |
|
5 |
5 |
|
5 |
11 |
|
7 |
1 |
|
7 |
16 |
|
5 |
3 |
|
6 |
7 |
|
3 |
7 |
|
4 |
8 |
|
3 |
0 |
|
109 |
≤190 |
Total |
We expect that a compiler will often be able to share stack space between function arguments and local variables, but in the worst case where it cannot, this call requires no more than 299 words of stack space.
This isn’t necessarily the deepest call into the MPS (the MPS’s
modular design and class system makes it hard to do a complete
analysis using call graph tools), but it’s probably close. The value
for StackProbeDEPTH
is thus chosen to be a round number that’s
comfortably larger than this.
26.4. Interface¶
.if.probe: If there are at least depth
words of stack
available, return. If not, provoke a stack overflow fault.
26.5. Issues¶
.issue.an: The generic implementation is non-functional. This means that it is only suitable for use with programs that do not handle stack overflow faults, or do not call into the MPS from the handler. This is because our customers have only required .req.overflow on Windows so far. If this becomes a requirement on other platforms, the following Standard C implementation might work:
void StackProbe(Size depth) {
volatile Word w;
Word *p = &w - depth;
w = *p;
}
The use of volatile
here is to prevent compilers from warning
about the variable w
being written but never read, or worse,
optimizing away the whole statement under the “as if” rule.
26.6. Implementations¶
.impl.an: Generic implementation in span.c
. This implementation
does nothing. See .issue.an.
.impl.w3i3: Implementation for Windows on IA-32 in spw3i3.c
.
This uses assembly to get the stack pointer (from the ESP register)
and to read the location depth
words below the stack pointer.
.impl.w3i6: Implementation for Windows on x86-64 in spw3i6.c
.
This passes the argument depth*sizeof(Word)
to the Windows
function _alloca()
, for which the documentation says, “A stack
overflow exception is generated if the space cannot be allocated.”