Title | Arena contracted callback gets called with invalid arena |
Status | closed |
Priority | essential |
Assigned user | Gareth Rees |
Organization | Ravenbrook |
Description | On platform ananll (really lii6ll), amsss segfaults with seed 1273421766 [1]. Not reproducible with this seed on xci6ll. |
Analysis | gdb shows the following backtrace: #0 0x000000000047b033 in TreeCheck (tree=0x7ffff71fa028) at tree.c:28 #1 0x00000000004065a8 in ArenaCheck (arena=0x6bc000) at arena.c:168 #2 0x000000000040f70d in vmArenaTrivContracted (arena=0x6bc000, base=0x7ffff71fa000, size=1048576) at arenavm.c:486 #3 0x000000000040e5db in vmChunkCompact (tree=0x7ffff71fa028, closureP=0x6bc000, closureS=1375995373) at arenavm.c:1127 #4 0x000000000047c674 in TreeTraverseAndDelete (treeIO=0x6bc3d8, visitor=0x40e3c0 <vmChunkCompact>, closureP=0x6bc000, closureS=1375995373) at tree.c:558 #5 0x000000000040d6fe in VMCompact (arena=0x6bc000, trace=0x6bc698) at arenavm.c:1150 In vmChunkCompact the code looks like this: vmChunkDestroy(tree, UNUSED_POINTER, UNUSED_SIZE); vmArena->contracted(arena, base, size); return TRUE; The call to vmChunkDestroy puts the arena into an invalid state (because the tree node was embedded in the Chunk and so in the VM, which has been destroyed) which is corrected immediately after the return (to TreeTraverseAndDelete). Solution ideas: 1. Call vmArena->contracted before destroying the chunk instead of afterwards. (Simple, robust, and avoids a race condition in which the freed address space gets reused before vmArena->contracted gets called.) 2. Use a message to inform the client program, instead of a callback. (But will this work? The point of the callbacks is so that on Windows, a compiler that is putting executable code into memory managed by the MPS can register this memory with the Microsoft runtime for stack unwinding via RtlInstallFunctionTableCallback [2]. The registration of the chunk needs to be prompt, otherwise if you get an exception in code that's been allocated in a fresh chunk before you get back to the message-handling loop, then Windows won't know how to unwind the stack. Maybe deregistration could be lazy but then the race condition gets more severe.) 3. (Suggested by NB.) Change the VMCompact algorithm so that it does not destroy the chunks inside the TreeTraverseAndDelete. Instead, we could do the operation in two passes: first, remove chunks from the arena's chunkTree (putting them into another structure); second, iterate over the other structure destroying the chunks. (Complicated and error-prone.) |
How found | automated_test |
Evidence | [1] https://travis-ci.org/Ravenbrook/mps-temporary/jobs/37985784 [2] http://msdn.microsoft.com/en-gb/library/windows/desktop/ms680595.aspx |
Test procedure | amsss |
Created by | Gareth Rees |
Created on | 2014-10-14 23:11:55 |
Last modified by | Gareth Rees |
Last modified on | 2014-10-20 23:07:34 |
History | 2014-10-14 GDR Created. 2014-10-20 GDR Analysis, upgrade to essential. |
Change | Effect | Date | User | Description |
---|---|---|---|---|
187294 | closed | 2014-10-20 16:40:32 | Gareth Rees | Call the "contracted" callback before destroying the chunk, as the arena is (briefly) invalid afterwards. |