Title | MPS spends too much time making system calls |
Status | closed |
Priority | essential |
Assigned user | Gareth Rees |
Organization | Ravenbrook |
Description | In the test case shown below, the toy Scheme interpreter spends nearly 70% of its time in system calls. This leads to unacceptable performance.

    $ time ./scheme-advanced test-leaf.scm
    All tests pass.

    real    1m10.526s
    user    0m22.468s
    sys     0m47.605s

See [1]. The test case in question is scheme-advanced.c with the following changes:

    - { 150, 0.85 },
    - { 170, 0.45 }
    + { 6400, 0.85 },
    + { 6400, 0.45 }

    - (size_t)(32 * 1024 * 1024));
    + (size_t)(1024ul * 1024 * 1024));

(I haven't checked in these changes because for pedagogic reasons we want the Scheme interpreter to do frequent collections.) |
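For context, this is roughly where those hunks land in scheme-advanced.c, assuming the usual names from the MPS Scheme example (obj_gen_params and a direct mps_arena_create call); create_arena is a hypothetical wrapper added here for illustration, not code from the example. The first hunk enlarges the generation capacities (in kilobytes), the second grows the virtual-memory arena from 32 MiB to 1 GiB, so that collections become rare and each one covers a lot of memory.

    /* Sketch of the modified test configuration; names assumed from
     * the MPS Scheme example (scheme-advanced.c). */

    #include <stdlib.h>

    #include "mps.h"
    #include "mpsavm.h"     /* mps_arena_class_vm */

    /* Generation chain parameters: capacity (kB) and mortality for
     * each generation.  Enlarged from { 150, 0.85 }, { 170, 0.45 } so
     * that collections are infrequent and each does a lot of work. */
    static mps_gen_param_s obj_gen_params[] = {
      { 6400, 0.85 },
      { 6400, 0.45 }
    };

    /* Hypothetical wrapper: arena enlarged from 32 MiB to 1 GiB. */
    static void create_arena(mps_arena_t *arena_o)
    {
      mps_res_t res = mps_arena_create(arena_o, mps_arena_class_vm(),
                                       (size_t)(1024ul * 1024 * 1024));
      if (res != MPS_RES_OK)
        abort();
    }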
Analysis | Looking at the system calls with dtruss [2], I can see that the vast majority of them (> 97%) are mprotect:

    $ sudo dtruss ./scheme-advanced test-leaf.scm 2> dtruss.log
    All tests pass.
    $ wc -l dtruss.log
    9540925 dtruss.log
    $ <dtruss.log sed -n 's/^\([A-Za-z][A-Za-z_0-9]*\)(.*$/\1/gp' | sort | uniq -c | sort -rn | head -4
    9258063 mprotect
     244500 mmap
      21837 getrusage
      14935 sigreturn

In particular, I can see long sequences of hundreds of mprotect calls, each protecting a single 4K page. I can also see evidence of pointlessly turning barriers off and then on again (or vice versa) before returning from a signal.

Possible causes include:

1. The MPS takes too small a time-slice after a barrier hit. The amount of work has to be large enough to justify the overhead of the context switch. [Need to investigate: how big are the time-slices? How long does a context switch take?]

2. The shield cache is too small, so it gets flushed prematurely, resulting in page protections that a larger cache would have found to be unnecessary. [Need some numbers: how many ShieldExpose/ShieldCover calls are there in the course of a typical quantum of work?]

3. There is no optimization of mprotect calls: even when the pages needing protection form a contiguous range, an mprotect call is issued for each one; see the sketch following this analysis. [This might be better handled by allocating bigger segments than by trying to coalesce mprotect calls.]

4. There are a lot of mmap calls: perhaps the arena is being too aggressive at returning memory to the operating system. [Is this the function of the "spare commit limit"?]

5. The MPS uses a very small granularity for its segments: at the moment these are only 4096 bytes. [3]

GDR 2014-02-03: For the mmap problem, see also job003674. |
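Cause 3 (together with the redundant barrier flips noted above) suggests batching: instead of one mprotect per 4K page, queue the desired protection for each page and flush the queue with one mprotect per maximal contiguous run. The following is a minimal sketch of that idea, not the MPS shield implementation; prot_request_t, flush_protection_queue, and the fixed PAGE_SIZE are all hypothetical names introduced here.

    /* Minimal sketch of coalescing protection changes (cause 3).
     * Queue up per-page protection requests, then issue one mprotect
     * per maximal contiguous run that wants the same mode.  Queueing
     * also gives an opportunity to cancel no-op transitions (a
     * barrier turned off and back on before being flushed). */

    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    #define PAGE_SIZE 4096  /* assumed page size, as in the trace */

    typedef struct {
      void *base;   /* page-aligned start of the page */
      int prot;     /* desired protection, e.g. PROT_READ */
    } prot_request_t;

    /* Flush 'n' requests sorted by address, coalescing adjacent
     * pages that want the same protection into a single mprotect. */
    static void flush_protection_queue(prot_request_t *reqs, size_t n)
    {
      size_t i = 0;
      while (i < n) {
        char *start = reqs[i].base;
        int prot = reqs[i].prot;
        size_t len = PAGE_SIZE;
        size_t j = i + 1;
        /* Extend the run while pages are contiguous and agree. */
        while (j < n
               && (char *)reqs[j].base == start + len
               && reqs[j].prot == prot) {
          len += PAGE_SIZE;
          j++;
        }
        if (mprotect(start, len, prot) != 0) {
          perror("mprotect");
          abort();
        }
        i = j;
      }
    }

The branch names in the change history below (shield-coalesce, defer-write-barrier) suggest that the eventual fix took a similar direction.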
How found | manual_test |
Evidence | [1] <http://info.ravenbrook.com/mail/2012/11/07/16-21-59/0/>
[2] <https://info.ravenbrook.com/mail/2012/11/07/17-09-01/0/>
[3] <https://info.ravenbrook.com/mail/2012/11/08/11-00-48/0/> |
Observed in | 1.110.0 |
Created by | Gareth Rees |
Created on | 2012-11-07 17:27:42 |
Last modified by | Richard Brooksby |
Last modified on | 2016-03-24 17:29:51 |
History | 2012-11-07 GDR Created.
2014-02-03 GDR Mention job003674. |
Change | Effect | Date | User | Description
---|---|---|---|---
190380 | closed | 2016-03-24 12:07:20 | Richard Brooksby | Merging branch/2016-03-13/defer-write-barrier into master sources. |
190376 | closed | 2016-03-24 11:47:37 | Richard Brooksby | Merging branch/2016-03-11/shield-coalesce into master sources. |