Walking formatted objects

author Gareth Rees
date 2020-08-31
index terms pair: walk; design
revision $Id$
status complete design
tag design.mps.walk

Introduction

.intro: This is the design of the formatted objects walk interface. The intended audience is MPS developers.

.source: Based on [GDR_2020-08-30].

Use cases

.case.reload: A language runtime that offers hot reloading of code will need to walk all objects belonging to a class (say) in order to modify the references in the objects so they refer to the updated class definition. [Strömbäck_2020-08-20]

.case.serialize: A language runtime that offers serialization and deserialization of the heap will need to walk all formatted objects in order to identify references to globals (during serialization) and modify references to refer to the new locations of the globals (after deserialization). [GDR_2018-08-30]

Requirements

.req.walk.all: It must be possible for the client program to visit all automatically managed formatted objects using a callback.

.req.walk.assume-format: The callback should not need to switch on the format, as this may be awkward in a program which has modules using different pools with different formats.

.req.walk.examine: It must be possible for the callback to examine other automatically managed memory while walking the objects.

.req.walk.modify: It must be possible for the callback to modify the references in the objects.

.req.walk.overhead: The overhead of calling the callback should be minimized.

.req.walk.perf: The performance of subsequent collections should not be affected.

.req.walk.closure: The callback must have access to arbitrary data from the caller.

.req.walk.maint: The interface should be easy to implement and maintain.

Design

A new public function mps_pool_walk() visits the live formatted objects in an automatically managed pool.

.sol.walk.all: The client program must know which pools it has created so it can call mps_pool_walk() for each pool.

.sol.walk.assume-format: All objects in a pool share the same format, so the callback does not need to switch on the format.

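.sol.walk.examine: mps_pool_walk() must only be called when the arena is parked. No collection is in progress in the parked state, so there is no read barrier on any object and the callback can examine other automatically managed memory freely.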

.sol.walk.modify: mps_pool_walk() arranges for write-protection to be removed from each segment while it is being walked and restored afterwards if necessary.

.sol.walk.overhead: The callback is called for contiguous regions of formatted objects (not just for each object) where possible so that the per-object function call overhead is minimized.
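The effect of calling the callback per region rather than per object can be illustrated with a small self-contained model. This is a sketch only, not MPS code: the object layout (a header word holding the number of reference slots, followed by the slots) and the names toy_seg, toy_walk, retarget_closure and retarget are all hypothetical, invented for this illustration. The callback receives a contiguous region and the caller's closure pointer, and loops over the objects itself.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object layout (an assumption for this sketch): one header word
 * holding the number of reference slots, then that many slots. */
typedef size_t word_t;

/* Callback type: receives a contiguous region [base, limit) of
 * objects, plus the caller's closure pointer. */
typedef void (*region_fn)(word_t *base, word_t *limit, void *closure);

/* Toy "segment": one contiguous region of formatted objects. */
typedef struct {
    word_t *base;
    word_t *limit;
} toy_seg;

/* Walk every segment, calling the callback once per region rather
 * than once per object. */
static void toy_walk(const toy_seg *segs, size_t nsegs,
                     region_fn fn, void *closure)
{
    for (size_t i = 0; i < nsegs; ++i)
        fn(segs[i].base, segs[i].limit, closure);
}

/* Closure data for the example callback (hypothetical). */
typedef struct {
    size_t calls;     /* number of callback invocations */
    size_t objects;   /* number of objects visited */
    word_t old_ref;   /* reference value to replace */
    word_t new_ref;   /* replacement value */
} retarget_closure;

/* Example callback: iterates over the objects in the region and
 * replaces every slot equal to old_ref with new_ref. */
static void retarget(word_t *base, word_t *limit, void *closure)
{
    retarget_closure *c = closure;
    word_t *p = base;
    c->calls++;
    while (p < limit) {
        size_t nrefs = p[0];
        /* The object must lie entirely inside the region. */
        assert(nrefs < (size_t)(limit - p));
        for (size_t i = 1; i <= nrefs; ++i)
            if (p[i] == c->old_ref)
                p[i] = c->new_ref;
        c->objects++;
        p += 1 + nrefs;   /* skip to the next object */
    }
}
```

In this model, walking two segments that hold four objects between them costs two indirect calls instead of four; the per-object work happens in an ordinary loop inside the callback.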

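.sol.walk.perf: The callback uses the scanning protocol so that every reference is fixed and the summary of each walked segment is maintained. Because the summary stays accurate even when the callback modifies references, subsequent collections do not have to treat the walked segments more conservatively.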

.sol.walk.closure: mps_pool_walk() takes a closure pointer which is stored in the ScanState and passed to the callback.

.sol.walk.maint: We reuse the scanning protocol and provide a generic implementation that iterates over the ring of segments in the pool. We set up an empty white set in the ScanState so that the MPS_FIX1() test always fails and _mps_fix2() is never called. This avoids any per-pool code to support the interface.
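The reason an empty white set keeps the second-stage fix from being called can be seen in a toy model of a first-stage test based on zone sets. This is a simplified sketch, not the MPS's definitions of MPS_FIX1() or the ScanState: the names toy_scan_state, toy_zone_of, toy_fix1 and toy_scan, and the zone shift of 20 bits, are assumptions made for the illustration. The model also shows how the summary can still be accumulated while every first-stage test fails.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: addresses are partitioned into zones by a few address
 * bits, and a set of zones is a bitmask with one bit per zone. */
typedef uintptr_t zone_set;

#define TOY_ZONE_SHIFT 20u   /* assumed zone size of 1 MiB */
#define TOY_WORD_WIDTH (sizeof(zone_set) * CHAR_BIT)

/* Singleton set containing the zone of an address. */
static zone_set toy_zone_of(uintptr_t addr)
{
    return (zone_set)1 << ((addr >> TOY_ZONE_SHIFT) & (TOY_WORD_WIDTH - 1));
}

typedef struct {
    zone_set white;        /* zones of the objects being collected */
    zone_set summary;      /* zones of all references seen so far */
    unsigned fix2_calls;   /* how often the second stage would run */
} toy_scan_state;

/* First-stage test: records the reference's zone in the summary and
 * reports whether the reference might point to a white object. */
static int toy_fix1(toy_scan_state *ss, uintptr_t ref)
{
    zone_set z = toy_zone_of(ref);
    ss->summary |= z;
    return (ss->white & z) != 0;
}

/* Scan an array of references as a scanning callback would. */
static void toy_scan(toy_scan_state *ss, const uintptr_t *refs, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (toy_fix1(ss, refs[i]))
            ss->fix2_calls++;   /* stands in for the second-stage fix */
}
```

With white set to zero, toy_fix1() returns false for every reference, so fix2_calls stays at zero, yet summary still ends up holding the zones of all the references scanned. A non-empty white set, as in a real collection, makes the test succeed for references into the white zones.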

References

[GDR_2018-08-30] "Save/restore draft proposal"; Gareth Rees; 2018-08-30; <https://info.ravenbrook.com/mail/2018/08/30/12-57-09/0/>.
[GDR_2020-08-30] "Re: Modifying objects during mps_formatted_objects_walk"; Gareth Rees; 2020-08-30; <https://info.ravenbrook.com/mail/2020/08/31/19-17-03/0/>.
[Strömbäck_2020-08-20] "Modifying objects during mps_formatted_objects_walk"; Filip Strömbäck; 2020-08-20; <https://info.ravenbrook.com/mail/2020/08/20/21-01-34/0/>.

Document History