3. Garbage collecting a language with the Memory Pool System

Have you written the lexer, parser, code generator and the runtime system for your programming language, and come to the realization that you are going to need a memory manager too? If so, you’ve come to the right place.

In this guide, I’ll explain how to use the MPS to add incremental, moving, generational garbage collection to the runtime system for a programming language.

I’m assuming that you are familiar with the overall architecture of the MPS (see the chapter Overview of the Memory Pool System) and that you’ve downloaded and built the MPS (see the chapter Building the Memory Pool System).

3.1. The Scheme interpreter

As a running example throughout this guide, I’ll be using a small interpreter for a subset of the Scheme programming language. I’ll be quoting the relevant sections of code as needed, but you may find it helpful to experiment with this interpreter yourself, in either of its versions:

- scheme-malloc.c: the toy Scheme interpreter before integration with the MPS, using malloc and free for memory management.
- scheme.c: the toy Scheme interpreter after integration with the MPS.

This simple interpreter allocates two kinds of objects on the heap:

- All Scheme objects (there are no unboxed objects).
- The global symbol table: a hash table consisting of a vector of pointers to strings.

A Scheme object (whose type is not necessarily known) is represented by an obj_t, which is a pointer to a union of every type in the language:

    typedef union obj_u *obj_t;
    typedef union obj_u {
        type_s type;
        pair_s pair;
        symbol_s symbol;
        integer_s integer;
        special_s special;
        operator_s operator;
        string_s string;
        port_s port;
        character_s character;
        vector_s vector;
        table_s table;
        buckets_s buckets;
    } obj_s;

Each of these types is a structure whose first word is a number specifying the type of the object (TYPE_PAIR for pairs, TYPE_SYMBOL for symbols, and so on). For example, pairs are represented by a pointer to the structure pair_s defined as follows:

    typedef struct pair_s {
        type_t type;        /* TYPE_PAIR */
        obj_t car, cdr;     /* first and second projections */
    } pair_s;

Because the first word of every object is its type, functions can operate on objects generically, testing TYPE(obj) as necessary (which is a macro for obj->type.type). For example, the print() function is implemented like this:

    static void print(obj_t obj, unsigned depth, FILE *stream)
    {
        switch (TYPE(obj)) {
        case TYPE_INTEGER:
            fprintf(stream, "%ld", obj->integer.integer);
            break;

        case TYPE_SYMBOL:
            fputs(obj->symbol.string, stream);
            break;

        /* ... and so on for the other types ... */
        }
    }

Each constructor allocates memory for the new object by calling malloc. For example, make_pair is the constructor for pairs:

    static obj_t make_pair(obj_t car, obj_t cdr)
    {
        obj_t obj = (obj_t)malloc(sizeof(pair_s));
        if (obj == NULL) error("out of memory");
        obj->pair.type = TYPE_PAIR;
        CAR(obj) = car;
        CDR(obj) = cdr;
        return obj;
    }

Objects are never freed, because it is necessary to prove that they are dead before their memory can be reclaimed. To prove that they are dead, we need a tracing garbage collector, which the MPS will provide.
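The tagged-union layout described above can be tried in isolation. The following is a minimal standalone sketch, not code from scheme-malloc.c: it keeps only pairs and integers, and make_integer() and sum_integers() are names made up for this illustration. It shows how a generic function dispatches on TYPE(obj) without knowing the object's type in advance.

```c
#include <assert.h>
#include <stdlib.h>

/* Cut-down version of the representation above: only two types. */
typedef union obj_u *obj_t;
typedef int type_t;
enum { TYPE_PAIR, TYPE_INTEGER };

typedef struct type_s { type_t type; } type_s;
typedef struct pair_s { type_t type; obj_t car, cdr; } pair_s;
typedef struct integer_s { type_t type; long integer; } integer_s;

typedef union obj_u {
    type_s type;
    pair_s pair;
    integer_s integer;
} obj_s;

/* Every member starts with a type_t, so the type is always readable. */
#define TYPE(obj) ((obj)->type.type)
#define CAR(obj)  ((obj)->pair.car)
#define CDR(obj)  ((obj)->pair.cdr)

/* Hypothetical constructor for integers (illustration only). */
static obj_t make_integer(long n)
{
    obj_t obj = malloc(sizeof(integer_s));
    assert(obj != NULL);
    obj->integer.type = TYPE_INTEGER;
    obj->integer.integer = n;
    return obj;
}

static obj_t make_pair(obj_t car, obj_t cdr)
{
    obj_t obj = malloc(sizeof(pair_s));
    assert(obj != NULL);
    obj->pair.type = TYPE_PAIR;
    CAR(obj) = car;
    CDR(obj) = cdr;
    return obj;
}

/* Hypothetical generic function: sums every integer reachable from obj.
   NULL stands in for the empty list in this sketch. */
static long sum_integers(obj_t obj)
{
    if (obj == NULL) return 0;
    switch (TYPE(obj)) {
    case TYPE_INTEGER:
        return obj->integer.integer;
    case TYPE_PAIR:
        return sum_integers(CAR(obj)) + sum_integers(CDR(obj));
    default:
        return 0;
    }
}
```

As in the interpreter, nothing here is ever freed: the sketch has no way of knowing when an object is dead.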

3.2. Choosing an arena class

You’ll recall from the Overview of the Memory Pool System that the functionality of the MPS is divided between the arenas, which request memory from (and return it to) the operating system, and pools, which allocate blocks of memory for your program.

There are two main classes of arena: the client arena, mps_arena_class_cl(), which gets its memory from your program, and the virtual memory arena, mps_arena_class_vm(), which gets its memory from the operating system’s virtual memory interface.

The client arena is intended for use on embedded systems where there is no virtual memory, and has a couple of disadvantages (you have to decide how much memory you are going to use; and the MPS can’t return memory to the operating system for use by other processes) so for general-purpose programs you’ll want to use the virtual memory arena.

You’ll need a couple of headers: mps.h for the MPS interface, and mpsavm.h for the virtual memory arena class:

    #include "mps.h"
    #include "mpsavm.h"

There’s only one arena, and many MPS functions take an arena as an argument, so it makes sense for the arena to be a global variable rather than having to pass it around everywhere:

    static mps_arena_t arena;

Create an arena by calling mps_arena_create(). This function takes a third argument when creating a virtual memory arena: the amount of virtual address space (not RAM), in bytes, that the arena will reserve initially. The MPS will ask for more address space if it runs out, but the more times it has to extend its address space, the less efficient garbage collection will become. The MPS works best if you reserve an address space that is several times larger than your peak memory usage.
Let’s reserve 32 megabytes:

    mps_res_t res;
    res = mps_arena_create(&arena,
                           mps_arena_class_vm(),
                           (size_t)(32 * 1024 * 1024));
    if (res != MPS_RES_OK) error("Couldn't create arena");

mps_arena_create() is typical of functions in the MPS interface in that it stores its result in a location pointed to by an out parameter (here, &arena) and returns a result code, which is MPS_RES_OK if the function succeeded, or some other value if it failed.

Note

The MPS is designed to co-operate with other memory managers, so when integrating your language with the MPS you need not feel obliged to move all your memory management to the MPS: you can continue to use malloc and free to manage some of your memory, for example, while using the MPS for the rest. The toy Scheme interpreter illustrates this by continuing to use malloc and free to manage its global symbol table.

Topics

Arenas, Error handling.
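The calling convention just described (result stored through an out parameter, success or failure reported by the return value) is easy to imitate in plain C. The sketch below is an illustration only and does not use the MPS: toy_res_t, TOY_RES_OK, TOY_RES_MEMORY, toy_arena_t, toy_arena_create() and toy_arena_destroy() are all hypothetical names standing in for their MPS counterparts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for mps_res_t and its result codes. */
typedef int toy_res_t;
enum { TOY_RES_OK = 0, TOY_RES_MEMORY = 1 };

/* Hypothetical arena that only remembers how much it "reserved". */
typedef struct toy_arena_s { size_t reserved; } *toy_arena_t;

/* On success, stores the new arena through the out parameter and
   returns TOY_RES_OK. On failure, leaves *arena_o untouched and
   returns some other code. */
static toy_res_t toy_arena_create(toy_arena_t *arena_o, size_t size)
{
    toy_arena_t arena;
    assert(arena_o != NULL);
    if (size == 0) return TOY_RES_MEMORY;   /* nothing to reserve */
    arena = malloc(sizeof *arena);
    if (arena == NULL) return TOY_RES_MEMORY;
    arena->reserved = size;
    *arena_o = arena;
    return TOY_RES_OK;
}

static void toy_arena_destroy(toy_arena_t arena)
{
    free(arena);
}
```

Because the result code is the return value, a caller can test every call in the same way (compare against the OK code, report an error otherwise) while still receiving a typed handle through the out parameter.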

3.5. Generation chains

The AMC pool requires not only an object format but a generation chain. This specifies the generation structure of the generational garbage collection.

You create a generation chain by constructing an array of structures of type mps_gen_param_s, one for each generation, and passing them to mps_chain_create(). Each of these structures contains two values: the capacity of the generation in kilobytes, and the mortality, the proportion of objects in the generation that you expect to die in a collection of that generation.

These numbers are hints to the MPS that it may use to make decisions about when and what to collect: nothing will go wrong (other than suboptimal performance) if you make poor choices. Making good choices for the capacity and mortality of each generation is not easy, and is postponed to the chapter Tuning the Memory Pool System for performance.

Here’s the code for creating the generation chain for the toy Scheme interpreter:

    mps_gen_param_s obj_gen_params[] = {
        { 150, 0.85 },
        { 170, 0.45 },
    };

    res = mps_chain_create(&obj_chain,
                           arena,
                           LENGTH(obj_gen_params),
                           obj_gen_params);
    if (res != MPS_RES_OK) error("Couldn't create obj chain");

Note that these numbers have been deliberately chosen to be small, so that the MPS is forced to collect often and you can see it working. Don’t just copy these numbers unless you also want to see frequent garbage collections!

Topic

Garbage collection.
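To get a feel for what the two numbers in each entry mean, here is a small standalone sketch that does not use the MPS. The structure toy_gen_param_s is a hypothetical stand-in for mps_gen_param_s, expected_survivors_kb() is a made-up helper, and the LENGTH macro is an assumption about how the interpreter counts array elements. The arithmetic is only a reading of the hints (capacity times the proportion that does not die), not a description of how the MPS uses them.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array (not a pointer). Assumed to match
   the interpreter's LENGTH macro. */
#define LENGTH(array) (sizeof(array) / sizeof((array)[0]))

/* Hypothetical stand-in for mps_gen_param_s. */
typedef struct toy_gen_param_s {
    size_t capacity_kb;   /* capacity of the generation, in kilobytes */
    double mortality;     /* expected proportion that dies in a collection */
} toy_gen_param_s;

/* The same hints as in the guide's generation chain. */
static toy_gen_param_s toy_gen_params[] = {
    { 150, 0.85 },
    { 170, 0.45 },
};

/* Kilobytes expected to survive a collection of a full generation. */
static double expected_survivors_kb(const toy_gen_param_s *gen)
{
    assert(gen->mortality >= 0.0 && gen->mortality <= 1.0);
    return (double)gen->capacity_kb * (1.0 - gen->mortality);
}
```

With these hints, a full first generation is expected to leave about 22.5 kilobytes of survivors, and a full second generation about 93.5 kilobytes.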

3.6. Creating the pool

Now you know enough to create an AMC (Automatic Mostly-Copying) pool! Let’s review the pool creation code. First, the header for the AMC pool class:

    #include "mpscamc.h"

Second, the object format:

    struct mps_fmt_A_s obj_fmt_s = {
        sizeof(mps_word_t),
        obj_scan,
        obj_skip,
        NULL,
        obj_fwd,
        obj_isfwd,
        obj_pad,
    };

    mps_fmt_t obj_fmt;
    res = mps_fmt_create_A(&obj_fmt, arena, &obj_fmt_s);
    if (res != MPS_RES_OK) error("Couldn't create obj format");

Third, the generation chain:

    mps_gen_param_s obj_gen_params[] = {
        { 150, 0.85 },
        { 170, 0.45 },
    };

    mps_chain_t obj_chain;
    res = mps_chain_create(&obj_chain,
                           arena,
                           LENGTH(obj_gen_params),
                           obj_gen_params);
    if (res != MPS_RES_OK) error("Couldn't create obj chain");

And finally the pool:

    mps_pool_t obj_pool;
    res = mps_pool_create(&obj_pool,
                          arena,
                          mps_class_amc(),
                          obj_fmt,
                          obj_chain);
    if (res != MPS_RES_OK) error("Couldn't create obj pool");

3.7. Roots

The object format tells the MPS how to find references from one object to another. This allows the MPS to extrapolate the reachability property: if object A is reachable, and the scan method fixes a reference from A to another object B, then B is reachable too.

But how does this process get started? How does the MPS know which objects are reachable a priori? Such objects are known as roots, and you must register them with the MPS, creating root descriptions of type mps_root_t.

The most important root consists of the contents of the registers and the control stack of each thread in your program: this is covered in Threads, below.

Other roots may be found in static variables in your program, or in memory allocated by other memory managers. For these roots you must describe to the MPS how to scan them for references.

The toy Scheme interpreter has a number of static variables that point to heap-allocated objects. First, the special objects, including:

    static obj_t obj_empty;         /* (), the empty list */

Second, the predefined symbols, including:

    static obj_t obj_quote;         /* "quote" symbol */

And third, the global symbol table:

    static obj_t *symtab;
    static size_t symtab_size;

You tell the MPS how to scan these by writing root scanning functions of type mps_root_scan_t. These functions are similar to the scan method in an object format, described above.

In the case of the toy Scheme interpreter, the root scanning function for the special objects and the predefined symbols could be written like this:

    static mps_res_t globals_scan(mps_ss_t ss, void *p, size_t s)
    {
        MPS_SCAN_BEGIN(ss) {
            FIX(obj_empty);
            /* ... and so on for the special objects ... */
            FIX(obj_quote);
            /* ... and so on for the predefined symbols ... */
        } MPS_SCAN_END(ss);
        return MPS_RES_OK;
    }

but in fact the interpreter already has tables of these global objects, so it’s simpler and more extensible for the root scanning function to iterate over them:

    static mps_res_t globals_scan(mps_ss_t ss, void *p, size_t s)
    {
        MPS_SCAN_BEGIN(ss) {
            size_t i;
            for (i = 0; i < LENGTH(sptab); ++i)
                FIX(*sptab[i].varp);
            for (i = 0; i < LENGTH(isymtab); ++i)
                FIX(*isymtab[i].varp);
        } MPS_SCAN_END(ss);
        return MPS_RES_OK;
    }

Each root scanning function must be registered with the MPS by calling mps_root_create(), like this:

    mps_root_t globals_root;
    res = mps_root_create(&globals_root, arena, mps_rank_exact(), 0,
                          globals_scan, NULL, 0);
    if (res != MPS_RES_OK) error("Couldn't register globals root");

The third argument (here mps_rank_exact()) is the rank of references in the root. “Exact” means that:

- each reference in the root is a genuine pointer to another object managed by the MPS, or else a null pointer (unlike ambiguous references); and
- each reference keeps the target of the reference alive (unlike weak references).

The fourth argument is the root mode, which tells the MPS whether it is allowed to place a barrier on the root. The root mode 0 means that it is not allowed.

The sixth and seventh arguments (here NULL and 0) are passed to the root scanning function where they are received as the parameters p and s respectively. In this case there was no need to use them.

What about the global symbol table? This is trickier, because it gets rehashed from time to time, and during the rehashing process there are two copies of the symbol table in existence. Because the MPS is asynchronous, it might be scanning, moving, or collecting, at any point in time, and if it is doing so during the rehashing of the symbol table it had better scan both the old and new copies of the table.
This is most conveniently done by registering a new root to refer to the new copy, and then after the rehash has completed, de-registering the old root by calling mps_root_destroy().

It would be possible to write a root scanning function of type mps_root_scan_t, as described above, to fix the references in the global symbol table, but the case of a table of references is sufficiently common that the MPS provides a convenient (and optimized) function, mps_root_create_table(), for registering it:

    static mps_root_t symtab_root;

    /* ... */

    mps_addr_t ref = symtab;
    res = mps_root_create_table(&symtab_root, arena, mps_rank_exact(), 0,
                                ref, symtab_size);
    if (res != MPS_RES_OK) error("Couldn't register new symtab root");

The root must be re-registered whenever the global symbol table changes size:

    static void rehash(void)
    {
        obj_t *old_symtab = symtab;
        size_t old_symtab_size = symtab_size;
        mps_root_t old_symtab_root = symtab_root;
        size_t i;
        mps_addr_t ref;
        mps_res_t res;

        symtab_size *= 2;
        symtab = malloc(sizeof(obj_t) * symtab_size);
        if (symtab == NULL) error("out of memory");

        /* Initialize the new table to NULL so that "find" will work. */
        for (i = 0; i < symtab_size; ++i)
            symtab[i] = NULL;

        /* Register the new table while the old root still exists. */
        ref = symtab;
        res = mps_root_create_table(&symtab_root, arena, mps_rank_exact(), 0,
                                    ref, symtab_size);
        if (res != MPS_RES_OK) error("Couldn't register new symtab root");

        for (i = 0; i < old_symtab_size; ++i)
            if (old_symtab[i] != NULL) {
                obj_t *where = find(old_symtab[i]->symbol.string);
                assert(where != NULL);    /* new table shouldn't be full */
                assert(*where == NULL);   /* shouldn't be in new table */
                *where = old_symtab[i];
            }

        /* De-register the old root before freeing its memory. */
        mps_root_destroy(old_symtab_root);
        free(old_symtab);
    }

Notes

- The old root description (referring to the old copy of the symbol table) is not destroyed until after the new root description has been registered. This is because the MPS is asynchronous: it might be scanning, moving, or collecting, at any point in time. If the old root description were destroyed before the new root description was registered, there would be a period during which:

  - the symbol table was not reachable (at least as far as the MPS was concerned) and so all the objects referenced by it (and all the objects reachable from those objects) might be dead; and
  - if the MPS moved an object, it would not know that the object was referenced by the symbol table, and so would not update the reference there to point to the new location of the object. This would result in out-of-date references in the old symbol table, and these would be copied into the new symbol table.

- The root might be scanned as soon as it is registered, so it is important to fill it with scannable references (NULL in this case) before registering it.

- The order of operations at the end is important: the old root must be de-registered before its memory is freed.

Topic

Roots.
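The reachability argument in this section (everything referenced from a root is reachable, and everything referenced from a reachable object is reachable too) can be seen in a few lines of standalone C. This is a toy mark phase written for illustration, not the way the MPS traces: node_s, mark() and mark_from_roots() are hypothetical names, and nothing here moves objects or runs incrementally.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REFS 2

/* Hypothetical toy object: a mark bit and up to two outgoing references. */
typedef struct node_s {
    int marked;
    struct node_s *refs[MAX_REFS];
} node_s;

/* If obj is reachable, everything it references is reachable too.
   The mark bit stops the recursion on cycles. */
static void mark(node_s *obj)
{
    size_t i;
    if (obj == NULL || obj->marked) return;
    obj->marked = 1;
    for (i = 0; i < MAX_REFS; ++i)
        mark(obj->refs[i]);
}

/* Scan a table of roots. NULL entries are valid and are skipped, which
   is why a freshly created table must be filled with NULL before it is
   handed to the scanner. */
static void mark_from_roots(node_s **roots, size_t count)
{
    size_t i;
    assert(roots != NULL || count == 0);
    for (i = 0; i < count; ++i)
        mark(roots[i]);
}
```

An object that is referenced only from a table the scanner was never told about stays unmarked, which is the toy equivalent of the symbol table's contents appearing dead while no root describes it.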

3.8. Threads

In a multi-threaded environment where incremental garbage collection is used, you must register each of your threads with the MPS so that the MPS can examine their state.

Even in a single-threaded environment (like the toy Scheme interpreter) it may also be necessary to register the (only) thread if either of these conditions apply:

- you are using moving garbage collection (as with the AMC (Automatic Mostly-Copying) pool);
- the thread’s registers and control stack constitute a root (that is, objects may be kept alive via references in local variables: this is almost always the case for programs written in C).

You register a thread with an arena by calling mps_thread_reg():

    mps_thr_t thread;
    res = mps_thread_reg(&thread, arena);
    if (res != MPS_RES_OK) error("Couldn't register thread");

You register the thread’s registers and control stack as a root by calling mps_root_create_reg() and passing mps_stack_scan_ambig():

    void *marker = &marker;
    mps_root_t reg_root;
    res = mps_root_create_reg(&reg_root,
                              arena,
                              mps_rank_ambig(),
                              0,
                              thread,
                              mps_stack_scan_ambig,
                              marker,
                              0);
    if (res != MPS_RES_OK) error("Couldn't create root");

In order to scan the control stack, the MPS needs to know where the bottom of the stack is, and that’s the role of the marker variable: the compiler places it on the stack, so its address is a position within the stack. As long as you don’t exit from this function while the MPS is running, your program’s active local variables will always be higher up on the stack than marker, and so will be scanned for references by the MPS.

Topic

Threads.
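The stack root above is registered with ambiguous rank because a word on the stack might be a pointer or might be an integer that merely looks like one. The standalone sketch below illustrates that idea under simplifying assumptions: the "stack" is an ordinary array of words, the "heap" is a single address range, and count_ambiguous_refs() is a hypothetical function, not part of the MPS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts the words in base[0 .. count-1] whose value falls inside the
   heap range [heap_lo, heap_hi). A conservative scanner must treat each
   such word as a possible reference, because it cannot tell a genuine
   pointer from an integer with the same bit pattern. */
static size_t count_ambiguous_refs(const uintptr_t *base, size_t count,
                                   uintptr_t heap_lo, uintptr_t heap_hi)
{
    size_t i, found = 0;
    assert(heap_lo <= heap_hi);
    for (i = 0; i < count; ++i)
        if (base[i] >= heap_lo && base[i] < heap_hi)
            ++found;
    return found;
}
```

A word that only might be a pointer can keep its target alive, but it cannot safely be rewritten, since overwriting it would corrupt the program if it were really an integer.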

3.9. Allocation

It probably seemed a long journey to get here, but at last we’re ready to start allocating.

Manual pools typically support malloc-like allocation using the function mps_alloc(). But automatic pools cannot, because of the following problem:

    static obj_t make_pair(obj_t car, obj_t cdr)
    {
        obj_t obj;
        mps_addr_t addr;
        mps_res_t res;
        res = mps_alloc(&addr, pool, sizeof(pair_s));
        if (res != MPS_RES_OK) error("out of memory in make_pair");
        obj = addr;

        /* What happens if the MPS scans obj just now? */

        obj->pair.type = TYPE_PAIR;
        CAR(obj) = car;
        CDR(obj) = cdr;
        return obj;
    }

Because the MPS is asynchronous, it might scan any reachable object at any time, including immediately after the object has been allocated. In this case, if the MPS attempts to scan obj at the indicated point, the object’s type field will be uninitialized, and so the scan method may abort.

The MPS solves this problem via the fast, nearly lock-free Allocation point protocol. This needs an additional structure, an allocation point, to be attached to the pool by calling mps_ap_create():

    static mps_ap_t obj_ap;

    /* ... */

    res = mps_ap_create(&obj_ap, obj_pool, mps_rank_exact());
    if (res != MPS_RES_OK) error("Couldn't create obj allocation point");

And then the constructor can be implemented like this:

    static obj_t make_pair(obj_t car, obj_t cdr)
    {
        obj_t obj;
        mps_addr_t addr;
        size_t size = ALIGN(sizeof(pair_s));
        do {
            mps_res_t res = mps_reserve(&addr, obj_ap, size);
            if (res != MPS_RES_OK) error("out of memory in make_pair");
            obj = addr;
            obj->pair.type = TYPE_PAIR;
            CAR(obj) = car;
            CDR(obj) = cdr;
        } while (!mps_commit(obj_ap, addr, size));
        return obj;
    }

The function mps_reserve() allocates a block of memory that the MPS knows is uninitialized: the MPS promises not to scan this block or move it until after it is committed by calling mps_commit(). So the new object can be initialized safely.
However, there’s a second problem:

            CAR(obj) = car;
            CDR(obj) = cdr;

            /* What if the MPS moves car or cdr just now? */

        } while (!mps_commit(obj_ap, addr, size));

Because obj is not yet committed, the MPS won’t scan it, and that means that it won’t discover that it contains references to car and cdr, and so won’t update these references to point to their new locations.

In such a circumstance (that is, when objects have moved since you called mps_reserve()), mps_commit() returns false, and we have to initialize the object again (most conveniently done via a do ... while loop, as here).

Notes

- When using the Allocation point protocol it is up to you to ensure that the requested size is aligned, because mps_reserve() is on the MPS’s critical path, and so it is highly optimized: in nearly all cases it is just an increment to a pointer and a test.

- It is very rare for mps_commit() to return false, but in the course of millions of allocations even very rare events occur, so it is important not to do anything you don’t want to repeat between calling mps_reserve() and mps_commit(). Also, the shorter the interval, the less likely mps_commit() is to return false.

Topic

Allocation.
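The reserve, initialize, commit, retry cycle can be exercised without the MPS. The sketch below is a toy model built on assumptions, not the MPS implementation: toy_ap_s, toy_reserve(), toy_commit() and make_long() are hypothetical names, the "collector" is simulated by bumping an epoch counter between reserve and commit, and the ALIGN macro is an assumed definition that rounds up to pointer size.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT sizeof(void *)
/* Round size up to a multiple of ALIGNMENT (a power of two). */
#define ALIGN(size) (((size) + ALIGNMENT - 1) & ~(ALIGNMENT - 1))

/* Hypothetical toy allocation point: a bump pointer over a buffer. */
typedef struct toy_ap_s {
    char *init;               /* start of the block being initialized */
    char *alloc;              /* end of the reserved block */
    char *limit;              /* end of the buffer */
    unsigned epoch;           /* bumped when the toy collector "moves" */
    unsigned reserved_epoch;  /* epoch seen by the last reserve */
} toy_ap_s;

/* Reserve: a test and a pointer increment. Returns 0 when out of space. */
static int toy_reserve(void **addr_o, toy_ap_s *ap, size_t size)
{
    assert(size == ALIGN(size));          /* caller must align the size */
    if ((size_t)(ap->limit - ap->init) < size) return 0;
    ap->alloc = ap->init + size;
    ap->reserved_epoch = ap->epoch;
    *addr_o = ap->init;
    return 1;
}

/* Commit: fails if the collector ran since the matching reserve. */
static int toy_commit(toy_ap_s *ap, void *addr, size_t size)
{
    assert(addr == (void *)ap->init && ap->alloc == ap->init + size);
    if (ap->reserved_epoch != ap->epoch) {
        ap->alloc = ap->init;             /* discard the block */
        return 0;
    }
    ap->init = ap->alloc;                 /* block now belongs to the heap */
    return 1;
}

/* Hypothetical constructor. `flips` is how many times the simulated
   collector interrupts; *attempts counts loop iterations. */
static long *make_long(toy_ap_s *ap, long value, int flips, int *attempts)
{
    void *addr;
    long *obj;
    size_t size = ALIGN(sizeof(long));
    *attempts = 0;
    do {
        if (!toy_reserve(&addr, ap, size)) return NULL;
        obj = addr;
        *obj = value;                     /* initialization is repeated */
        ++*attempts;
        if (flips > 0) { --flips; ++ap->epoch; }  /* simulated collector */
    } while (!toy_commit(ap, addr, size));
    return obj;
}
```

Everything inside the loop body runs again after a failed commit, which is why the notes above advise keeping the work between reserve and commit short and repeatable.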

3.11. Tidying up

When your program is done with the MPS, it’s good practice to tear down all the MPS data structures. This causes the MPS to check the consistency of its data structures and report any problems it detects. It also causes the MPS to flush its telemetry stream.

MPS data structures must be destroyed or deregistered in the reverse order to that in which they were registered or created. So you must destroy all allocation points created in a pool before destroying the pool; destroy all roots and pools, and deregister all threads, that were created in an arena before destroying the arena, and so on.

Here’s the tear-down code from the toy Scheme interpreter:

    mps_ap_destroy(obj_ap);
    mps_pool_destroy(obj_pool);
    mps_chain_destroy(obj_chain);
    mps_fmt_destroy(obj_fmt);
    mps_root_destroy(reg_root);
    mps_thread_dereg(thread);
    mps_arena_destroy(arena);
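The ordering rule can be modelled with a counter. In this standalone sketch, which does not use the MPS, toy_owner_s stands for anything that other structures are created in (an arena, or a pool with allocation points), and all three functions are hypothetical. The toy simply refuses to destroy an owner that still has live children; how the MPS itself reacts to a wrong order is not modelled here.

```c
#include <assert.h>

/* Hypothetical owner (think "arena" or "pool") that tracks how many
   structures created inside it are still alive. */
typedef struct toy_owner_s {
    int live_children;
    int destroyed;
} toy_owner_s;

static void toy_child_create(toy_owner_s *owner)
{
    assert(!owner->destroyed);
    ++owner->live_children;
}

static void toy_child_destroy(toy_owner_s *owner)
{
    assert(owner->live_children > 0);
    --owner->live_children;
}

/* Returns 1 on success, 0 if something created in the owner still exists. */
static int toy_owner_destroy(toy_owner_s *owner)
{
    if (owner->live_children != 0) return 0;
    owner->destroyed = 1;
    return 1;
}
```

Tearing down in reverse order of creation guarantees that every owner's count has reached zero by the time its own destroy call is made.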