Garbage Collection

In this assignment, you will implement two garbage collectors: mark & sweep, and stop & copy. As we have seen, garbage collection involves starting from a root set of references, which reside in local variables and stack frames, and searching for reachable values on the heap. We have developed support code that gives your collector access to a running program's root set. You just have to implement the garbage-collection algorithms.

For this assignment, you will no longer be using the PLAI language levels. Instead, set the language level to (module ...).

Download the support code for the garbage collector. This includes several files:

memory-core.ss

You will need to require this file into your allocator. It defines the following abstractions:

allocator.ss

We provide a trivial allocator that does not perform garbage collection (i.e., it just gives up after running out of memory). You will need to provide your own allocator modules—one for each collection algorithm—that implement the same interface. You can implement both of your collectors in the same module and use a flag to select between them, or you can make separate files and create a symbolic link to allocator.ss. The interface is as follows:
gc:set-heap-size! : num -> () This procedure should set the size of the heap to the given number. This is the first thing any mutator must do, so no allocation should be possible until this call is made.
gc:alloc-flat : num U sym U bool U empty -> loc This procedure should allocate a flat Scheme value (number, symbol, boolean, or empty list) on the heap, returning its location (a number). The value should occupy a single heap cell, though you may use additional space to store a tag, etc. You are also welcome to pre-allocate common constants (e.g., the empty list). This procedure may need to perform a garbage-collection. If there is still insufficient space, it should raise an error.
gc:number? : loc -> loc This procedure should return the address of a true value iff the content of the given location is a number. Otherwise it should return the address of a false value.
gc:symbol? : loc -> boolean This procedure should return the address of a true value iff the content of the given location is a symbol. Otherwise it should return the address of a false value.
gc:boolean? : loc -> boolean This procedure should return the address of a true value iff the content of the given location is a boolean. Otherwise it should return the address of a false value.
gc:empty? : loc -> boolean This procedure should return the address of a true value iff the content of the given location is an empty list. Otherwise it should return the address of a false value.
gc:deref : loc -> num U sym U bool U empty Given the location of a flat Scheme value, this procedure should return that value; if the location points instead to a cons cell, this function should raise an error.
gc:cons : loc loc -> loc This procedure should allocate a cons cell on the heap, whose first and rest are the given arguments. These fields must be stored in separate heap cells, and you may use additional space for a tag, etc. As with alloc-flat, this procedure may need to perform a garbage-collection, and if there is still insufficient space it should raise an error.
gc:cons? : loc -> boolean This should return true iff the location refers to a cons cell
gc:first : loc -> loc If the given loc refers to a cons cell, this should return the first field. Otherwise, it should raise an error.
gc:rest : loc -> loc If the given loc refers to a cons cell, this should return the rest field. Otherwise, it should raise an error.
gc:loc->scheme-val : loc -> any For debugging: this procedure consumes a location and returns a Scheme value corresponding to the contents of the location.

make-mutator.ss

This module defines a language in which you can implement test applications (mutators) for your collectors. A program that uses this language module will expose its roots through get-root-set, allocate its data on your allocator's heap, and use the versions of primitives defined in your allocator. (When the mutator refers to procedures like cons or symbol?, the calls are automatically redirected to the gc:cons and gc:symbol? defined in your allocator.)

sample-mutator.ss

This serves as an example of a mutator. It takes the form of a module defined in the make-mutator.ss language. It declares a heap size of 100 cells, then defines and applies some simple functions, printing out the results. You will want to test against this module, probably with smaller heap sizes, and also to write your own mutators. When you write your own mutator, make sure the first thing you do is to set the heap size by calling (heap-size size).

Notes

During collection, you will need to use get-root-set to obtain the current set of roots. In gc:alloc-flat, you should not list any additional identifiers. However, in gc:cons, you must provide the arguments (the first and rest fields), since these are roots. (If you don't list them, you will need to treat them specially and be sure to update them with set!).

In the mark & sweep collector, you should maintain a free list in the heap. That is, you should not use any auxiliary data structure; instead, use the available space in the heap to keep track of the free list. You may use one extra box (“register”) to point to the start of the free list. You may need to adjust your allocation accordingly to have enough space to maintain free list pointers!

You will probably want to use numbers to represent locations, but you may use another representation if it eases your implementation. Also, to help see what is happening, you might want to print a representation of the heap before and after every garbage collection. You can use the Scheme printf function:

(printf "Heap is ~a~n" Heap)

Some final words of advice:

FAQ

get-root-set seems to return duplicate roots. Is this a bug in the support code? No, this is the intended behavior. Roots come from walking the stack, and several frames may refer to the same variable. Moreover, different variables may refer to the same heap location. You need to update all of them in your stop & copy collector.

It would be a lot easier if I could just make all objects consume the same amount of space. Would that be acceptable? No! We've been extremely easy-going in only making you handle two sizes. To reduce it to one size would make the problem trivial and unrealistic.