An Artificial Garbage Collection Benchmark

The directory gc_bench contains an artificial garbage collector benchmark. The original benchmark was written by John Ellis and Pete Kovac of Post Communications. It was then heavily modified by Hans Boehm, then at SGI. It was translated by William Clinger of Northeastern University to Scheme and C++. Hans Boehm, now at HP, produced the C translation, and added the -DGC option to the C and C++ versions to allow use with the Boehm/Demers/Weiser conservative GC.

The benchmark should run in roughly 32MB or less on most systems, and some systems run it in about half that. A Pentium III/500 runs the garbage collected C version in under 8 seconds. Some Java implementations have similar or better performance. Some are an order of magnitude slower.

In addition to an overall execution time, the benchmark reports the time required to allocate and drop complete binary trees of various sizes. All reported times cover similar amounts of allocation, performed while some longer-lived data structures are kept live. Generational collectors generally exhibit much better performance for the smaller tree sizes, whereas nongenerational collectors tend to have a flatter performance profile.
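The core allocation pattern described above can be sketched roughly as follows. This is an illustrative reconstruction, not the benchmark's actual code; the names (Node, make_tree, drop_tree) are invented for this sketch, and the explicit free() calls stand in for what the garbage-collected versions leave to the collector.

```c
#include <stdlib.h>

/* Illustrative sketch of the benchmark's pattern: build a complete
   binary tree of a given depth, then drop it.  A tree of depth d
   contains 2^d - 1 nodes. */
typedef struct Node {
    struct Node *left, *right;
} Node;

static Node *make_tree(int depth) {
    if (depth <= 0)
        return NULL;
    Node *n = malloc(sizeof(Node));
    n->left  = make_tree(depth - 1);
    n->right = make_tree(depth - 1);
    return n;
}

static long count_nodes(const Node *n) {
    if (n == NULL)
        return 0;
    return 1 + count_nodes(n->left) + count_nodes(n->right);
}

static void drop_tree(Node *n) {
    if (n == NULL)
        return;
    drop_tree(n->left);
    drop_tree(n->right);
    free(n);  /* with the conservative GC, the collector does this instead */
}
```

Because every node is the same size and the trees are allocated in a tight recursive loop, the allocation pattern is very regular, which is one of the deficiencies noted below.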

This benchmark appears to have been used by a number of vendors to aid in Java VM development. That probably makes it less desirable as a means to compare VMs. (It also has some known deficiencies, e.g. the allocation pattern is too regular, and it leaves too few "holes" between live objects.) It now appears to be most useful as a sanity test for garbage collector developers.

Profiling

The directory also contains a small utility that can be used to get instruction-level profiles on some Unix-like systems. We have found this useful for getting a handle on cache issues on systems that do not provide more elaborate support for such profiling. The C version of the benchmark has an option to generate such profiles.

(The code has been used on X86 and IA64 Linux systems. It requires minimal porting for other systems.)

Code to be profiled should call init_profiling() at startup and dump_profile() before termination. This generates a list of addresses and counts on stderr. If stderr is redirected into prof.out, something like the following will generate a readable version of the profile:

nm {option for BSD-style output} {executable} > nm.out
cat nm.out prof.out | sort > profile