Why Conservative Garbage Collectors?

This is a very rough draft. Some references and attributions are missing. Hudson, Moss and others measured safe point space costs. David Chase pointed out that the testability advantages of copying, though he may not have been the first to do so. Etc.)

Conservative garbage collectors have traditionally been used in conjunction with programming languages like C and C++, which were not originally designed with a garbage collector in mind. In such an environment they typically provide the only feasible solution. Even if one were willing to rewrite the compiler from the ground up to provide pointer identification information for the garbage collector, some language features (notably unions and pointers to union fields) make it very difficult to provide this information at reasonable cost.

Here we address the tradeoffs for implementations of languages designed with garbage collection in mind. For these languages, completely conservative collectors are clearly not optimal. But complete pointer identification information, i.e. precise collection may also be too expensive, making hybrids interesting.

We will largely ignore implementation convenience issues here. These tend to heavily favor conservative collectors.

We will assume that either kind of collector is implemented correctly, with sufficient compiler support. In the case of a conservative collector that requires some minimal precautions to ensure that all objects that can be accessed by the generated code appear reachable to the collector. (Details) In the case of a precise collector, it requires much more elaborate compiler-collector interactions.

We will assume that we have type information available for at least a large fraction of the bytes in the heap, as well as for most statically allocated objects. Type information for static variables involves essentially no runtime cost and minimal space cost. In many environments, type descriptors for heap objects are either already present, or can be easily and cheaply provided. (These descriptors may not exist as bits in memory. The same effect can often be obtained by segregating object types in memory.)

We will assume an environment that includes threads, which may be preempted. Although specialized single-thread implementations are also interesting for some applications, it is more and more essential that any serious programming environment at least optionally support threads. And garbage collection is even more important in a multi-threaded environment.

Thus the issue is primarily whether we accommodate ambiguous pointers from the execution stacks. if we wish to eliminate such ambiguous pointers, we need to ensure that the garbage collector is activated only when all execution points (i.e. all program counter values and return addresses) are at points at which the garbage collector can locate pointers in the local frame. These execution points are normally called safe points.

Costs are incurred as a result of introducing safe points into code and providing the associated information to the garbage collector. We will be interested primarily in two kinds of partially conservative collectors that can reduce these safe point costs. Both can also accommodate additional ambiguous pointers, for example, to accommodate easier C/C++ intercallability:

Thread stacks are scanned conservatively. Thus no safe points are needed. (Heap objects and statically allocated objects still need descriptors.)
Only the top frame of each thread stack is scanned conservatively. Thus safe points are only needed at procedure call sites. This both reduces their frequency, and places them where the amount of information that needs to be communicated to the collector is likely to be relatively small.

The costs of conservative garbage collection

Conservative pointer identification in the garbage collector does appear to have some costs:

Pointer identification cost. Empirical evidence suggests that the time cost of dismissing a nonpointer in a conservative collector is often not much different from that in a nonconservative collector. Most nonpointers are quickly recognized. However, it is fairly expensive (on the order of half a dozen memory references) for a conservative collector to verify that something is the address of an object. If all pointer fields are known to either be NIL or point to the heap, this is much less expensive for a precise collector. On the other side, making all pointers reference the heap often requires extra copying, e.g. for strings.
Unpredictability of space use Although space utilization of conservative garbage collectors is typically indistinguishable from more precise collectors, it is hard to give hard bounds on space utilization. This is nearly irrelevant for some applications and completely unacceptable for others.
Note however that there are some circumstances under which space use of a conservative collector is bounded. See here for details.
Higher cost of short-lived allocation Conventional precise garbage collectors can support a form of generational collection in which young objects can be moved and thus physically separated from older objects, and remembered sets can be maintained precisely.
All of these help to facilitate very cheap reclamation of very short-lived objects. Physical separation makes it easy to identify and reclaim garbage after live young objects have been traced. The fact that surviving objects can be moved makes allocation itself very cheap. Precise remembered set information makes it cheaper to identify surviving young objects.
Fully conservative collectors cannot practically move objects to segregate the young ones. This is however less true for partially conservative collectors, e.g. Bartlett's mostly copying collector.
Conservative collectors typically cannot obtain precise remembered set information, since they cannot get mutator cooperation for pointer writes. Instead our collector maintains "page dirty" information with the aid of the virtual memory system. This only localizes potential pointers to young objects within a physical page. If the write-to-allocation ratio is low, it can also be relatively expensive to maintain. If pointer writes are frequent, even a precise collector might be forced to use this kind of strategy, however.
Neither of these effects is likely to be significant in the absence of generational collection. Remembered sets are not needed. For reasonable heap sizes, garbage collection copying costs will outweigh allocation costs. But both effects are likely to reduce the benefits of generational garbage collection.

Overall, empirical evidence suggests that conservative collectors, even fully conservative collectors, are performance competitive except for programs which allocate larger volumes of very short-lived objects than C or C++ programs are likely to. They are usually competitive on space usage, though the weak guarantees are occasionally problematic. There are rare occasions on which fully conservative collectors, i.e. collectors that conservatively scan the heap, retain too much memory. These can be addressed by providing limited type information.

The Costs of Precise Pointer Identification

Precise collectors also have a number of performance costs, which often outweigh those of conservative pointer identification.

Ease of inter-language interoperability Any practical programming environment must support easy calls into third party library applications, which are typically provided as binaries generated from C or C++ source code. A conservative collector can support such inter-language calls very conveniently. The garbage collector automatically considers pointers in C or C++ local variables. Pointers elsewhere can be trivially added to the garbage collector's root set, possibly even by default without programmer intervention. Native code can allocate memory in a section of the heap traced by the garbage collector.
(Technically, this requires that the libraries have been suitably compiled so as not to hide pointers from the collector. Empirically, this is essentially always true.)
This problem becomes much more difficult, and requires a much more elaborate protocol (e.g. the Java Native Interface), if a precise collector must be supported. Failure to follow the protocol, such as failure to register a cached object reference, leads to intermittent heap corruption.
Performance penalty for non-allocating code Inserting safe points in performance critical, non-allocating loops can be expensive, especially if the conventions used to communicate with the garbage collector disallow derived pointers in registers at the safe point. (Derived pointers are pointers used to refer to an object that are not the base address of the object.) Such a restriction would, for example, disallow most standard induction variable optimizations, as well as preventing motion of some loop invariant array access code out of loops. This is much less of a problem if derived pointers are allowed at safe points, though some more esoteric optimizations may still become impossible, and manual optimization for critical code becomes risky.
Precise garbage collectors generally do require safe points in all loops to ensure that any thread can be advanced to a safe point in bounded time.
Performance of inter-language interoperability The protocol required for inter-language calls with a precise collector can incur significant overhead. In the case of a conservative collector, the protocol is essentially nonexistent, and hence free.
Potentially higher reliability One of the more important clients of C calls is usually the language run-time system. Complexity of the protocol increases the probability of undetected errors. Similarly errors in safe-point descriptors may not be promptly recognized.
(Copying garbage collectors do make it a bit easier to detect such errors more promptly when they are exercised, since it is easier to trap references to "garbage". A nonmoving collector will not be able to unmap all garbage. But it is still necessary to find test cases that exercise the problem.)
Size of descriptor information, need to single-step It is probably too expensive to make every program point a safe-point. Thus a precise collector must be able to single-step a process to the next safe point. Depending on the environment, this may be nontrivial.

The Bottom Line

The costs associated with the two approaches are not easily comparable. Very allocation intensive programs with hard space requirements, short object lifetimes and no calls to C/C++ libraries tend to favor precise collectors. Numerical inner loops or C/C++ calls favor conservative collection.