Garbage Collection and Weak References

Garbage collection is a core function of the runtime. Its purpose is to reclaim memory consumed by objects that are no longer referenced. The emphasis in this statement is on memory and references: The garbage collector is responsible only for reclaiming memory; it does not handle other resources such as database connections, handles (files, windows, etc.), network ports, and hardware devices such as serial ports. Also, the garbage collector determines what to clean up based on whether any references remain. Implicitly, this means that the garbage collector works with reference objects and reclaims memory on the heap only. Additionally, it means that maintaining a reference to an object will delay the garbage collector from reusing the memory consumed by the object.

Advanced Topic
Garbage Collection in .NET

Many details about the garbage collector pertain to the specific CLI framework and therefore could vary. This section discusses the Microsoft .NET framework implementations because they are the most prevalent.

In .NET, the garbage collector uses a mark-and-compact algorithm. At the beginning of an iteration, it identifies all root references to objects. Root references are any references from static variables, CPU registers, and local variables or parameter instances (and f-reachable objects, as described later in this section). Given this list, the garbage collector is able to walk through the tree identified by each root reference and determine recursively all the objects to which the root references point. In this manner, the garbage collector creates a graph of all reachable objects.

Instead of enumerating all the inaccessible objects, the garbage collector performs garbage collection by compacting all reachable objects next to one another, thereby overwriting any memory consumed by objects that are inaccessible (and thus qualify as garbage).

Locating and moving all reachable objects requires that the system maintain a consistent state while the garbage collector runs. To achieve this, all managed threads within the process are halted during garbage collection. This behavior can result in brief pauses in an application, which are generally insignificant unless a garbage collection cycle is particularly large or cycles occur very frequently. To reduce the likelihood of a garbage collection cycle occurring at an inopportune time, the System.GC class includes a static Collect() method, which can be called immediately before the performance-critical code. Calling this method does not prevent the garbage collector from running later, but it does reduce the probability that it will run during the performance-critical code, assuming no intense memory utilization occurs there.
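As a minimal sketch of calling Collect() immediately before performance-critical code, consider the following. The CriticalSection class and its RunAfterCollect() method are hypothetical names introduced only for illustration; GC.Collect() and Stopwatch are the only framework APIs involved.

```csharp
using System;
using System.Diagnostics;

public static class CriticalSection
{
    // Hypothetical helper (not a framework API): forces a collection,
    // runs the supplied work, and returns how long the work took.
    public static TimeSpan RunAfterCollect(Action criticalWork)
    {
        // Collect now so that a cycle is less likely (though not
        // impossible) while criticalWork is executing.
        GC.Collect();

        Stopwatch stopwatch = Stopwatch.StartNew();
        criticalWork();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

A caller would pass the time-sensitive work as a delegate, for example CriticalSection.RunAfterCollect(() => ProcessFrame()), where ProcessFrame() stands in for whatever code must not be interrupted. If that work allocates heavily, a collection can still occur in the middle of it.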

One perhaps surprising aspect of .NET garbage collection behavior is that not all garbage is necessarily cleaned up during an iteration. Studies of object lifetimes reveal that recently created objects are more likely to need garbage collection than long-standing objects. Capitalizing on this behavior, the .NET garbage collector is generational, attempting to clean up short-lived objects more frequently than objects that have already survived a previous garbage collection iteration. Specifically, objects are organized into three generations. Each time an object survives a garbage collection cycle, it is moved to the next generation, until it ends up in generation 2 (counting starts from zero). The garbage collector, then, runs more frequently for objects in generation 0 than it does for objects in generation 2.
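To observe the promotion between generations, the following sketch records the generation of a single object before each of several forced collections. GenerationDemo and ObserveGenerations() are hypothetical names used for illustration; GC.GetGeneration(), GC.MaxGeneration, and GC.KeepAlive() are existing members of System.GC.

```csharp
using System;

public static class GenerationDemo
{
    // Hypothetical helper: returns the generation reported for one
    // object before each forced collection.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] generations = new int[GC.MaxGeneration + 1];

        for (int i = 0; i < generations.Length; i++)
        {
            // Record the generation, then force a collection that
            // the object survives because it is still referenced.
            generations[i] = GC.GetGeneration(survivor);
            GC.Collect();
        }

        // Ensure the reference is considered live until this point.
        GC.KeepAlive(survivor);
        return generations;
    }
}
```

On a typical runtime the reported generation rises with each collection the object survives until it reaches GC.MaxGeneration, but the exact sequence is an implementation detail of the collector and should not be relied on in production code.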

Over time, in spite of the trepidation that .NET stirred during its early beta releases when compared with unmanaged code, .NET’s garbage collection has proved extremely efficient. More important, the gains realized in development productivity have far outweighed the costs in development for the few cases where managed code is dropped to optimize particular algorithms.

All references discussed so far are strong references because they maintain an object’s accessibility and prevent the garbage collector from cleaning up the memory consumed by the object. The framework also supports the concept of weak references. Weak references do not prevent garbage collection on an object, but they do maintain a reference so that if the garbage collector does not clean up the object, it can be reused.
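The difference between the two kinds of reference can be sketched as follows. WeakReferenceDemo and its methods are hypothetical names introduced for illustration; WeakReference<T>, TryGetTarget(), GC.Collect(), and GC.KeepAlive() are existing framework APIs.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class WeakReferenceDemo
{
    // While a strong reference exists, the weak reference's target
    // survives a collection.
    public static bool SurvivesWhileStronglyReferenced()
    {
        byte[] strong = new byte[1000];
        WeakReference<byte[]> weak = new WeakReference<byte[]>(strong);

        GC.Collect();
        bool alive = weak.TryGetTarget(out _);

        // Keep the strong reference live through the collection above.
        GC.KeepAlive(strong);
        return alive;
    }

    // Creates the array in a separate, non-inlined method so that no
    // local variable in the caller holds a strong reference to it.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference<byte[]> CreateWithoutStrongReference()
    {
        return new WeakReference<byte[]>(new byte[1000]);
    }

    // With only a weak reference remaining, the target is eligible
    // for collection, so this usually (not always) returns false.
    public static bool SurvivesWithoutStrongReference()
    {
        WeakReference<byte[]> weak = CreateWithoutStrongReference();
        GC.Collect();
        return weak.TryGetTarget(out _);
    }
}
```

SurvivesWhileStronglyReferenced() returns true because the local variable keeps the array reachable. SurvivesWithoutStrongReference() generally returns false, but the result depends on the runtime and build configuration, which is why code that uses weak references must always handle both outcomes.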

Weak references are designed for reference objects that are expensive to create, yet too expensive to keep around. Consider, for example, a large list of objects loaded from a database and displayed to the user. The loading of this list is potentially expensive, and once the user closes the list, it should be available for garbage collection. However, if the user requests the list multiple times, a second expensive load call will always be required. With weak references, it becomes possible to use code to check whether the list has been cleaned up, and if not, to re-reference the same list. In this way, weak references serve as a memory cache for objects. Objects within the cache are retrieved quickly, but if the garbage collector has recovered the memory of these objects, they will need to be re-created.

Once a reference object (or collection of objects) is recognized as worthy of potential weak reference consideration, it needs to be assigned to System.WeakReference (see Listing 10.14).

Listing 10.14: Using a Weak Reference
public static class ByteArrayDataSource
{
    private static byte[] LoadData()
    {
        // Imagine a much larger number
        byte[] data = new byte[1000];
        // Load data
        // ...
        return data;
    }
 
    private static WeakReference<byte[]>? Data { get; set; }
 
    public static byte[] GetData()
    {
        byte[]? target;
        if (Data is null)
        {
            target = LoadData();
            Data = new WeakReference<byte[]>(target);
            return target;
        }
        else if (Data.TryGetTarget(out target))
        {
            return target;
        }
        else
        {
            // Reload the data and assign it (creating a strong
            // reference) before setting WeakReference's Target
            // and returning it.
            target = LoadData();
            Data.SetTarget(target);
            return target;
        }
    }
}
 
// ...

Admittedly, this code uses generics, which aren’t discussed in this book until Chapter 12. However, you can safely ignore the <byte[]> text both when declaring the Data property and when assigning it. While there is a nongeneric version of WeakReference, there is little reason to consider it.[8]

The bulk of the logic appears in the GetData() method. The purpose of this method is to always return an instance of the data, whether from the cache or by reloading it. GetData() begins by checking whether the Data property is null. If it is, the data is loaded and assigned to a local variable called target. This creates a strong reference to the data so that the garbage collector will not clear it. Next, we instantiate a WeakReference and pass it a reference to the loaded data so that the WeakReference object has a handle to the data (its target); then, if requested, such an instance can be returned. Do not pass WeakReference an instance that is not also held by a local variable, because without that strong reference the instance might get cleaned up before you have a chance to return it (i.e., do not call new WeakReference<byte[]>(LoadData())).

If the Data property already has an instance of WeakReference, then the code calls TryGetTarget() and, if there is an instance, assigns target, thus creating a reference so that the garbage collector will no longer clean up the data.

Lastly, if WeakReference’s TryGetTarget() method returns false, we load the data, assign the reference with a call to SetTarget(), and return the newly instantiated object.

________________________________________

8. Unless programming with a version earlier than .NET Framework 4.5, where the generic WeakReference<T> is not available.