Even after the runtime converts the CIL code to machine code and starts to execute it, the runtime continues to maintain control of the execution. Code that executes under the control of an agent such as the runtime is managed code, and the process of executing under control of the runtime is managed execution. This control extends to the data as well: the data is managed data because the runtime automatically allocates and de-allocates the memory for it.
Somewhat inconsistently, the term Common Language Runtime is not technically a generic term that is part of the CLI. Rather, CLR is the Microsoft-specific implementation of the runtime for the .NET framework. Regardless, CLR is casually used as a generic term for runtime, and the technically accurate term, Virtual Execution System, is seldom used outside the context of the CLI specification.
Because an agent controls program execution, it is possible to inject additional services into a program, even though programmers did not explicitly code for them. Managed code, therefore, provides information to allow these services to be attached. Among other items, managed code enables the location of metadata about a type member, exception handling, access to security information, and the capability to walk the stack. The remainder of this section includes a description of some additional services made available via the runtime and managed execution. The CLI does not explicitly require all of them, but the established CLI frameworks have an implementation of each.
Garbage collection is the process of automatically de-allocating memory according to the program’s needs. Memory cleanup is a significant programming problem for languages that don’t have an automated system for performing it. Without a garbage collector, programmers must remember to free every memory allocation they make. Forgetting to do so causes memory leaks, and freeing the same allocation more than once can corrupt memory; both problems are exacerbated in long-running programs such as web servers. Because of the runtime’s built-in support for garbage collection, programmers targeting runtime execution can focus on adding program features rather than on the “plumbing” related to memory management.
The exact mechanics for how the garbage collector works are not part of the CLI specification; therefore, each implementation can take a slightly different approach. (In fact, garbage collection is one item not explicitly required by the CLI.) One key concept with which C++ programmers may need to become familiar is the notion that garbage-collected objects are not necessarily collected deterministically (at well-defined, compile-time–known locations). In fact, objects can be garbage-collected anytime between when they are last accessed and when the program shuts down. This includes collection before the object falls out of scope and collection well after the object instance is no longer accessible by the code.
The garbage collector takes responsibility only for handling memory management; that is, it does not provide an automated system for managing resources unrelated to memory. Therefore, if an explicit action to free a resource (other than memory) is required, programmers using that resource should utilize special CLI-compatible programming patterns that will aid in the cleanup of those resources (see Chapter 10).
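As an illustration of such a cleanup pattern, the following minimal sketch assumes the pattern in question is `IDisposable` combined with a `using` statement. The `TemporaryFile` class and its `FilePath` property are hypothetical names introduced only for this example; the temporary file stands in for any resource other than memory.

```csharp
using System;
using System.IO;

// Hypothetical wrapper around a non-memory resource (a temporary file).
public sealed class TemporaryFile : IDisposable
{
    private bool _disposed;

    // Path.GetTempFileName() creates an empty file and returns its path.
    public string FilePath { get; } = Path.GetTempFileName();

    public void Dispose()
    {
        if (_disposed) return;   // Dispose must be safe to call more than once.
        File.Delete(FilePath);   // Release the resource deterministically.
        _disposed = true;
    }
}

public static class Program
{
    public static void Main()
    {
        string path;
        using (var file = new TemporaryFile())
        {
            path = file.FilePath;
            File.WriteAllText(path, "scratch data");
            Console.WriteLine(File.Exists(path));   // The file exists inside the block.
        }

        // Dispose() ran when the using block exited, independent of when
        // (or whether) the garbage collector reclaims the object's memory.
        Console.WriteLine(File.Exists(path));
    }
}
```

The point of the sketch is that the file is released at a known location in the code, whereas the memory for the `TemporaryFile` instance is still reclaimed whenever the garbage collector chooses.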
For those reading this chapter out of order, most implementations of the CLI use a generational, compacting, mark-and-sweep–based algorithm to reclaim memory. It is “generational” because objects that have lived for only a short period will be cleaned up sooner than objects that have already survived garbage collection sweeps because they were still in use. This convention conforms to the general pattern of memory allocation, in which objects that have been around longer will continue to outlive objects that have only recently been instantiated.
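On the Microsoft .NET implementations, generations can be observed through the `System.GC` class. The following sketch is for exploration only: the exact generation numbers it prints depend on the runtime and its configuration, so no particular output should be assumed.

```csharp
using System;

public static class Program
{
    public static void Main()
    {
        var survivor = new object();

        // A newly allocated small object normally starts in the youngest generation.
        Console.WriteLine($"Before collection: generation {GC.GetGeneration(survivor)}");

        // Forcing a collection is for demonstration only; production code
        // should normally leave scheduling to the runtime.
        GC.Collect();

        // 'survivor' is still referenced, so it outlives the collection
        // and is typically promoted to an older generation.
        Console.WriteLine($"After collection: generation {GC.GetGeneration(survivor)}");
        Console.WriteLine($"Oldest generation: {GC.MaxGeneration}");

        // Keep the reference alive up to this point.
        GC.KeepAlive(survivor);
    }
}
```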
Additionally, the .NET garbage collector uses a mark-and-sweep algorithm. During each garbage collection execution, it marks the objects that are still reachable, de-allocates the rest, and compacts together the objects that remain so that there is no “dirty” space between them. Using compaction to fill in the space left by de-allocated objects often results in faster instantiation of new objects (than is possible with unmanaged code), because it is not necessary to search through memory to locate space for a new allocation. Compaction also decreases the chance of paging because more objects are located in the same page, which improves performance as well.
The garbage collector takes into consideration the resources on the machine and the demand on those resources at execution time. For example, if memory on the computer is still largely untapped, the garbage collector is less likely to run and take time to clean up those resources. This optimization is rarely taken by execution environments and languages that are not based on garbage collection.
Similarly, if you are reading this chapter out of order, you may not be aware as yet that one of the key advantages offered by the runtime is checking conversions between types, known as type checking. Via type checking, the runtime prevents programmers from unintentionally introducing invalid casts that can lead to buffer overrun vulnerabilities. Such vulnerabilities are one of the most common means of breaking into a computer system—and having the runtime automatically prevent these holes from opening is a significant gain.2 Type checking provided by the runtime ensures the following:
Given appropriate permissions, it is possible to circumvent encapsulation and access modifiers via a mechanism known as reflection. Reflection provides late binding by enabling support for browsing through a type’s members, looking up the names of particular constructs within an object’s metadata, and invoking the type’s members.
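A minimal sketch of reflection bypassing an access modifier follows. The `Account` type, its `_balance` field, and the `ReadPrivateField` helper are hypothetical names used only for illustration; the reflection calls themselves (`Type.GetField`, `Type.GetFields`, `FieldInfo.GetValue`, and `BindingFlags`) are part of `System.Reflection`.

```csharp
using System;
using System.Reflection;

// Hypothetical type with an encapsulated field.
public class Account
{
    private int _balance = 42;
}

public static class Program
{
    // Looks up a private instance field by name at runtime and reads its value.
    public static object ReadPrivateField(object target, string fieldName)
    {
        FieldInfo field = target.GetType().GetField(
            fieldName, BindingFlags.Instance | BindingFlags.NonPublic);
        if (field == null)
        {
            throw new MissingFieldException(target.GetType().Name, fieldName);
        }
        return field.GetValue(target);
    }

    public static void Main()
    {
        var account = new Account();

        // Browse the type's non-public instance fields through its metadata.
        foreach (FieldInfo f in typeof(Account).GetFields(
            BindingFlags.Instance | BindingFlags.NonPublic))
        {
            Console.WriteLine($"{f.FieldType.Name} {f.Name}");
        }

        // Read a field that ordinary code outside Account could not access.
        Console.WriteLine(ReadPrivateField(account, "_balance"));
    }
}
```

Because the member is located by name at execution time rather than at compile time, a misspelled field name surfaces only when the code runs, which is the usual cost of late binding.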
C# programs are platform-portable: they can run on multiple operating systems (cross-platform support) and execute on different CLI implementations. Portability in this context does not mean merely recompiling source code for each platform; rather, a single CLI module compiled for one framework can run on any CLI-compatible framework without being recompiled. This portability occurs because the work of porting the code lies in the hands of the runtime implementation rather than the application developer (thanks to the .NET Standard). The restriction is, of course, that no platform-specific APIs can be used in your cross-platform code. When developing a cross-platform application, developers can package, or refactor, common code into cross-platform–compatible libraries and then call the libraries from platform-specific code to reduce the total amount of code required to support cross-platform applications.
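The split between shared and platform-aware code might be sketched as follows. `Greeter`, `BuildGreeting`, and `DetectPlatform` are hypothetical names; in a real solution the shared class would typically live in its own library project. The platform check uses `RuntimeInformation.IsOSPlatform` from `System.Runtime.InteropServices`.

```csharp
using System;
using System.Runtime.InteropServices;

// Shared code: relies on no platform-specific APIs, so the same compiled
// module can be used from any platform-specific caller.
public static class Greeter
{
    public static string BuildGreeting(string platformName) =>
        $"Hello from {platformName}";
}

// Platform-aware caller: the only place that asks which platform it is on.
public static class Program
{
    public static string DetectPlatform()
    {
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows)) return "Windows";
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux)) return "Linux";
        if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX)) return "macOS";
        return "another platform";
    }

    public static void Main()
    {
        Console.WriteLine(Greeter.BuildGreeting(DetectPlatform()));
    }
}
```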
Many programmers accustomed to writing unmanaged code will correctly point out that managed environments impose overhead on applications, no matter how simple they are. The trade-off is one of increased development productivity and reduced bugs in managed code versus runtime performance. The same dichotomy emerged as programming went from assembler to higher-level languages such as C, and from structured programming to object-oriented development. In the majority of scenarios, development productivity wins out, especially as the speed and reduced price of hardware surpass the demands of applications. Time spent on architectural design is much more likely to yield big performance gains than the complexities of low-level development. In the climate of security holes caused by buffer overruns, managed execution is even more compelling.
Undoubtedly, certain development scenarios (e.g., device drivers) may not yet fit with managed execution. However, as managed execution increases in capability and sophistication, many of these performance considerations will likely vanish. Unmanaged execution will then be reserved for development where precise control or circumvention of the runtime is deemed necessary.3
Furthermore, the runtime introduces several factors that can contribute to improved performance over native compilation. For example, because translation to machine code takes place on the destination machine, the resultant compiled code matches the processor and memory layout of that machine, resulting in performance gains generally not leveraged by non-jitted languages. Also, the runtime is able to respond to execution conditions that direct compilation to machine code rarely takes into account. If, for example, the box has more memory than is required, unmanaged languages will still de-allocate their memory at deterministic, compile-time–defined execution points in the code. In contrast, JIT-compiled languages running under a garbage collector need to de-allocate memory only when it is running low or when the program is shutting down. Even though jitting can add a compile step to the execution process, code efficiencies that a jitter can insert may lead to performance rivaling that of programs compiled directly to machine code. Ultimately, CLI programs are not necessarily faster than non-CLI programs, but their performance is competitive.
________________________________________