Clearly, the C# compiler generates a significant amount of code for the record struct and record class constructs. However, all the behavior is customizable. As mentioned earlier in the chapter, for example, you can add any additional members, including properties, fields, constructors, and methods. More importantly, you can provide your own versions of the generated members. By declaring a record member whose signature matches that of the member the compiler would otherwise synthesize, you replace the default behavior with your own. If you prefer a read-only property rather than an init-only setter property, you simply declare the property (or field) to match the positional property name. Or, by providing your own copy constructor, you can inject custom behavior into how a record class handles cloning (via the with operator). Similarly, if you want to change the semantics of equality, perhaps to simplify it to a subset of the record’s properties/fields, you can define your own Equals() method, likely only the overload that takes a single parameter of the containing type. Listing 9.24 provides several examples of possible customizations on a record.
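As a minimal sketch of the customizations just described (not a reproduction of Listing 9.24), the following record class replaces a synthesized property, the copy constructor, and the typed Equals() method. The Coordinate type, its Label member, and the CopyCount counter are hypothetical names introduced only for illustration.

```csharp
#nullable enable
using System;

// Hypothetical record used only to illustrate the customization points.
public record class Coordinate(double Latitude, double Longitude, string Label)
{
    // Matching the positional parameter name replaces the synthesized
    // init-only property with a read-only one.
    public double Latitude { get; } = Latitude;

    // Hypothetical counter that makes the custom copy constructor observable.
    public int CopyCount { get; private set; }

    // Custom copy constructor; the with operator invokes it when cloning.
    // A hand-written copy constructor must copy every member itself.
    protected Coordinate(Coordinate original)
    {
        Latitude = original.Latitude;
        Longitude = original.Longitude;
        Label = original.Label;
        CopyCount = original.CopyCount + 1;
    }

    // Custom equality over a subset of the members
    // (Label and CopyCount are deliberately ignored).
    public virtual bool Equals(Coordinate? other) =>
        other is not null
        && EqualityContract == other.EqualityContract
        && Latitude == other.Latitude
        && Longitude == other.Longitude;

    // Keep the hash code consistent with the custom Equals().
    public override int GetHashCode() => HashCode.Combine(Latitude, Longitude);
}
```

With these definitions, an expression such as `home with { Label = "Office" }` runs the custom copy constructor, and the result still compares equal to `home` because Label does not participate in equality.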
We know that variables of value types directly contain their data, whereas variables of reference types contain a reference to another storage location. But what happens when a value type is converted to one of its implemented interfaces or to its root base class, object? The result of the conversion must be a reference to a storage location that contains something that looks like an instance of a reference type, but the variable contains a value of value type. Such a conversion, which is known as boxing, has special behavior. Converting a variable of a value type that directly refers to its data to a reference type that refers to a location on the garbage-collected heap involves several steps: memory is allocated on the heap for the value type’s data, the value is copied from its current storage location into the newly allocated location, and the result of the conversion is a reference to that new location.
The reverse operation is called unboxing. The unboxing conversion first checks whether the type of the boxed value is the same as the type to which the value is being unboxed, and then results in a copy of the value stored in the heap location.
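The copy semantics of both conversions can be seen in a small sketch. The class and method names here are hypothetical and chosen only for this illustration.

```csharp
using System;

public static class BoxingCopyDemo
{
    // Returns the value of the original variable and the value read back
    // from the box after the variable has been changed.
    public static (int Variable, int Unboxed) Run()
    {
        int number = 42;
        object thing = number;     // Boxing: the value 42 is copied to the heap.

        number = 7;                // Changes only the variable, not the box.

        int unboxed = (int)thing;  // Unboxing: type check, then a copy out of the box.
        return (number, unboxed);  // (7, 42)
    }
}
```

Because the box holds its own copy, changing number after the boxing conversion has no effect on the value later unboxed from thing.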
Boxing and unboxing are important to consider because boxing has some performance and behavioral implications. Besides learning how to recognize these conversions within C# code, a developer can count the box/unbox instructions in a particular snippet of code by looking through the Common Intermediate Language (CIL). Each operation has specific instructions, as shown in Table 9.1.
C# Code:

    static void Main()
    {
        int number;
        object thing;

        number = 42;

        // Boxing
        thing = number;

        // Unboxing
        number = (int)thing;

        return;
    }

CIL Code:

    .method private hidebysig static void Main() cil managed
    {
      .entrypoint
      // Code size 21 (0x15)
      .maxstack 1
      .locals init ([0] int32 number,
                    [1] object thing)
      IL_0000: nop
      IL_0001: ldc.i4.s 42
      IL_0003: stloc.0
      IL_0004: ldloc.0
      IL_0005: box [mscorlib]System.Int32
      IL_000a: stloc.1
      IL_000b: ldloc.1
      IL_000c: unbox.any [mscorlib]System.Int32
      IL_0011: stloc.0
      IL_0012: br.s IL_0014
      IL_0014: ret
    } // end of method Program::Main
When boxing and unboxing occur infrequently, their implications for performance are irrelevant. However, boxing can occur in some unexpected situations, and frequent occurrences can have a significant impact on performance. Consider Listing 9.25 and Output 9.1. The ArrayList type maintains a list of references to objects, so adding an integer or floating-point number to the list will box the value so that a reference can be obtained.
The code shown in Listing 9.25, when compiled, produces five box instructions and three unbox instructions in the resultant CIL.
Every boxing operation involves both an allocation and a copy; every unboxing operation involves a type check and a copy. Doing the equivalent work using the unboxed type would eliminate the allocation and type check. Obviously, you can easily improve this code’s performance by eliminating many of the boxing operations. Using an object rather than double in the last foreach loop is one such improvement. Another would be to change the ArrayList data type to a generic collection (see Chapter 12). The point is that boxing can be rather subtle, so developers need to watch for situations where it could occur repeatedly and affect performance.
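The following sketch contrasts the two approaches. It is not the book's listing; the class and method names are hypothetical, and the Fibonacci-style sequence is just a convenient workload. Both methods compute the same sum, but only the ArrayList version boxes on every Add() and unboxes on every read.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingInCollections
{
    // ArrayList stores object references, so every double is boxed on Add()
    // and unboxed by each (double) cast or foreach iteration.
    public static double SumWithArrayList()
    {
        ArrayList list = new ArrayList();
        list.Add(0.0);   // box
        list.Add(1.0);   // box
        for (int count = 2; count < 10; count++)
        {
            // Two unbox operations, then one box for the new element.
            list.Add((double)list[count - 1] + (double)list[count - 2]);
        }

        double sum = 0;
        foreach (double item in list)   // unbox on each iteration
        {
            sum += item;
        }
        return sum;
    }

    // List<double> stores the doubles directly; no boxing or unboxing occurs.
    public static double SumWithGenericList()
    {
        List<double> list = new List<double> { 0.0, 1.0 };
        for (int count = 2; count < 10; count++)
        {
            list.Add(list[count - 1] + list[count - 2]);
        }

        double sum = 0;
        foreach (double item in list)
        {
            sum += item;
        }
        return sum;
    }
}
```

The generic version also gains compile-time type checking, since only values convertible to double can be added to the list.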
Another unfortunate boxing-related problem also occurs at runtime: If you call Add() without first casting to a double (or using a double literal), you insert integers into the array list. Because ints are implicitly converted to doubles, this would appear to be an innocuous modification. However, the casts to double when retrieving the values within the foreach loop would fail. The problem is that the unbox operation is immediately followed by an attempt to copy the memory of the boxed int into a double. You cannot do this without first casting to an int; otherwise, the code throws an InvalidCastException at execution time. Listing 9.26 shows a similar error commented out and followed by the correct cast.
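A minimal sketch of this failure and its fix follows (the names are hypothetical, and this is not Listing 9.26 itself). Unboxing must name the exact boxed type; the numeric conversion to double can happen only after the value has been unboxed as an int.

```csharp
using System;

public static class UnboxCastDemo
{
    public static (bool DirectCastThrows, double Value) Run()
    {
        object boxed = 42;   // The literal is an int, so an int is boxed.

        bool directCastThrows = false;
        try
        {
            // Fails: the box contains an int, not a double.
            _ = (double)boxed;
        }
        catch (InvalidCastException)
        {
            directCastThrows = true;
        }

        // Correct: unbox to int first, then convert the int to double.
        double value = (double)(int)boxed;

        return (directCastThrows, value);   // (true, 42)
    }
}
```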
C# supports a lock statement for synchronizing code. This statement compiles down to System.Threading.Monitor’s Enter() and Exit() methods, which must be called in pairs. Enter() records the unique reference argument passed so that when Exit() is called with the same reference, the lock can be released. The trouble with using value types is the boxing. Each time Enter() or Exit() is called in such a case, a new value is created on the heap. Comparing the reference of one copy to the reference of a different copy will always return false, so you cannot hook up Enter() with the corresponding Exit(). Therefore, value types in the lock() statement are not allowed.
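The reference mismatch is easy to observe without any threading code. In this sketch (hypothetical names), the same value-type variable is boxed twice and the two resulting references are compared; a dedicated reference-type field is shown as the usual alternative to lock on.

```csharp
using System;

public static class LockOnValueTypeDemo
{
    // A reference-type object that can safely be used with lock.
    private static readonly object _Sync = new object();
    private static int _Counter;

    // Boxing the same variable twice yields two distinct heap objects.
    public static bool BoxesShareReference()
    {
        int sync = 0;
        object first = sync;    // first box
        object second = sync;   // second, separate box
        return object.ReferenceEquals(first, second);   // false
    }

    // lock compiles to paired Monitor.Enter()/Monitor.Exit() calls on _Sync,
    // and both calls see the same reference.
    public static int Increment()
    {
        lock (_Sync)
        {
            return ++_Counter;
        }
    }
}
```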
Listing 9.27 points out a few more runtime boxing idiosyncrasies, and Output 9.2 shows the results.
Listing 9.27 uses the Angle struct and IAngle interface. Note also that the IAngle.MoveTo() interface changes Angle to be mutable. This change brings out some of the idiosyncrasies of mutable value types and, in so doing, demonstrates the importance of the guideline that advocates making structs immutable.
In Example 1 of Listing 9.27, after you initialize angle, you then box it into a variable called objectAngle. Next, Example 2 calls MoveTo() to change _Degrees to 26. However, as the output demonstrates, no change actually occurs the first time. The problem is that to call MoveTo(), the compiler unboxes objectAngle and (by definition) makes a copy of the value. Value types are copied by value; that is why they are called value types. Although the resultant value is successfully modified at execution time, this copy of the value is discarded and no change occurs on the heap location referenced by objectAngle.
Recall our analogy that suggested variables of value types are like pieces of paper with the value written on them. When you box a value, you make a photocopy of the paper and put the copy in a box. When you unbox the value, you make a photocopy of the paper in the box. Making an edit to this second copy does not change the copy that is in the box.
In Example 3, a similar problem occurs, but in reverse. Instead of calling MoveTo() directly, the value is cast to IAngle. The conversion to an interface type boxes the value, so the runtime copies the data in angle to the heap and provides a reference to that box. Next, the method call modifies the value in the referenced box. The value stored in variable angle remains unmodified.
In the last case, the cast to IAngle is a reference conversion, not a boxing conversion. The value has already been boxed by the conversion to object in this case, so no copy of the value occurs on this conversion. The call to MoveTo() updates the _Degrees value stored in the box, and the code behaves as desired.
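The four cases can be reproduced with a sketch along the following lines. This is an approximation of the scenario rather than Listing 9.27 itself: the Angle and IAngle members shown are assumed from the description above, and the MutableStructDemo class is hypothetical.

```csharp
using System;

public interface IAngle
{
    void MoveTo(int degrees, int minutes, int seconds);
}

// Deliberately mutable, for illustration only; structs should be immutable.
public struct Angle : IAngle
{
    private int _Degrees;
    private int _Minutes;
    private int _Seconds;

    public Angle(int degrees, int minutes, int seconds)
    {
        _Degrees = degrees;
        _Minutes = minutes;
        _Seconds = seconds;
    }

    public int Degrees => _Degrees;
    public int Minutes => _Minutes;
    public int Seconds => _Seconds;

    public void MoveTo(int degrees, int minutes, int seconds)
    {
        _Degrees = degrees;
        _Minutes = minutes;
        _Seconds = seconds;
    }
}

public static class MutableStructDemo
{
    // Returns the Degrees value observed after each of the four examples.
    public static int[] Run()
    {
        // Example 1: box the initialized value.
        Angle angle = new Angle(25, 58, 23);
        object objectAngle = angle;
        int example1 = ((Angle)objectAngle).Degrees;   // 25

        // Example 2: unboxing produces a temporary copy; the copy is mutated
        // and discarded, so the box is unchanged.
        ((Angle)objectAngle).MoveTo(26, 58, 23);
        int example2 = ((Angle)objectAngle).Degrees;   // 25

        // Example 3: the cast boxes a copy of angle; the new box is mutated,
        // so the variable angle is unchanged.
        ((IAngle)angle).MoveTo(26, 58, 23);
        int example3 = angle.Degrees;                  // 25

        // Example 4: a reference conversion on the existing box; the value
        // stored in the box itself is mutated.
        ((IAngle)objectAngle).MoveTo(26, 58, 23);
        int example4 = ((Angle)objectAngle).Degrees;   // 26

        return new[] { example1, example2, example3, example4 };
    }
}
```

Only the last call changes a value that the code can still observe, because it is the only one whose receiver is the storage location inside the existing box.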
As you can see from this example, mutable value types are quite confusing because it is often unclear when you are mutating a copy of the value rather than the storage location you actually intend to change. By avoiding mutable value types in the first place, you can eliminate this sort of confusion.
Anytime a method is called on a value type, the value type receiving the call (represented by this in the body of the method) must be a variable, not a value, because the method might be trying to mutate the receiver. Clearly, it must be mutating the receiver’s storage location, rather than mutating a copy of the receiver’s value and then discarding it. Examples 2 and 4 of Listing 9.27 illustrate how this fact affects the performance of a method invocation on a boxed value type.
In Example 2, the unboxing conversion logically produces the boxed value, not a reference to the storage location on the heap that contains the boxed copy. Which storage location, then, is passed via this to the mutating method call? It cannot be the storage location from the box on the heap, because the unboxing conversion produces a copy of that value, not a reference to that storage location.
When this situation arises—a variable of a value type is required but only a value is available—one of two things happens: either the C# compiler generates code that makes a new, temporary storage location and copies the value from the box into the new location, resulting in the temporary storage location becoming the needed variable, or the compiler produces an error and disallows the operation. In this case, the former strategy is used. The new temporary storage location is then the receiver of the call; after it is mutated, the temporary storage location is discarded.
This process—performing a type check of the boxed value, unboxing to produce the storage location of the boxed value, allocating a temporary variable, copying the value from the box to the temporary variable, and then calling the method with the location of the temporary storage—happens every time you use the unbox-and-then-call pattern, regardless of whether the method actually mutates the variable. Clearly, if it does not mutate the variable, some of this work could be avoided. Because the C# compiler does not know whether any particular method you call will try to mutate the receiver, it must err on the side of caution.
These expenses are all eliminated when calling an interface method on a boxed value type. In such a case, the expectation is that the receiver will be the storage location in the box; if the interface method mutates the storage location, it is the boxed location that should be mutated. Therefore, the expense of performing a type check, allocating new temporary storage, and making a copy is avoided. Instead, the runtime simply uses the storage location in the box as the receiver of the call to the struct’s method.
In Listing 9.28, we call the two-argument version of ToString() that is found on the IFormattable interface, which is implemented by the int value type. In this example, the receiver of the call is a boxed value type, but it is not unboxed to make the call to the interface method.
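A call of this kind might look like the following sketch (not Listing 9.28 itself; the class and method names are hypothetical). The cast to IFormattable is a reference conversion on the existing box, so no unboxing is needed to make the call.

```csharp
using System;

public static class InterfaceCallOnBoxDemo
{
    public static string FormatBoxed()
    {
        int number = 42;
        object thing = number;   // Boxing

        // Reference conversion to the interface; the receiver of the call
        // is the storage location inside the box.
        return ((IFormattable)thing).ToString("X", null);   // "2A"
    }
}
```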
Now suppose that we had instead called the virtual ToString() method declared by object with an instance of a value type as the receiver. What happens then? Is the instance boxed, unboxed, or something else? A number of different scenarios are possible depending on the details: