Platform Interoperability and Unsafe Code
C# has great capabilities, especially when you consider that the underlying framework is entirely managed. Sometimes, however, you need to escape out of all the safety that C# provides and step back into the world of memory addresses and pointers. C# supports this in two significant ways. The first is Platform Invoke (P/Invoke), which calls into APIs exposed by unmanaged dynamic link libraries (DLLs). The second is unsafe code, which enables access to memory pointers and addresses.
The majority of the chapter discusses interoperability with unmanaged code and the use of unsafe code. This discussion culminates with a small program that determines the processor ID of a computer. The code requires that you do the following:
Aside from the P/Invoke and unsafe constructs covered here, the complete listing demonstrates the full power of C# and the fact that the capabilities of unmanaged code are still accessible from C# and managed code.
Whether a developer is trying to call a library of existing unmanaged code, access unmanaged code in the operating system that is not exposed in any managed API, or achieve maximum performance for an algorithm by avoiding the runtime overhead of type checking and garbage collection, at some point there must be a call into unmanaged code. The Common Language Infrastructure (CLI) provides this capability through P/Invoke. With P/Invoke, you can make API calls into exported functions of unmanaged DLLs.
The APIs invoked in this section are Windows APIs. Although the same APIs are not available on other platforms, developers can still use P/Invoke for APIs native to their operating systems or for calls into their own DLLs. The guidelines and syntax are the same.
Once the target function is identified, the next step of P/Invoke is to declare the function with managed code. Just as with all regular methods that belong to a class, you need to declare the targeted API within the context of a class, but by using the extern modifier. Listing 23.1 demonstrates how to do this.
In this case, the class is VirtualMemoryManager, because it will contain functions associated with managing memory. (This particular function is available directly off the System.Diagnostics.Process class, so there is no need to declare it in real code.) Note that the method returns an IntPtr; this type is explained in the next section.
Extern methods never include a body and are (almost) always static. Instead of a method body, the DllImport attribute, which accompanies the method declaration, points to the implementation. At a minimum, the attribute needs the name of the DLL that defines the function. The runtime determines the function name from the method name, although you can override this default by using the EntryPoint named parameter to provide the function name. (The .NET framework will automatically attempt calls to the Unicode [...W] or ANSI [...A] API version.)
In this case, the external function, GetCurrentProcess(), retrieves a pseudohandle for the current process that you will use in the call for virtual memory allocation. Here’s the unmanaged declaration:
HANDLE GetCurrentProcess();
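A minimal sketch of what the corresponding managed declaration might look like, assuming Windows and that the function is exported from kernel32.dll. The EntryPoint value is spelled out only for illustration; it matches the method name, so it could be omitted.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class VirtualMemoryManager
{
    // Managed declaration of the Win32 function: HANDLE GetCurrentProcess();
    // There is no method body; DllImport names the DLL that implements it.
    // HANDLE is pointer-sized, so it maps to IntPtr.
    [DllImport("kernel32.dll", EntryPoint = "GetCurrentProcess")]
    internal static extern IntPtr GetCurrentProcess();
}
```

The DLL is not loaded until the method is first called, so a declaration like this compiles on any platform but can be invoked only where kernel32.dll exists.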
Assuming the developer has identified the targeted DLL and exported function, the most difficult step is identifying or creating the managed data types that correspond to the unmanaged types in the external function.1 Listing 23.2 shows a more difficult API.
VirtualAllocEx() allocates virtual memory that the operating system specifically designates for execution or data. To call it, you need corresponding definitions in managed code for each data type; although common in Win32 programming, HANDLE, LPVOID, SIZE_T, and DWORD are not defined in CLI managed code. The declaration in C# for VirtualAllocEx(), therefore, is shown in Listing 23.3.
One distinct characteristic of managed code is that primitive data types such as int do not change their size on the basis of the processor. Whether the processor is 32 or 64 bits, int is always 32 bits. In unmanaged code, however, memory pointers will vary depending on the processor. Therefore, instead of mapping types such as HANDLE and LPVOID simply to ints, you need to map to System.IntPtr, whose size will vary depending on the processor memory layout. This example also uses an AllocationType enum, which we discuss in the section “Simplifying API Calls with Wrappers” later in this chapter.
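The type mapping described above can be sketched as follows. This is an illustrative declaration under stated assumptions, not necessarily the chapter's exact listing: the class name VirtualAllocSketch is hypothetical, and the AllocationType enum here carries only two of the Win32 flag values.

```csharp
using System;
using System.Runtime.InteropServices;

// Subset of the Win32 allocation flags (MEM_COMMIT, MEM_RESERVE).
[Flags]
internal enum AllocationType : uint
{
    Commit = 0x1000,
    Reserve = 0x2000
}

internal static class VirtualAllocSketch
{
    [DllImport("kernel32.dll")]
    internal static extern IntPtr GetCurrentProcess();

    // Unmanaged signature:
    //   LPVOID VirtualAllocEx(HANDLE hProcess, LPVOID lpAddress,
    //       SIZE_T dwSize, DWORD flAllocationType, DWORD flProtect);
    // HANDLE, LPVOID, and SIZE_T are pointer-sized, so they map to IntPtr;
    // DWORD is always 32 bits, so it maps to uint (or a uint-based enum).
    [DllImport("kernel32.dll", SetLastError = true)]
    internal static extern IntPtr VirtualAllocEx(
        IntPtr hProcess,
        IntPtr lpAddress,
        IntPtr dwSize,
        AllocationType flAllocationType,
        uint flProtect);
}
```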
An interesting point to note about Listing 23.3 is that IntPtr is useful for more than just pointers—that is, it is useful for other things such as quantities. IntPtr does not mean just “pointer stored in an integer”; it also means “integer that is the size of a pointer.” An IntPtr need not contain a pointer but simply needs to contain something the size of a pointer. Lots of things are the size of a pointer but are not actually pointers.
C# 9.0 introduced two contextual keywords, nint and nuint, that represent native machine-size integers: nint is signed and nuint is unsigned, and each is either 32 or 64 bits wide depending on the running process. Internally, nint and nuint are implemented with System.IntPtr and System.UIntPtr. As of C# 11, they are simply aliases for those types.
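As a small illustration of the native-size integer keywords, the sketch below uses nint for pointer-sized arithmetic. The class and method names are hypothetical.

```csharp
using System;

public static class NativeIntegerDemo
{
    // nint supports ordinary integer arithmetic; the int operand is
    // implicitly widened to nint before the addition.
    public static nint Advance(nint address, int byteCount) =>
        address + byteCount;

    // nint and nuint are backed by System.IntPtr and System.UIntPtr.
    public static bool BackedByIntPtr() =>
        typeof(nint) == typeof(IntPtr) && typeof(nuint) == typeof(UIntPtr);

    // 4 in a 32-bit process, 8 in a 64-bit process.
    public static int SizeInBytes => IntPtr.Size;
}
```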
Frequently, unmanaged code uses pointers for pass-by-reference parameters. In these cases, P/Invoke doesn’t require that you map the data type to a pointer in managed code. Instead, you map the corresponding parameters to ref (or out, depending on whether the parameter is in/out or just out). In Listing 23.4, lpflOldProtect, whose data type is PDWORD, returns the “pointer to a variable that receives the previous access protection of the first page in the specified region of pages.”2
Although lpflOldProtect is documented as [out] (even though the signature doesn’t enforce it), the description also mentions that the parameter must point to a valid variable and not NULL. This inconsistency is confusing but commonly encountered. The guideline is to use ref rather than out for P/Invoke type parameters, since the callee can always ignore the data passed with ref, but the converse will not necessarily succeed.
The other parameters are virtually the same as VirtualAllocEx() except that lpAddress is the address returned from VirtualAllocEx(). In addition, flNewProtect specifies the exact type of memory protection: page execute, page read-only, and so on.
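A sketch of how the ref guideline might be applied to VirtualProtectEx(), assuming Windows and kernel32.dll. The class name ProtectionNative is hypothetical, and the protection value is left as a plain uint to keep the example short.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class ProtectionNative
{
    // Unmanaged signature:
    //   BOOL VirtualProtectEx(HANDLE hProcess, LPVOID lpAddress,
    //       SIZE_T dwSize, DWORD flNewProtect, PDWORD lpflOldProtect);
    // The PDWORD pointer parameter is declared as ref uint, so the
    // marshaler always passes the address of a valid variable.
    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    internal static extern bool VirtualProtectEx(
        IntPtr hProcess,
        IntPtr lpAddress,
        IntPtr dwSize,
        uint flNewProtect,
        ref uint lpflOldProtect);
}
```

Because the parameter is ref rather than out, the caller must initialize the variable before the call, which satisfies the requirement that the pointer refer to a valid variable.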
Some APIs involve types that have no corresponding managed type. Calling these APIs requires redeclaring the type in managed code. You declare the unmanaged COLORREF struct, for example, in managed code (see Listing 23.5).
Various Microsoft Windows color APIs use COLORREF to represent RGB colors (i.e., levels of red, green, and blue).
The key in the Listing 23.5 declaration is StructLayoutAttribute. By default, managed code can optimize the memory layouts of types, so layouts may not be sequential from one field to the next. To force sequential layouts so that a type maps directly and can be copied bit for bit (blitted) from managed to unmanaged code, and vice versa, you add the StructLayoutAttribute with the LayoutKind.Sequential enum value. (This is also useful when writing data to and from filestreams where a sequential layout may be expected.)
Since the unmanaged (C++) definition for struct does not map to the C# definition, there is no direct mapping of unmanaged struct to managed struct. Instead, developers should follow the usual C# guidelines about whether the type should behave like a value or a reference type, and whether the size is small (approximately less than 16 bytes).
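A minimal sketch of a sequentially laid out struct of this kind. The field names and the padding byte are assumptions made for illustration: a COLORREF value occupies 32 bits, with red in the lowest byte and the highest byte unused.

```csharp
using System.Runtime.InteropServices;

// Sequential layout keeps the fields in declaration order, so the
// four bytes line up with the 32-bit unmanaged value.
[StructLayout(LayoutKind.Sequential)]
public struct ColorRef
{
    public byte Red;
    public byte Green;
    public byte Blue;

    // Padding byte so the struct is exactly 4 bytes wide.
#pragma warning disable CS0414 // assigned but never read
    private byte _unused;
#pragma warning restore CS0414

    public ColorRef(byte red, byte green, byte blue)
    {
        Red = red;
        Green = green;
        Blue = blue;
        _unused = 0;
    }
}
```

At 4 bytes, the type is well within the size at which a value type is the usual choice.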
Often it can be helpful to declare struct types as buffers. With C# 12.0, a struct can be decorated so that it can be used as a fixed-size array. Inline arrays are implicitly convertible to Span<T> or ReadOnlySpan<T>, which provides an easy way to interact with the inline array. You can also access the elements of an inline array directly with the familiar indexer syntax, including ranges and indices, for both reading and writing. To create an inline array, declare a struct with a single field and decorate it with System.Runtime.CompilerServices.InlineArrayAttribute, specifying the length of the inline array in the attribute.
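The following sketch shows one way an inline array might be declared and used, assuming C# 12 and .NET 8 or later. The type and method names are hypothetical.

```csharp
using System;
using System.Runtime.CompilerServices;

// A struct with exactly one field; the attribute gives the element count.
[InlineArray(8)]
public struct EightBytes
{
    private byte _element0;
}

public static class InlineArrayDemo
{
    public static int FillAndSum()
    {
        EightBytes buffer = new();

        // Indexer syntax writes directly into the inline storage.
        for (int i = 0; i < 8; i++)
        {
            buffer[i] = (byte)(i + 1);
        }

        // Implicit conversion to Span<byte> over the same storage.
        Span<byte> span = buffer;
        int sum = 0;
        foreach (byte value in span)
        {
            sum += value;
        }
        return sum; // 1 + 2 + ... + 8 = 36
    }
}
```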
Since C# 1.0, local variables have been initialized to zero values. However, in some high-performance situations, these zero-initialized values are immediately overwritten, so the initialization is wasted work. C# 9.0 introduced SkipLocalsInitAttribute, which causes the compiler not to emit the localsinit CIL flag. The result is that locals may not be zero initialized. Accessing uninitialized data is discouraged because the behavior is undefined.
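A sketch of where skipping local initialization might pay off: a stack-allocated scratch buffer that is fully written before it is read. It assumes the project compiles with unsafe code enabled (AllowUnsafeBlocks), which the compiler requires for this attribute; the class and method names are hypothetical.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ScratchBuffer
{
    [SkipLocalsInit]
    public static int SumOfSquares(int count)
    {
        if (count < 0 || count > 64)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        // With SkipLocalsInit, this memory is NOT guaranteed to be zeroed.
        Span<int> scratch = stackalloc int[64];

        // Every element that is read below is written here first.
        for (int i = 0; i < count; i++)
        {
            scratch[i] = i * i;
        }

        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            sum += scratch[i];
        }
        return sum;
    }
}
```

Elements at index count and above are never read, so their uninitialized contents cannot affect the result.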
One inconvenient aspect of Win32 API programming is the fact that the APIs frequently report errors in inconsistent ways. For example, some APIs return a value (0, 1, false, and so on) to indicate an error, whereas others set an out parameter in some way. Furthermore, the details of what went wrong require additional calls to the GetLastError() API and then an additional call to FormatMessage() to retrieve an error message corresponding to the error. In summary, Win32 error reporting in unmanaged code seldom occurs via exceptions.
Fortunately, the P/Invoke designers provided a mechanism for error handling. To enable it, set the SetLastError named parameter of the DllImport attribute to true. You can then instantiate a System.ComponentModel.Win32Exception immediately following the P/Invoke call, and it is automatically initialized with the Win32 error data (see Listing 23.6).
This code enables developers to provide the custom error checking that each API uses while still reporting the error in a standard manner.
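The pattern can be sketched as follows, assuming Windows and kernel32.dll. The class name VirtualMemoryErrors is hypothetical, and the numeric constants are the Win32 values for MEM_COMMIT, MEM_RESERVE, and PAGE_EXECUTE_READWRITE.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public static class VirtualMemoryErrors
{
    private const uint MemCommit = 0x1000;
    private const uint MemReserve = 0x2000;
    private const uint PageExecuteReadWrite = 0x40;

    [DllImport("kernel32.dll")]
    private static extern IntPtr GetCurrentProcess();

    // SetLastError = true tells the runtime to capture the Win32
    // error code immediately after the call returns.
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr VirtualAllocEx(
        IntPtr hProcess, IntPtr lpAddress, IntPtr dwSize,
        uint flAllocationType, uint flProtect);

    public static IntPtr AllocExecutionBlock(int size)
    {
        IntPtr block = VirtualAllocEx(
            GetCurrentProcess(), IntPtr.Zero, (IntPtr)size,
            MemCommit | MemReserve, PageExecuteReadWrite);

        // This API signals failure by returning NULL.
        if (block == IntPtr.Zero)
        {
            // The parameterless constructor picks up the captured
            // error code and its message.
            throw new Win32Exception();
        }
        return block;
    }
}
```

The API-specific check (a NULL return here) stays in the wrapper, while callers see an ordinary exception.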
Listing 23.1 and Listing 23.3 declared the P/Invoke methods as internal or private. Except for the simplest of APIs, wrapping methods in public wrappers that reduce the complexity of the P/Invoke API calls is a good guideline that increases API usability and moves toward object-oriented type structure. The AllocExecutionBlock() declaration in Listing 23.6 provides a good example of this approach.
Frequently, P/Invoke involves a resource, such as a handle, that code needs to clean up after using. Instead of requiring developers to remember this step is necessary and manually code it each time, it is helpful to provide a class that implements IDisposable and a finalizer. In Listing 23.7, for example, the address returned after VirtualAllocEx() and VirtualProtectEx() requires a follow-up call to VirtualFreeEx(). To provide built-in support for this process, you define a VirtualMemoryPtr class that derives from System.Runtime.InteropServices.SafeHandle.
System.Runtime.InteropServices.SafeHandle includes the abstract members IsInvalid and ReleaseHandle(). You place your cleanup code in the latter; the former reports whether the handle is invalid, which determines whether the cleanup code still needs to run.
With VirtualMemoryPtr, you can allocate memory simply by instantiating the type and specifying the needed memory allocation.
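A condensed sketch of such a class, assuming Windows and kernel32.dll. It is an illustration of the SafeHandle pattern rather than the chapter's full listing; the constants are the Win32 values for MEM_COMMIT, MEM_RESERVE, MEM_RELEASE, and PAGE_EXECUTE_READWRITE.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public sealed class VirtualMemoryPtr : SafeHandle
{
    private const uint MemCommit = 0x1000;
    private const uint MemReserve = 0x2000;
    private const uint MemRelease = 0x8000;
    private const uint PageExecuteReadWrite = 0x40;

    private readonly IntPtr _processHandle;

    [DllImport("kernel32.dll")]
    private static extern IntPtr GetCurrentProcess();

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr VirtualAllocEx(
        IntPtr hProcess, IntPtr lpAddress, IntPtr dwSize,
        uint flAllocationType, uint flProtect);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool VirtualFreeEx(
        IntPtr hProcess, IntPtr lpAddress, IntPtr dwSize, uint dwFreeType);

    public VirtualMemoryPtr(int memorySize)
        : base(IntPtr.Zero, ownsHandle: true)
    {
        _processHandle = GetCurrentProcess();
        IntPtr block = VirtualAllocEx(
            _processHandle, IntPtr.Zero, (IntPtr)memorySize,
            MemCommit | MemReserve, PageExecuteReadWrite);
        if (block == IntPtr.Zero)
        {
            throw new Win32Exception();
        }
        SetHandle(block);
    }

    // A zero address means there is nothing to release.
    public override bool IsInvalid => handle == IntPtr.Zero;

    // Called by Dispose() or the finalizer, at most once, and only
    // when IsInvalid is false. MEM_RELEASE requires a size of zero.
    protected override bool ReleaseHandle() =>
        VirtualFreeEx(_processHandle, handle, IntPtr.Zero, MemRelease);
}
```

A caller would typically wrap the instance in a using statement, for example `using var block = new VirtualMemoryPtr(4096);`, so that the memory is released deterministically.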
Once you declare the P/Invoke functions, you invoke them just as you would any other class member. The key, however, is that the imported DLL must be in the path, including the executable directory, so that it can be successfully loaded. Listing 23.6 and Listing 23.7 demonstrate this approach. However, they rely on some constants.
Since flAllocationType and flProtect are flags, it is a good practice to provide constants or enums for each. Instead of expecting the caller to define these constants or enums, encapsulation suggests that you provide them as part of the API declaration, as shown in Listing 23.8.
The advantage of enums is that they group the related values together. Furthermore, a parameter typed as the enum makes clear that only those values are expected.
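A sketch of flag enums for the two parameters. The numeric values are the documented Win32 constants (the MEM_ and PAGE_ families), shown here as a subset; the name MemoryProtection is a hypothetical choice for illustration.

```csharp
using System;

// Values for flAllocationType (MEM_* constants).
[Flags]
public enum AllocationType : uint
{
    Commit = 0x1000,
    Reserve = 0x2000,
    Reset = 0x80000,
    TopDown = 0x100000,
    Physical = 0x400000
}

// Values for flProtect (PAGE_* constants).
[Flags]
public enum MemoryProtection : uint
{
    NoAccess = 0x01,
    ReadOnly = 0x02,
    ReadWrite = 0x04,
    Execute = 0x10,
    ExecuteRead = 0x20,
    ExecuteReadWrite = 0x40
}
```

Declaring the underlying type as uint matches the 32-bit DWORD parameters, so the enums can appear directly in the P/Invoke signature, and callers can combine values such as `AllocationType.Commit | AllocationType.Reserve`.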
Whether they are focused on error handling, structs, or constant values, one goal of effective API developers is to provide a simplified managed API that wraps the underlying Win32 API. For example, Listing 23.9 overloads VirtualFreeEx() with public versions that simplify the call.
One last key point related to P/Invoke is that function pointers in unmanaged code map to delegates in managed code. To set up a timer, for example, you would provide a function pointer that the timer could call back on, once it had expired. Specifically, you would pass a delegate instance that matches the signature of the callback.
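To illustrate the delegate-to-function-pointer mapping without depending on a particular native library, the sketch below converts a delegate to the raw pointer that unmanaged code would receive, then converts it back and invokes it. The names are hypothetical, and the Cdecl calling convention is an assumption; a real callback must use whatever convention the native API documents.

```csharp
using System;
using System.Runtime.InteropServices;

public static class CallbackDemo
{
    // The delegate signature must match the unmanaged callback signature.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate int BinaryOperation(int left, int right);

    private static int Add(int left, int right) => left + right;

    public static int InvokeThroughFunctionPointer(int left, int right)
    {
        BinaryOperation callback = Add;

        // This is the pointer value the marshaler hands to native code
        // when a delegate is passed as a P/Invoke argument.
        IntPtr functionPointer =
            Marshal.GetFunctionPointerForDelegate(callback);

        BinaryOperation roundTripped =
            Marshal.GetDelegateForFunctionPointer<BinaryOperation>(
                functionPointer);
        int result = roundTripped(left, right);

        // The delegate must stay alive for as long as native code
        // might call through the pointer.
        GC.KeepAlive(callback);
        return result;
    }
}
```

In a real timer or enumeration API, the delegate is usually stored in a field so that the garbage collector cannot reclaim it while the callback is still registered.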
C# 9.0 introduced an additional construct for storing pointers to functions. These function pointers can provide a performance improvement over using delegates to invoke methods. They are available only in unsafe code, which is covered in more detail later in this chapter.
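A minimal sketch of the function pointer syntax, assuming the project compiles with unsafe code enabled (AllowUnsafeBlocks). The class and method names are hypothetical.

```csharp
public static unsafe class FunctionPointerDemo
{
    private static int Multiply(int left, int right) => left * right;

    public static int Apply(int left, int right)
    {
        // delegate*<parameter types..., return type>; the address-of
        // operator can be applied only to a static method.
        delegate*<int, int, int> operation = &Multiply;

        // The call goes through the pointer directly; no delegate
        // object is allocated.
        return operation(left, right);
    }
}
```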
Given the idiosyncrasies of P/Invoke, there are several guidelines to aid in the process of writing such code.
________________________________________