Multithreaded programming introduces significant complexity.
Furthermore, anytime a method is long running, multithreaded programming will probably be required—that is, invoking the long-running method asynchronously.5
However, as developers wrote more multithreaded code, a common set of scenarios and programming patterns for handling those scenarios emerged. And, to simplify the programming model, a new threading type System.Threading.Tasks.Task was introduced,6 which greatly enhanced the programmability of one such pattern—TAP7—by leveraging the TPL8 from .NET 4.0 and enhancing the C# language with new constructs to support it. This and the following section delve into the details of the TPL on its own and then the TPL with the async/await contextual keywords that simplify TAP programming.
Creating a thread is a relatively expensive operation, and each thread consumes a large amount (1 megabyte, by default, on Windows, for example) of virtual memory. It is likely more efficient to use a thread pool to allocate threads when needed, assign asynchronous work to the thread, run the work to completion, and then reuse the thread for subsequent asynchronous work rather than destroying the thread when the work is complete and creating a new one later.
In .NET Framework 4 and later, instead of creating an operating system thread each time asynchronous work is started, the TPL creates a Task and tells the task scheduler that there is asynchronous work to perform. A task scheduler might use many different strategies to fulfill this purpose, but by default it requests a worker thread from the thread pool. The thread pool might decide that it is more efficient to run the task later, after some currently executing tasks have completed, or it might decide to schedule the task’s worker thread on a particular processor. The thread pool determines whether it is more efficient to create an entirely new thread or to reuse an existing thread that previously finished executing.
By abstracting the concept of asynchronous work into the Task object, the TPL provides an object that represents asynchronous work and provides an object-oriented API for interacting with that work. Moreover, by providing an object that represents the unit of work, the TPL enables programmatically building up workflows by composing small tasks into larger ones, as we’ll see.
A task is an object that encapsulates work that executes asynchronously. This should sound familiar: A delegate is also an object that represents code. The difference between a task and a delegate is that delegates are synchronous and tasks are asynchronous. Executing a delegate—say, an Action—immediately transfers the point of control of the current thread to the delegate’s code; control does not return to the caller until the delegate is finished. By contrast, starting a task almost immediately returns control to the caller, no matter how much work the task must perform. The task executes asynchronously, typically on another thread (though, as we will discuss in Chapter 20, it is possible and even beneficial to execute tasks asynchronously with only one thread). A task essentially transforms a delegate from a synchronous to an asynchronous execution pattern.
You know when a delegate is done executing on the current thread because the caller cannot do anything until the delegate is done. But how do you know when a task is done, and how do you get the result, if there is one? Consider the example of turning a synchronous delegate into an asynchronous task. The worker thread writes hyphens to the console, while the main thread writes plus signs.
Starting the task obtains a thread from the thread pool, creating a second point of control, and executes the delegate on that thread. As shown in Listing 19.1, the point of control on the main thread continues normally after the call to start the task (Task.Run()).
The code that is to run in a new thread is defined in the delegate (of type Action in this case) passed to the Task.Run() method. This delegate (in the form of a lambda expression) prints out hyphens to the console repeatedly. The loop that follows the starting of the task is almost identical, except that it displays plus signs.
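Listing 19.1 is not reproduced here, but a minimal sketch of the pattern it describes might look like the following (the loop counts are illustrative assumptions):

```csharp
using System;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        // Start the asynchronous work: write hyphens on a thread pool thread.
        Task task = Task.Run(() =>
        {
            for (int i = 0; i < 1000; i++)
            {
                Console.Write('-');
            }
        });

        // Meanwhile, the main thread continues past Task.Run() and
        // writes plus signs, interleaving with the worker thread's output.
        for (int i = 0; i < 1000; i++)
        {
            Console.Write('+');
        }

        // Block the main thread until the task's work has completed.
        task.Wait();
    }
}
```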
Notice that following the call to Task.Run(), the Action passed as the argument immediately starts executing. The Task is said to be “hot,” meaning that it has already been triggered to start executing—as opposed to a “cold” task, which needs to be explicitly started before the asynchronous work begins.
Although a Task can also be instantiated in a cold state via the Task constructor, doing so is generally appropriate only as an implementation detail internal to an API that returns an already running (hot) Task, one triggered by a call to Task.Start().
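A sketch of the cold-task pattern follows; the computation is an illustrative placeholder:

```csharp
using System;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        // Create a "cold" task; the delegate does not run yet.
        Task<int> cold = new Task<int>(() => 21 * 2);

        // Explicitly trigger execution; the task is now "hot."
        cold.Start();

        // Blocks until the result is available.
        Console.WriteLine(cold.Result);
    }
}
```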
Notice that the exact state of a hot task is indeterminate immediately following the call to Run(). The behavior is determined by a combination of the operating system, its load, and the accompanying task library. The combination determines whether Run() chooses to execute the task’s worker thread immediately or delay it until additional resources are available. In fact, it is possible that the hot task may have already finished by the time the code on the calling thread gets its turn to execute again. The call to Wait() forces the main thread to wait until all the work assigned to the task has completed executing.
In this scenario, we have a single task, but it is also possible for many tasks to be running asynchronously. It is common to have a set of tasks where you want to wait for all of them to complete, or for any one of them to complete, before continuing execution of the current thread. The Task.WaitAll() and Task.WaitAny() methods, respectively, do just that.
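A brief sketch of both methods, with arbitrary delays standing in for real work:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        Task slow = Task.Run(() => Thread.Sleep(1000));
        Task fast = Task.Run(() => Thread.Sleep(10));

        // Returns the index (into the argument list) of the first
        // task to complete; which one that is depends on scheduling.
        int first = Task.WaitAny(slow, fast);
        Console.WriteLine($"Task {first} completed first.");

        // Blocks until every task has completed.
        Task.WaitAll(slow, fast);
    }
}
```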
So far, we’ve seen how a task can take an Action and run it asynchronously. But what if the work executed in the task returns a result? We can use the Task<T> type to run a Func<T> asynchronously. When executing a delegate synchronously, we know that control will not return until the result is available. When executing a Task<T> asynchronously, we can poll it from one thread to see if it is done, and fetch the result when it is.9 Listing 19.2 demonstrates how to do so in a console application. Note that this sample uses a PiCalculator.Calculate() method that we will delve into further in the section “Executing Loop Iterations in Parallel” in Chapter 21.
This listing shows that the data type of the task is Task<string>. The generic type includes a Result property from which to retrieve the value returned by the Func<string> that the Task<string> executes.
Note that Listing 19.2 does not make a call to Wait(). Instead, reading from the Result property automatically causes the current thread to block until the result is available, if it isn’t already; in this case, we know that it will already be complete when the result is fetched.
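The shape of Listing 19.2 can be sketched as follows; the delegate's return value here is a placeholder for the PiCalculator.Calculate() call in the actual listing:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        Task<string> task = Task.Run(() => "computed result");

        // One option: poll from the calling thread until the task is done...
        while (!task.IsCompleted)
        {
            Thread.Yield();
        }

        // ...though reading Result alone suffices: it blocks the current
        // thread until the value is available, if it isn't already.
        Console.WriteLine(task.Result);
    }
}
```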
In addition to the IsCompleted and Result properties on Task<T>, several others are worth noting:
We discuss other useful properties later in this chapter under “Canceling a Task.”
We’ve talked several times about the control flow of a program without ever saying what the most fundamental nature of control flow is: Control flow determines what happens next. When you have a simple control flow like Console.WriteLine(x.ToString());, the control flow tells you that when ToString completes normally, the next thing that will happen is a call to WriteLine with the value returned as the argument. The concept of “what happens next” is called continuation; each point in a control flow has a continuation. In our example, the continuation of ToString is WriteLine (and the continuation of WriteLine is whatever code runs in the next statement). The idea of continuation is so elementary to C# programming that most programmers don’t even think about it; it’s part of the invisible air that they breathe. The act of C# programming is the act of constructing continuation upon continuation until the control flow of the entire program is complete.
Notice that the continuation of a given piece of code in a normal C# program will be executed immediately upon the completion of that code. When ToString() returns, the point of control on the current thread immediately does a synchronous call to WriteLine. Notice also that there are actually two possible continuations of a given piece of code: the normal continuation and the exceptional continuation that will be executed if the current piece of code throws an exception.
Asynchronous method calls, such as starting a Task, add an additional dimension to the control flow. With an asynchronous Task invocation, the control flow goes immediately to the statement after the Task.Start(), while at the same time, it begins executing within the body of the Task delegate. In other words, what happens next when asynchrony is involved is multidimensional. Unlike with exceptions, where the continuation is just a different path, continuation is an additional, parallel path with asynchrony.
Asynchronous tasks also allow composition of larger tasks out of smaller tasks by describing asynchronous continuations. Just as with regular control flow, a task can have different continuations to handle error situations, and tasks can be melded together by manipulating their continuations. There are several techniques for doing so, the most explicit of which is the ContinueWith() method (see Listing 19.3 and its corresponding output, Output 19.1).
The ContinueWith() method enables “chaining” two tasks together, such that when the predecessor task—the antecedent task—completes, the second task—the continuation task—is automatically started asynchronously. In Listing 19.3, for example, Console.WriteLine("Starting...") is the antecedent task body and Console.WriteLine("Continuing A...") is its continuation task body. The continuation task takes a Task as its argument (antecedent), thereby allowing the continuation task’s code to access the antecedent task’s completion state. When the antecedent task is completed, the continuation task starts automatically, asynchronously executing the second delegate and passing the just-completed antecedent task as an argument to that delegate. Furthermore, since the ContinueWith() method returns a Task as well, that Task can be used as the antecedent of yet another Task, and so on, forming a continuation chain of Tasks that can be arbitrarily long.
If you call ContinueWith() twice on the same antecedent task (as Listing 19.3 shows with taskB and taskC representing continuation tasks for taskA), the antecedent task (taskA) has two continuation tasks, and when the antecedent task completes, both continuation tasks will be executed asynchronously. Notice that the order of execution of the continuation tasks from a single antecedent is indeterminate at compile time. Output 19.1 happens to show taskC executing before taskB, but in a second execution of the program, the order might be reversed. However, taskA will always execute before taskB and taskC because the latter are continuation tasks of taskA and therefore can’t start before taskA completes. Similarly, the Console.WriteLine("Starting...") delegate will always execute to completion before taskA (Console.WriteLine("Continuing A...")) because the latter is a continuation task of the former. Furthermore, Finished! will always appear last because of the call to Task.WaitAll(taskB, taskC) that blocks the control flow from continuing until both taskB and taskC complete.
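A sketch consistent with the description of Listing 19.3 follows; the exact delegate bodies in the listing may differ:

```csharp
using System;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        // taskA is the continuation of the antecedent started by Task.Run().
        Task taskA = Task.Run(() => Console.WriteLine("Starting..."))
            .ContinueWith(antecedent => Console.WriteLine("Continuing A..."));

        // Two continuations of the same antecedent (taskA).
        Task taskB = taskA.ContinueWith(
            antecedent => Console.WriteLine("Continuing B..."));
        Task taskC = taskA.ContinueWith(
            antecedent => Console.WriteLine("Continuing C..."));

        // Block until both continuations complete; their relative
        // order of execution is indeterminate.
        Task.WaitAll(taskB, taskC);
        Console.WriteLine("Finished!");
    }
}
```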
Many different overloads of ContinueWith() are possible, and some of them take a TaskContinuationOptions value to tweak the behavior of the continuation chain. These values are flags, so they can be combined using the logical OR operator (|). A brief description of some of the possible flag values appears in Table 19.1; see the online documentation11 for more details.
| Enum | Description |
|------|-------------|
| None | This is the default behavior. The continuation task will be executed when the antecedent task completes, regardless of the antecedent's task status. |
| PreferFairness | If two tasks are started asynchronously, one before the other, there is no guarantee that the one started first actually gets to run first. This flag asks the task scheduler to try to increase the likelihood that the first task started is the first task to execute, which is particularly relevant when the two tasks are created from different thread pool threads. |
| LongRunning | This tells the task scheduler that the task is likely to be an I/O-bound, high-latency task. The scheduler can then allow other queued work to be processed rather than starved by the long-running task. This option should be used sparingly. |
| AttachedToParent | This specifies that a task should attempt to attach to a parent task within the task hierarchy. |
| DenyChildAttach (.NET 4.5) | This throws an exception if creation of a child task is attempted. If code within the continuation tries to use AttachedToParent, it will behave as though there were no parent. |
| NotOnRanToCompletion* | This specifies that the continuation task should not be scheduled if its antecedent ran to completion. This option is not valid for multitask continuations. |
| NotOnFaulted* | This specifies that the continuation task should not be scheduled if its antecedent threw an unhandled exception. This option is not valid for multitask continuations. |
| OnlyOnCanceled* | This specifies that the continuation task should be scheduled only if its antecedent was canceled. This option is not valid for multitask continuations. |
| NotOnCanceled* | This specifies that the continuation task should not be scheduled if its antecedent was canceled. This option is not valid for multitask continuations. |
| OnlyOnFaulted* | This specifies that the continuation task should be scheduled only if its antecedent threw an unhandled exception. This option is not valid for multitask continuations. |
| OnlyOnRanToCompletion* | This specifies that the continuation task should be scheduled only if its antecedent ran to completion. This option is not valid for multitask continuations. |
| ExecuteSynchronously | This specifies that the continuation task should be executed synchronously: the scheduler attempts to run the continuation's work on the same thread that causes the antecedent task to transition into its final state. If the antecedent is already complete when the continuation is created, the continuation runs on the thread creating it. |
| HideScheduler (.NET 4.5) | This prevents the ambient scheduler from being seen as the current scheduler in the created task. As a result, operations such as Run/StartNew and ContinueWith performed within the created task see TaskScheduler.Default (null) as the current scheduler. This is useful when the continuation should run on a particular scheduler but calls out to additional code that should not schedule work on that same scheduler. |
| LazyCancellation (.NET 4.5) | This causes the continuation to delay monitoring the supplied cancellation token for a cancellation request until the antecedent has completed. Consider tasks t1, t2, and t3, where t2 is a continuation of t1 and t3 is a continuation of t2. If t2 is canceled before t1 completes, it is possible that t3 could start before t1 completes; setting LazyCancellation avoids this. |
| RunContinuationsAsynchronously (.NET 4.6) | Creating a task with this option tells the task to force its continuations to run asynchronously. Even if the task is itself a continuation, this option does not affect how that task runs, only how continuations from it run. A continuation task can be created with both TaskContinuationOptions.ExecuteSynchronously and TaskContinuationOptions.RunContinuationsAsynchronously: the former causes the continuation to execute synchronously when its antecedent completes, while the latter causes the continuation's own continuations to run asynchronously when it completes. |
In Table 19.1, the items denoted with a star (*) indicate under which conditions the continuation task will be executed; thus they are particularly useful for creating continuations that act like event handlers for the antecedent task’s behavior. Listing 19.4 demonstrates how an antecedent task can be given multiple continuations that execute conditionally, depending on how the antecedent task completed.
In this listing, we effectively register listeners for events on the antecedent’s task so that when the task completes normally or abnormally, the particular “listening” task will begin executing. This is a powerful capability, particularly if the original task is a fire-and-forget task—that is, a task that we start, hook up to continuation tasks, and then never refer to again.
In Listing 19.4, notice that the final Wait() call is on completedTask, not on task—the original antecedent task created with Task.Run(). Although each delegate’s antecedentTask is a reference to the antecedent task (task), from outside the delegate listeners we can effectively discard the reference to the original task. We can then rely solely on the continuation tasks that begin executing asynchronously without any need for follow-up code that checks the status of the original task.
In this case, we call completedTask.Wait() so that the main thread does not exit the program before the completed output appears (see Output 19.2).
In this case, invoking completedTask.Wait() is somewhat contrived because we know that the original task will complete successfully. However, invoking Wait() on canceledTask or faultedTask will result in an exception. Those continuation tasks run only if the antecedent task is canceled or throws an exception; given that neither will happen in this program, those tasks will never be scheduled to run, and waiting for them to complete would throw an exception. The continuation options in Listing 19.4 happen to be mutually exclusive, so when the antecedent task runs to completion and the task associated with completedTask executes, the task scheduler automatically cancels the tasks associated with canceledTask and faultedTask. The canceled tasks end with their state set to Canceled. Therefore, calling Wait() (or any other invocation that would cause the current thread to wait for a task's completion) on either of these tasks will throw an exception indicating that they were canceled. A less contrived approach is to call Task.WaitAny(completedTask, canceledTask, faultedTask), which will throw an AggregateException that then needs to be handled.
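A sketch of the conditional-continuation pattern described for Listing 19.4 follows; the antecedent's delegate body is an illustrative stand-in:

```csharp
using System;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        Task<string> task = Task.Run(() => "Success!");

        // Runs only if the antecedent completes normally.
        Task completedTask = task.ContinueWith(
            antecedentTask => Console.WriteLine(antecedentTask.Result),
            TaskContinuationOptions.OnlyOnRanToCompletion);

        // Runs only if the antecedent is canceled.
        Task canceledTask = task.ContinueWith(
            antecedentTask => Console.WriteLine("Canceled"),
            TaskContinuationOptions.OnlyOnCanceled);

        // Runs only if the antecedent throws an unhandled exception.
        Task faultedTask = task.ContinueWith(
            antecedentTask => Console.WriteLine(
                antecedentTask.Exception!.InnerException!.Message),
            TaskContinuationOptions.OnlyOnFaulted);

        // Wait only on the continuation we know will run; waiting on the
        // other two would throw, because they end in the Canceled state.
        completedTask.Wait();
    }
}
```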
When calling a method synchronously, we can wrap it in a try block with a catch clause to identify to the compiler which code we want to execute when an exception occurs. This does not work with an asynchronous call, however. We cannot simply wrap a try block around a call to Start() to catch an exception, because control immediately returns from the call, and control will then leave the try block, possibly long before the exception occurs on the worker thread. One solution is to wrap the body of the task delegate with a try/catch block. Exceptions thrown on and subsequently caught by the worker thread will consequently not present problems, as a try block will work normally on the worker thread. This is not the case, however, for unhandled exceptions—those that the worker thread does not catch.
Generally (starting with version 2.012 of the CLR), unhandled exceptions on any thread are treated as fatal, trigger the operating system error reporting dialog, and cause the application to terminate abnormally. All exceptions on all threads must be caught; if they are not, the application is not allowed to continue to run. (For some advanced techniques for dealing with unhandled exceptions, see the upcoming "Advanced Topic: Dealing with Unhandled Exceptions on a Thread.") Fortunately, this is not the case for unhandled exceptions in an asynchronously running task. In such a case, the task scheduler inserts a catchall exception handler around the delegate so that if the task throws an otherwise unhandled exception, the catchall handler will catch it and record the details of the exception in the task, preventing the CLR from automatically terminating the process.
As we saw in Listing 19.4, one technique for dealing with a faulted task is to explicitly create a continuation task that is the fault handler for that task; the task scheduler will automatically schedule the continuation when it detects that the antecedent task threw an unhandled exception. If no such handler is present, however, and Wait() (or an attempt to get the Result) executes on a faulted task, an AggregateException will be thrown (see Listing 19.5 and Output 19.3).
The aggregate exception is so called because it may contain many exceptions collected from one or more faulted tasks. Imagine, for example, asynchronously executing ten tasks in parallel and five of them throwing exceptions. To report all five exceptions and have them handled in a single catch block, the framework uses the AggregateException as a means of collecting the exceptions and reporting them as a single exception. Furthermore, since it is unknown at compile time whether a worker task will throw one or more exceptions, an unhandled faulted task will always throw an AggregateException. Listing 19.5 and Output 19.3 demonstrate this behavior: even though the unhandled exception thrown on the worker thread was of type InvalidOperationException, the exception caught on the main thread is still an AggregateException, and catching it requires an AggregateException catch block.
A list of the exceptions contained within an AggregateException is available from the InnerExceptions property. As a result, you can iterate over this property to examine each exception and determine the appropriate course of action. Alternatively, and as shown in Listing 19.5, you can use the AggregateException.Handle() method, specifying an expression to execute against each individual exception contained within the AggregateException. One important characteristic of the Handle() method to consider, however, is that it is a predicate. As such, the predicate should return true for any exceptions that the Handle() delegate successfully addresses. If any exception handling invocation returns false for an exception, the Handle() method will throw a new AggregateException that contains the composite list of such corresponding exceptions.
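The Handle() pattern can be sketched as follows; the exception message is an illustrative assumption:

```csharp
using System;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        Task task = Task.Run(() =>
        {
            throw new InvalidOperationException("Operation failed");
        });

        try
        {
            task.Wait();
        }
        catch (AggregateException exception)
        {
            // Handle() invokes the predicate once per inner exception.
            // Returning false for any exception causes Handle() to throw
            // a new AggregateException containing those exceptions.
            exception.Handle(innerException =>
            {
                Console.WriteLine(innerException.Message);
                return innerException is InvalidOperationException;
            });
        }
    }
}
```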
You can also observe the state of a faulted task without causing the exception to be rethrown on the current thread by simply looking at the Exception property of the task. Listing 19.6 demonstrates this approach by waiting for the completion of a fault continuation of a task13 that we know will throw an exception.
Notice that to retrieve the unhandled exception on the original task, we use the Exception property (as well as dereferencing with the null-forgiveness operator, because we know the value will not be null). The result is output identical to Output 19.3.
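A sketch of observing the fault via the Exception property, along the lines of Listing 19.6 (the exception message is illustrative):

```csharp
using System;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        Task task = Task.Run(() =>
        {
            throw new InvalidOperationException("Operation failed");
        });

        // A fault continuation lets us observe the exception via the
        // antecedent's Exception property without it being rethrown here.
        Task faultHandler = task.ContinueWith(
            antecedentTask => Console.WriteLine(
                antecedentTask.Exception!.InnerException!.Message),
            TaskContinuationOptions.OnlyOnFaulted);

        // Safe to wait on: the continuation itself completes normally.
        faultHandler.Wait();
    }
}
```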
If an exception that occurs within a task goes entirely unobserved—that is, (1) it isn’t caught from within the task; (2) the completion of the task is never observed, via Wait(), Result, or accessing the Exception property, for example; and (3) the faulted ContinueWith() is never observed—then the exception is likely to go unhandled entirely, resulting in a process-wide unhandled exception. In .NET 4.0, such a faulted task would get rethrown by the finalizer thread and likely crash the process. In contrast, in .NET 4.5, the crashing has been suppressed (although the CLR can be configured for the crashing behavior if preferred).
In either case, you can register for an unhandled task exception via the TaskScheduler.UnobservedTaskException event.
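Registering for the event might look like the following sketch. Note that the event fires only when the faulted task is finalized, so a demonstration typically has to force garbage collection, and even then the timing is not guaranteed:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        TaskScheduler.UnobservedTaskException += (sender, eventArgs) =>
        {
            // Mark the exception as observed so it does not escalate
            // (escalation crashes the process on .NET 4.0).
            eventArgs.SetObserved();
            foreach (Exception exception in eventArgs.Exception.InnerExceptions)
            {
                Console.WriteLine($"Unobserved: {exception.Message}");
            }
        };

        // Discard the Task reference so the exception goes unobserved.
        _ = Task.Run(() => throw new InvalidOperationException("Never observed"));

        Thread.Sleep(100);
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}
```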
As we discussed earlier, an unhandled exception on any thread by default causes the application to shut down. An unhandled exception is a fatal, unexpected bug, and the exception may have occurred because a crucial data structure is corrupt. Since you have no idea what the program could possibly be doing, the safest thing to do is to shut down the whole thing immediately.
Ideally, no programs would ever throw unhandled exceptions on any thread; programs that do so have bugs, and the best course of action is to find and fix the bug before the software is shipped to customers. However, rather than shutting down an application as soon as possible when an unhandled exception occurs, it is often desirable to save any working data and/or log the exception for error reporting and future debugging. This requires a mechanism to register notifications of unhandled exceptions.
With both the Microsoft .NET Framework and .NET Core 2.0 (or later), every AppDomain provides such a mechanism, and to observe the unhandled exceptions that occur in an AppDomain, you must add a handler to the UnhandledException event. The UnhandledException event will fire for all unhandled exceptions on threads within the application domain, whether it is the main thread or a worker thread. Note that the purpose of this mechanism is notification; it does not permit the application to recover from the unhandled exception and continue executing. After the event handlers run, the application will display the operating system’s error reporting dialog, and then the application will exit. (For console applications, the exception details will also appear on the console.)
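A minimal sketch of registering such a handler follows; running it terminates the process by design, since the handler is notification only:

```csharp
using System;
using System.Threading;

public class Program
{
    public static void Main()
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, eventArgs) =>
        {
            // ExceptionObject is typed as object; it is normally an Exception.
            Exception? exception = eventArgs.ExceptionObject as Exception;
            Console.WriteLine($"Unhandled exception: {exception?.Message}");
            // Notification only: after the handlers run, the process
            // still terminates.
        };

        Thread thread = new Thread(() =>
        {
            throw new InvalidOperationException("Crash from worker thread");
        });
        thread.Start();
        thread.Join();
    }
}
```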
In Listing 19.7, we show how to create a second thread that throws an exception, which is then handled by the application domain’s unhandled exception event handler. For demonstration purposes, to ensure that thread timing issues do not come into play, we insert some artificial delays using Thread.Sleep. Output 19.4 shows the results.
As you can see in Output 19.4, the new thread is assigned thread ID 3 and the main thread is assigned thread ID 1. The operating system schedules thread 3 to run for a while; it throws an unhandled exception, the event handler is invoked, and it goes to sleep. Soon thereafter, the operating system realizes that thread 1 can be scheduled, but its code immediately puts it to sleep. Thread 1 wakes up first and runs the finally block, and then 2 seconds later thread 3 wakes up, and the unhandled exception finally crashes the process.
This sequence of events—the event handler executing, and the process crashing after it is finished—is typical but not guaranteed. The moment there is an unhandled exception in your program, all bets are off; the program is now in an unknown and potentially very unstable state, so its behavior can be unpredictable. In this case, as you can see, the CLR allows the main thread to continue running and executing its finally block, even though it knows that by the time control gets to the finally block, another thread is in the AppDomain’s unhandled exception event handler.
To emphasize this fact, try changing the delays so that the main thread sleeps longer than the event handler. In that scenario, the finally block will never execute! The process will be destroyed by the unhandled exception before thread 1 wakes up. You can also get different results depending on whether the exception-throwing thread is or is not created by the thread pool. The best practice, therefore, is to avoid all possible unhandled exceptions, whether they occur in worker threads or in the main thread.
How does this pertain to tasks? What if there are unfinished tasks hanging around the system when you want to shut it down? We look at task cancellation in the next section.
________________________________________