By definition, a collection within .NET is a class that, at a minimum, implements IEnumerable. This interface is critical because implementing its single method, GetEnumerator(), is the minimum needed to support iterating over the collection.
Chapter 4 showed how to use a foreach statement to iterate over an array of elements. This syntax is simple and avoids the complication of having to know how many elements there are. The runtime does not directly support the foreach statement, however. Instead, the C# compiler transforms the code as described in this section.
Listing 15.3 demonstrates a simple foreach loop iterating over an array of integers and then printing out each integer to the console.
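A loop of the kind described here can be sketched as follows. The class name ForeachArraySketch, the method name PrintAll, and the array contents are illustrative assumptions rather than the original listing.

```csharp
using System;
using System.IO;

public static class ForeachArraySketch
{
    // Writes each element of a sample array on its own line and
    // returns the number of elements written.
    public static int PrintAll(TextWriter output)
    {
        int[] array = { 1, 2, 3, 4, 5, 6 };
        int written = 0;

        // No counter and no Length check appear in the source code.
        foreach (int item in array)
        {
            output.WriteLine(item);
            written++;
        }
        return written;
    }
}
```

Calling ForeachArraySketch.PrintAll(Console.Out) writes the six values to the console, one per line.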
From this code, the C# compiler creates a CIL equivalent of the for loop, as shown in Listing 15.4.
In this example, note that foreach relies on support for the Length property and the index operator ([]). With the Length property, the C# compiler can use the for statement to iterate through each element in the array.
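The length-and-index form that the compiler falls back on for arrays might be approximated in C# as shown next. This is a hand-written sketch of the idea, not actual compiler output, and the names LoweredArrayLoopSketch, Collect, tempArray, and counter are assumptions.

```csharp
using System.Collections.Generic;

public static class LoweredArrayLoopSketch
{
    // Visits every element by using only Length and the index operator,
    // and returns the elements in the order they were visited.
    public static List<int> Collect(int[] array)
    {
        var visited = new List<int>();

        int[] tempArray = array; // the array expression is evaluated once
        for (int counter = 0; counter < tempArray.Length; counter++)
        {
            int item = tempArray[counter];
            visited.Add(item);
        }
        return visited;
    }
}
```

Both Length and the index operator are required here, which is exactly what limits this approach to arrays and array-like types.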
Although the code shown in Listing 15.4 works well on arrays where the length is fixed and the index operator is always supported, not all types of collections have a known number of elements. Furthermore, many of the collection classes, including the Stack<T>, Queue<T>, and Dictionary<TKey, TValue> classes, do not support retrieving elements by index. Therefore, a more general approach of iterating over collections of elements is needed. The iterator pattern provides this capability. Assuming you can determine the first and next elements, knowing the count and supporting retrieval of elements by index are unnecessary.
The System.Collections.Generic.IEnumerator<T> and nongeneric System.Collections.IEnumerator interfaces are designed to enable the iterator pattern for iterating over collections of elements, rather than the length–index pattern shown in Listing 15.4. A class diagram of their relationships appears in Figure 15.1.
IEnumerator, which IEnumerator<T> derives from, includes three members. The first is bool MoveNext(). Using this method, you can move from one element within the collection to the next, while at the same time detecting when you have enumerated through every item. The second member, a read-only property called Current, returns the element currently in process. IEnumerator<T> redeclares Current with a return type of T (hiding the nongeneric, object-typed version), providing a type-specific implementation of it. With these two members of the collection class, it is possible to iterate over the collection by simply using a while loop, as demonstrated in Listing 15.5. (The third member, the Reset() method, frequently throws an exception rather than resetting anything. For example, the enumerators that the compiler generates for iterators throw a NotSupportedException, so Reset() should generally not be called. If you need to restart an enumeration, just create a fresh enumerator.)
In Listing 15.5, the MoveNext() method returns false when it moves past the end of the collection. This replaces the need to count elements while looping.
Listing 15.5 uses a System.Collections.Generic.Stack<T> as the collection type. Numerous other collection types exist; this is just one example. The key trait of Stack<T> is its design as a last in, first out (LIFO) collection. Notice that the type parameter T identifies the type of all items within the collection. Collecting one type of object within a collection is a key characteristic of a generic collection. The programmer must know the data type within the collection when adding, removing, or accessing items within the collection.
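The LIFO behavior and the role of the type parameter can be seen in a short sketch. The class and method names are assumptions made for illustration.

```csharp
using System.Collections.Generic;

public static class StackOrderSketch
{
    // Pushes 1, 2, 3 and returns the order in which the stack
    // hands the items back during enumeration.
    public static List<int> PushThenEnumerate()
    {
        // T is int, so only int values can be pushed.
        var stack = new Stack<int>();
        stack.Push(1);
        stack.Push(2);
        stack.Push(3);

        var seen = new List<int>();
        foreach (int number in stack)
        {
            seen.Add(number); // most recently pushed item comes first
        }
        return seen;
    }
}
```

PushThenEnumerate() returns 3, 2, 1: the last item pushed is the first one enumerated.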
The preceding example shows the gist of the C# compiler output, but it doesn’t actually compile that way because it omits two important details concerning the implementation: interleaving and error handling.
The problem with an implementation such as Listing 15.5 is that the collection itself must maintain a state indicator of the current element so that when MoveNext() is called, the next element can be determined. If two such loops interleaved (one foreach inside another, both using the same collection), they would share that single indicator, so each loop would affect the other. (The same is true of loops executed by multiple threads.)
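To make the interleaving problem concrete, consider a hypothetical collection that exposes MoveNext() and Current itself. SelfEnumeratingList is invented for this sketch and is not a .NET type.

```csharp
// Hypothetical collection that keeps the iteration state inside itself.
public class SelfEnumeratingList
{
    private readonly int[] _items;
    private int _index = -1; // one cursor, shared by every loop

    public SelfEnumeratingList(params int[] items)
    {
        _items = items;
    }

    public int Current => _items[_index];

    public bool MoveNext()
    {
        if (_index < _items.Length)
        {
            _index++;
        }
        return _index < _items.Length;
    }
}

public static class SharedCursorSketch
{
    // Nested loops over three items should visit 3 * 3 = 9 pairs.
    public static int CountPairs()
    {
        var numbers = new SelfEnumeratingList(1, 2, 3);
        int pairs = 0;
        while (numbers.MoveNext())     // outer loop
        {
            while (numbers.MoveNext()) // inner loop advances the same cursor
            {
                pairs++;
            }
        }
        return pairs;
    }
}
```

CountPairs() returns 2 rather than 9, because the inner loop runs the shared cursor to the end and leaves nothing for the outer loop.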
To overcome this problem, the collection classes do not support IEnumerator<T> and IEnumerator interfaces directly. Instead, as shown in Figure 15.1, there is a second interface, called IEnumerable<T>, whose only method is GetEnumerator(). The purpose of this method is to return an object that supports IEnumerator<T>. Instead of the collection class maintaining the state, a different class—usually a nested class, so that it has access to the internals of the collection—will support the IEnumerator<T> interface and will keep the state of the iteration loop. The enumerator is like a “cursor” or a “bookmark” in the sequence. You can have multiple bookmarks, and moving each of them enumerates over the collection independently of the others. Using this pattern, the C# equivalent of a foreach loop will look like the code shown in Listing 15.6.
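A sketch of this pattern, with each loop asking the collection for its own enumerator, appears next. The class and method names are assumptions, and the sketch deliberately ignores cleanup of the enumerators.

```csharp
using System.Collections.Generic;

public static class IndependentEnumeratorsSketch
{
    // Counts the pairs visited by two nested loops over the same stack,
    // where each loop uses its own enumerator ("bookmark").
    public static int CountPairs()
    {
        var stack = new Stack<int>();
        stack.Push(1);
        stack.Push(2);
        stack.Push(3);

        int pairs = 0;
        IEnumerator<int> outer = stack.GetEnumerator();
        while (outer.MoveNext())
        {
            // A fresh enumerator starts before the first element
            // and does not disturb the outer one.
            IEnumerator<int> inner = stack.GetEnumerator();
            while (inner.MoveNext())
            {
                pairs++;
            }
        }
        return pairs;
    }
}
```

With the state held in the enumerators rather than in the stack, CountPairs() returns 9, as nested loops over three items should.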
Given that the classes that implement the IEnumerator<T> interface maintain the state, that state sometimes needs to be cleaned up once the loop exits (because either all iterations have completed or an exception is thrown). To achieve this, the IEnumerator<T> interface derives from IDisposable. Enumerators that implement only the nongeneric IEnumerator do not necessarily implement IDisposable, but if they do, Dispose() will be called as well. This means that Dispose() is called after the foreach loop exits, regardless of how it exits. The C# equivalent of the final Common Intermediate Language (CIL) code, therefore, looks like Listing 15.7.
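The shape of that expansion can be approximated with a try/finally block. This is a hand-written sketch for a generic source, not the compiler's exact output, and the names ForeachWithDisposeSketch and Sum are assumptions.

```csharp
using System.Collections.Generic;

public static class ForeachWithDisposeSketch
{
    // Adds up the elements of source by driving the enumerator by hand.
    public static int Sum(IEnumerable<int> source)
    {
        int total = 0;
        IEnumerator<int> enumerator = source.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int number = enumerator.Current;
                total += number;
            }
        }
        finally
        {
            // Runs whether the loop finishes normally or throws.
            enumerator.Dispose();
        }
        return total;
    }
}
```

Because Dispose() sits in the finally block, the enumerator is released even if the loop body throws partway through.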
Notice that because the IDisposable interface is supported by IEnumerator<T>, the using statement can simplify the code in Listing 15.7 to that shown in Listing 15.8.
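Written with a using statement, the same loop might look like the following sketch. The names are again assumptions.

```csharp
using System.Collections.Generic;

public static class ForeachWithUsingSketch
{
    // Same behavior as a try/finally that calls Dispose(),
    // expressed with a using statement.
    public static int Sum(IEnumerable<int> source)
    {
        int total = 0;
        using (IEnumerator<int> enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                int number = enumerator.Current;
                total += number;
            }
        } // enumerator.Dispose() is called here
        return total;
    }
}
```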
However, recall that the CIL does not directly support the using keyword. For this reason, the code in Listing 15.7 is actually a more accurate C# representation of the foreach CIL code.
C# doesn’t require that IEnumerable/IEnumerable<T> be implemented to iterate over a data type using foreach. Rather, the compiler uses a concept known as duck typing: It looks for a GetEnumerator() method that returns a type with a Current property and a MoveNext() method. Duck typing matches members by name and shape rather than requiring that the type implement a particular interface. (The name “duck typing” comes from the whimsical idea that to be treated as a duck, the object must merely implement a Quack() method; it need not implement an IDuck interface.) If duck typing fails to find a suitable implementation of the enumerable pattern, the compiler checks whether the collection implements the IEnumerable<T> or IEnumerable interface. Furthermore, even if neither the method nor the interface exists, starting in C# 9.0, the compiler will look for an extension method that matches the GetEnumerator() signature and will use that if it is available.
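Both the pattern-based lookup and the C# 9.0 extension method lookup can be sketched with types that implement no enumeration interfaces at all. Countdown, CountdownEnumerator, and CountdownExtensions are hypothetical names invented for this example, and the second method requires C# 9.0 or later.

```csharp
using System.Collections.Generic;

// Enumerator shape only: a Current property and a MoveNext() method.
public class CountdownEnumerator
{
    private int _value;

    public CountdownEnumerator(int start)
    {
        _value = start + 1;
    }

    public int Current => _value;

    public bool MoveNext()
    {
        if (_value <= 0)
        {
            return false;
        }
        _value--;
        return _value > 0;
    }
}

// Enumerable shape only: a GetEnumerator() method, no interface.
public class Countdown
{
    private readonly int _start;

    public Countdown(int start)
    {
        _start = start;
    }

    public CountdownEnumerator GetEnumerator() => new CountdownEnumerator(_start);
}

public static class CountdownExtensions
{
    // C# 9.0 and later: an extension GetEnumerator() also satisfies foreach.
    public static CountdownEnumerator GetEnumerator(this int start) =>
        new CountdownEnumerator(start);
}

public static class DuckTypingSketch
{
    public static List<int> FromPattern()
    {
        var seen = new List<int>();
        foreach (int number in new Countdown(3))
        {
            seen.Add(number);
        }
        return seen;
    }

    public static List<int> FromExtension()
    {
        var seen = new List<int>();
        foreach (int number in 3) // resolved through the extension method
        {
            seen.Add(number);
        }
        return seen;
    }
}
```

Both methods return 3, 2, 1, even though neither Countdown nor int implements IEnumerable.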
Chapter 4 showed that the compiler prevents assignment of the foreach variable (number). As is demonstrated in Listing 15.7, an assignment to number would not change the collection element itself, so the C# compiler prevents such an assignment altogether.
In addition, neither the element count within a collection nor the items themselves can generally be modified during the execution of a foreach loop. If, for example, you called stack.Push(42) inside the foreach loop, it would be ambiguous whether the iterator should ignore or incorporate the change to stack. In other words, it would be unclear whether the iterator should iterate over the newly added item or ignore it and assume the same state as when it was instantiated.
Because of this ambiguity, an exception of type System.InvalidOperationException is generally thrown the next time the enumerator is advanced if the collection is modified within a foreach loop. This exception reports that the collection was modified after the enumerator was instantiated.
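This behavior can be observed with Stack<T> in a short sketch. The class and method names are assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class ModifyDuringForeachSketch
{
    // Returns true if pushing onto the stack inside a foreach loop
    // over that same stack causes an InvalidOperationException.
    public static bool PushInsideLoopThrows()
    {
        var stack = new Stack<int>();
        stack.Push(1);
        stack.Push(2);

        try
        {
            foreach (int number in stack)
            {
                stack.Push(42); // modifies the collection mid-iteration
            }
        }
        catch (InvalidOperationException)
        {
            // Raised when the loop tries to advance to the next element.
            return true;
        }
        return false;
    }
}
```

PushInsideLoopThrows() returns true: the first pass through the loop body succeeds, and the exception surfaces when the loop attempts to move to the next element.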