
When programming in any language, it’s important to know the full suite of tools, their caveats, and their hidden benefits. One tool in C# that often goes overlooked is the enumerable, along with its partner the iterator (called an enumerator in C#). These tools are incredibly powerful and can make code more concise and flexible, but they can also cause some serious problems down the line.
First, let’s establish some terms:
Enumerables
As the name implies, these are objects which can be enumerated. This means that an enumerable contains a collection or a series of objects. Despite the similar name, an enumerable is not the same thing as an enum, which is a fixed set of named constants of a uniform type.
Tip: enum types do not implement IEnumerable, but you can get an enumerable of their values with Enum.GetValues(typeof(MyEnumType)).
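As a quick illustration of the tip above, here is a minimal sketch that loops over the values of an enum. The Season type is made up for this example; Enum.GetValues returns an Array, which is why foreach accepts it.

```csharp
using System;

// Hypothetical enum, used only for this example.
public enum Season { Spring, Summer, Autumn, Winter }

public static class EnumValuesDemo
{
    public static void Main()
    {
        // Enum.GetValues returns an Array, and arrays implement IEnumerable,
        // so the result can be used directly in a foreach loop.
        foreach (Season season in Enum.GetValues(typeof(Season)))
        {
            Console.WriteLine(season);
        }
    }
}
```

Run as written, this should print the four season names in declaration order.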
Enumerables in C# are represented by the interfaces IEnumerable and IEnumerable<T>, which have only a single method: GetEnumerator. We’ll talk about enumerators in a moment, but basically this method says “Give me an object which will iterate over this enumerable”. Common types which implement IEnumerable are List<T> and built-in arrays.
Iterators
Although represented as the IEnumerator interface in C#, I prefer to use the term iterator because the other terms are so similar to each other. An iterator is an object which provides A) the current object that is being inspected (Current) and B) a method to continue to the next object (MoveNext), which returns true while there is another object to inspect and false once it has moved past the end.
Iterators are analogous to using a traditional for loop:
int[] array = { 1, 2, 3 };
for (int i = 0; i < array.Length; i++)
{
    int current = array[i];
}
Each expression in this traditional for loop maps to a step in using an IEnumerator:
- int i = 0: Declare the iterator (GetEnumerator() on the enumerable)
- i < array.Length: Are there any more values to inspect? (the value returned by MoveNext())
- i++: Move to the next element (MoveNext())
- int current = array[i]: Inspect the current element (Current)
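The other member that IEnumerator declares is Reset(), which is meant to send the iterator back to the start so that it can be run again. In practice it is not always supported: an implementation is allowed to throw NotSupportedException instead, and the iterators that the compiler generates for you (covered later in this article) do exactly that.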
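To make the iterator members described above concrete, here is a minimal hand-written sketch. ArrayIterator is a made-up name for this example, not a type from .NET; it walks an int array the same way the for loop does.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical iterator type, written by hand for illustration.
public sealed class ArrayIterator : IEnumerator<int>
{
    private readonly int[] _items;
    private int _index = -1; // starts positioned before the first element

    public ArrayIterator(int[] items) { _items = items; }

    // The element currently being inspected (array[i] in the for loop).
    public int Current { get { return _items[_index]; } }

    // The non-generic interface needs its own Current, which boxes the int.
    object IEnumerator.Current { get { return Current; } }

    // Plays the role of both i++ and i < length:
    // advance, then report whether an element is available.
    public bool MoveNext()
    {
        if (_index < _items.Length) _index++;
        return _index < _items.Length;
    }

    // Go back to the position before the first element.
    public void Reset() { _index = -1; }

    // IEnumerator<T> also requires Dispose; there is nothing to release here.
    public void Dispose() { }
}

public static class ArrayIteratorDemo
{
    public static void Main()
    {
        var iterator = new ArrayIterator(new[] { 10, 20, 30 });
        while (iterator.MoveNext())
        {
            Console.WriteLine(iterator.Current);
        }
    }
}
```

Note that Current is only meaningful after MoveNext has returned true; this sketch does not guard against reading it too early.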
Foreach
In .NET/Mono, enumerables and iterators are integrated with several language features. This makes C# a great tool for software that involves handling sets and subsets of objects (read: most applications!).
One common example of using iterators is the foreach syntax. It might not be immediately obvious, but you can foreach over any object that implements the IEnumerable interface.
int[] array = { 1, 2, 3 };
foreach (int current in array)
{
    // current holds each element in turn
}
Although this example takes up the same vertical space as the for loop example above, it is easier to understand and less prone to mistakes. Conceptually, this syntax compiles to roughly the following:
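int[] array = { 1, 2, 3 };
// Calling GetEnumerator() directly on an array returns the non-generic IEnumerator,
// so cast to IEnumerable<int> first to get the generic one.
IEnumerator<int> iterator = ((IEnumerable<int>)array).GetEnumerator();
while (iterator.MoveNext())
{
    int current = iterator.Current;
}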
Notice the GetEnumerator() call. This is the one method that IEnumerable requires, and the built-in array implements it.
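To show that foreach really does work with any type that implements IEnumerable, here is a minimal sketch of a custom collection. Playlist is a made-up class for this example; it simply hands out the enumerator of the list it wraps rather than writing its own.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection type, used only for illustration.
public sealed class Playlist : IEnumerable<string>
{
    private readonly List<string> _songs = new List<string>();

    public void Add(string song) { _songs.Add(song); }

    // Hand out the inner list's enumerator instead of writing one by hand.
    public IEnumerator<string> GetEnumerator() { return _songs.GetEnumerator(); }

    // IEnumerable<T> extends the non-generic IEnumerable, so it must be implemented too.
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class PlaylistDemo
{
    public static void Main()
    {
        var playlist = new Playlist();
        playlist.Add("First song");
        playlist.Add("Second song");

        // foreach calls playlist.GetEnumerator() behind the scenes.
        foreach (string song in playlist)
        {
            Console.WriteLine(song);
        }
    }
}
```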
Auto-Generated Iterators
One of my favorite features in C# is auto-generated iterators. The IEnumerator and IEnumerable interfaces are very powerful, but require a fair amount of boilerplate code to express simple ideas. As is the case with many of the language features in C#, auto-generated iterators allow you to skip writing the boilerplate and keep the code you write focused on the functionality you are trying to express.
Auto-generated iterators can be declared simply by writing a method that returns IEnumerable or IEnumerator and includes the yield syntax. To use the result directly in a foreach, the method needs to return IEnumerable:
public IEnumerable GetCoolestAnimals()
{
    yield return "Dog";
    yield return "Cat";
    yield return "Platypus";
    yield break;
}
Because this returns an enumerable, it can be used to iterate over all returned values:
foreach (object animal in GetCoolestAnimals())
{
}
One thing you might notice in the above examples is that the variable in the foreach is typed as object rather than string. Even though GetCoolestAnimals only ever yields strings, the non-generic interfaces only deal in objects. Luckily, there are generic versions of both IEnumerator and IEnumerable:
public IEnumerable<string> GetCoolestAnimals()
{
    yield return "Dog";
    yield return "Cat";
    yield return "Platypus";
    yield break;
}
public void PrintCoolestAnimals()
{
    // Prints:
    // Dog
    // Cat
    // Platypus
    foreach (string animal in GetCoolestAnimals())
    {
        Console.WriteLine(animal);
    }
}
Voila!
Tip: When you write your foreach, you can specify the element type even if the enumerable is not generic, but beware that this performs a runtime cast and will throw an InvalidCastException if the iterator yields an object that cannot be cast to the specified type.
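Here is a small sketch of the runtime cast described in the tip above. It uses ArrayList, a non-generic collection from System.Collections, and deliberately mixes in an int so the cast fails partway through.

```csharp
using System;
using System.Collections;

public static class ForeachCastDemo
{
    public static void Main()
    {
        // ArrayList is non-generic, so it can hold values of mixed types.
        var items = new ArrayList { "Dog", "Cat", 42 };

        try
        {
            // Each element is cast to string as the loop reaches it.
            foreach (string item in items)
            {
                Console.WriteLine(item);
            }
        }
        catch (InvalidCastException)
        {
            // Reached when the loop gets to 42, which is not a string.
            Console.WriteLine("Found an element that is not a string.");
        }
    }
}
```

The first two elements print normally; the exception only appears once the loop reaches the mismatched element, which is why this kind of bug can hide until late in a run.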
Tip: In the above examples, yield break; is analogous to return; and immediately exits the iterator.
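In the animal examples yield break; sits at the very end, where it has no visible effect. The sketch below puts it in the middle of a loop instead; TakeUntilNegative is a made-up helper for this example.

```csharp
using System;
using System.Collections.Generic;

public static class YieldBreakDemo
{
    // Hypothetical helper: yields values until the first negative number.
    public static IEnumerable<int> TakeUntilNegative(IEnumerable<int> values)
    {
        foreach (int value in values)
        {
            if (value < 0)
            {
                // End the sequence here; the remaining values are never visited.
                yield break;
            }
            yield return value;
        }
    }

    public static void Main()
    {
        // Prints 3 and 1, then stops at -5 without ever reaching 7.
        foreach (int value in TakeUntilNegative(new[] { 3, 1, -5, 7 }))
        {
            Console.WriteLine(value);
        }
    }
}
```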
Warning: Memory Concerns
When using enumerables and iterators, it’s important to understand the downsides. .NET/Mono are garbage-collected runtimes, which means that memory doesn’t have to be manually de-allocated, but performance problems can occur when you allocate and discard too much memory. This is particularly a concern in high-performance applications like video games or VR.
IEnumerator and IEnumerable are interfaces, which means that they can be implemented by a variety of types. Most of the built-in collections like List, Dictionary, or Hashtable implement IEnumerable so that they can be iterated over simply with foreach. However, it’s important to note that the iterator is a separate object from the enumerable, and a new one must be created each time the foreach syntax is used. This is very evident when you write your own IEnumerable implementations, because you are in charge of manually creating the iterator and returning it. These generally live on the heap because an IEnumerator object could be stored and referenced outside the scope of the function call.
Of course, .NET/Mono have taken steps to prevent this from causing performance problems when using foreach with their built-in collections (List<T>, for example, hands back a struct enumerator that does not need its own heap allocation), but this is not the case for auto-generated iterators. Not only does an iterator need to be created fresh each time you call the method, it also has a much larger memory footprint than just an index and a reference to the collection. Every local variable that you declare in the body of an auto-generated iterator will actually live as a member variable on the iterator object. Behind the scenes, these iterators are compiled into classes which implement the necessary boilerplate and store the state your iterator needs to work. This can produce some very big memory problems if used heavily, so if an iterator method is called frequently, consider caching what it produces rather than creating a new iterator every time.
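The sketch below illustrates the allocation behavior described above and one possible way to work around it. The caching approach shown (copying the results into a List<string> once) is my own suggestion for illustration, and it only makes sense when the sequence is small and does not change between calls.

```csharp
using System;
using System.Collections.Generic;

public static class IteratorAllocationDemo
{
    public static IEnumerable<string> GetCoolestAnimals()
    {
        yield return "Dog";
        yield return "Cat";
        yield return "Platypus";
    }

    public static void Main()
    {
        // Every call builds a new compiler-generated object on the heap.
        IEnumerable<string> first = GetCoolestAnimals();
        IEnumerable<string> second = GetCoolestAnimals();
        Console.WriteLine(object.ReferenceEquals(first, second)); // False

        // One possible workaround: run the iterator once and keep the results.
        List<string> cached = new List<string>(GetCoolestAnimals());

        // Looping over the cached list does not create another iterator object
        // from GetCoolestAnimals, no matter how often this loop runs.
        foreach (string animal in cached)
        {
            Console.WriteLine(animal);
        }
    }
}
```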
Further Reading
For the sake of brevity, I’ve had to leave out a few topics that are related to iterators so I will post some quick summaries for your further reading. The world of the C# language and .NET/Mono runtimes is wide, so I can only cover so much at a time. If you are interested in reading more about this topic, please send me a message and I’ll write up some more!
- LINQ - Query syntax similar to SQL for working with enumerables.
- LINQ Extensions - A set of extension methods that make it easier to filter, transform, and combine enumerables.
- Async - Newer built-in syntax that lets you await the completion of a task before continuing.
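As a small taste of the first two items in the list above, here is a sketch that filters and transforms the same animal list twice, once with query syntax and once with the extension methods from System.Linq. The list contents and variable names are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqDemo
{
    public static void Main()
    {
        var animals = new List<string> { "Dog", "Cat", "Platypus" };

        // Query syntax: reads a little like SQL.
        IEnumerable<string> viaQuery =
            from animal in animals
            where animal.Length <= 3
            select animal.ToUpperInvariant();

        // Extension-method syntax: the same filter and transform as chained calls.
        IEnumerable<string> viaMethods = animals
            .Where(animal => animal.Length <= 3)
            .Select(animal => animal.ToUpperInvariant());

        // Both sequences contain DOG and CAT.
        foreach (string name in viaQuery)
        {
            Console.WriteLine(name);
        }
        foreach (string name in viaMethods)
        {
            Console.WriteLine(name);
        }
    }
}
```

Both forms return an IEnumerable<string>, so everything said above about foreach and iterators applies to the results.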