Reading Microsoft documentation when we have all those Stack Overflow questions with answers usually seems like a waste of time. And as a typical, lazy programmer I admit that (much too often) I agree with that sentence. But today I decided to check how Enumerable.Empty<T>() works and wasn’t satisfied with the knowledge I got. I started digging a little bit deeper and wasted some of my precious beauty-sleep time, so if you want to know what happens when you add an element to an empty collection, let me tell you the story of tonight.
No, not that story, you sick head!
Enumerable.Empty<T>() – what is that?
This is just another way of assigning an empty collection to an IEnumerable type: as a starting value, as a value passed to a method, or something else. Not a big deal ;).
List<int> numbers1 = Enumerable.Empty<int>().ToList();
IEnumerable<int> numbers2 = Enumerable.Empty<int>();
IList<IVehicle> vehicles = Enumerable.Empty<IVehicle>().ToList();
Now you can use the above variables for anything. The two lists (numbers1 and vehicles) accept new elements right away, while numbers2 is a read-only sequence that you can enumerate or pass around. No null-reference-exception fears.
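To make the difference between the variables concrete, here is a minimal sketch (written with C# 9 top-level statements; the variable `extended` is just a name picked for this example). It shows that the list created with ToList() can grow, while the IEnumerable<int> has no Add() at all, so "adding" means building a new sequence with Concat().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A list built with ToList() is a regular, mutable List<int>.
List<int> numbers1 = Enumerable.Empty<int>().ToList();
numbers1.Add(42);

// IEnumerable<int> has no Add(); Concat() returns a new sequence instead
// and leaves the original empty sequence untouched.
IEnumerable<int> numbers2 = Enumerable.Empty<int>();
IEnumerable<int> extended = numbers2.Concat(new[] { 1, 2 });

Console.WriteLine(numbers1.Count);    // 1
Console.WriteLine(numbers2.Count());  // 0
Console.WriteLine(extended.Count());  // 2
```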
But how does it work?
As you can see here, Empty() is just a static method returning Instance from the EmptyEnumerable class…
public static IEnumerable<TResult> Empty<TResult>()
{
    return EmptyEnumerable<TResult>.Instance;
}
Ooook, so let’s go one step further… What the heck do we have in EmptyEnumerable?
// We have added some optimization in SZArrayHelper class to cache the enumerator of zero length arrays so
// the enumerator will be created once per type.
internal class EmptyEnumerable<TElement>
{
    public static readonly TElement[] Instance = new TElement[0];
}
Oh look, just a static readonly field holding a 0-sized array! So every time we call Enumerable.Empty<int>() we get the same array. And when we call Enumerable.Empty<Vehicle>() 100 times, there will be just 1 more array in memory (one per element type) instead of 100…
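The caching is easy to check from the outside. The sketch below (C# 9 top-level statements, variable names made up for the example) only relies on the documented behaviour that Empty<TResult>() caches one empty sequence per element type; it does not assume what concrete type is returned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

IEnumerable<int> first = Enumerable.Empty<int>();
IEnumerable<int> second = Enumerable.Empty<int>();
IEnumerable<string> other = Enumerable.Empty<string>();

// Two calls with the same type argument hand back the same cached object.
bool sameInstance = object.ReferenceEquals(first, second);

// A different type argument gets its own cached object.
bool sharedAcrossTypes = object.ReferenceEquals(first, other);

Console.WriteLine(sameInstance);       // True
Console.WriteLine(sharedAcrossTypes);  // False
```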
Wait, what was it called, a singleton or something else equally hated? Just kidding, singletons are sweet <3… Sometimes :P.
Ok, coming back to topic, before we leave this code, let’s have a look at an interesting comment above:
We have added some optimization in SZArrayHelper class to cache the enumerator of zero length arrays so the enumerator will be created once per type.
Seems legit.
Ok, but what if I create 2 list variables and initialize both of them from Enumerable.Empty<T>()? Will both variables reference the same place in memory, or what? And if so, adding an element to one variable would affect the other variable, am I right?
IList<IVehicle> vehicles = Enumerable.Empty<IVehicle>().ToList();
IList<IVehicle> vehicles2 = Enumerable.Empty<IVehicle>().ToList();
vehicles.Add(new Car("A"));
vehicles.Add(new Car("B"));
Console.WriteLine(vehicles.Count == vehicles2.Count);
Well, the answer of course is: ‘not at all’! But why? But how? Let’s come back to the Microsoft code (I just love them for opening their sources, really!)
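If you want to verify that answer yourself, here is a small runnable variant of the snippet above. IVehicle and Car are not defined in this post, so the sketch swaps in plain strings as a stand-in; everything else follows the same steps.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Strings stand in for the IVehicle/Car types, which are not shown in the post.
IList<string> vehicles = Enumerable.Empty<string>().ToList();
IList<string> vehicles2 = Enumerable.Empty<string>().ToList();

// Each ToList() call creates its own List<string> object.
bool sameList = object.ReferenceEquals(vehicles, vehicles2);

vehicles.Add("A");
vehicles.Add("B");

Console.WriteLine(sameList);                          // False
Console.WriteLine(vehicles.Count);                    // 2
Console.WriteLine(vehicles2.Count);                   // 0
Console.WriteLine(Enumerable.Empty<string>().Any());  // False, the cached sequence is still empty
```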
When we analyze one of the collections (for example, List.cs), we will see what happens when we call the .Add() method on our collection.
I want to start from the beginning so let’s look at the class’ fields and constructor.
public class List<T> : IList<T>, System.Collections.IList, IReadOnlyList<T>
{
    //…
    private T[] _items;
    [ContractPublicPropertyName("Count")]
    private int _size;

    // Constructs a List. The list is initially empty and has a capacity
    // of zero. Upon adding the first element to the list the capacity is
    // increased to 16, and then increased in multiples of two as required.
    public List() {
        _items = _emptyArray;
    }
    //…
}
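As you can see, all the data we store in our collection is saved in a private array called _items. A freshly created, empty list points _items at _emptyArray, a static zero-length array that List<T> shares between all empty lists of the same type.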
When we go to the Add() method we will see this:
// Adds the given object to the end of this list. The size of the list is
// increased by one. If required, the capacity of the list is doubled
// before adding the new element.
public void Add(T item) {
    if (_size == _items.Length) EnsureCapacity(_size + 1);
    _items[_size++] = item;
    _version++;
}
What is happening here is not rocket science, but let me emphasize that before placing a new element at the very end of our _items array, we check if the size of the collection (seen from the outside as .Count) equals the length of our _items array. Well, considering our scenario (adding a new value to an empty list), the condition is true, so we call EnsureCapacity(_size + 1) at the very beginning of the Add() method.
Sooo, it’s time to visit the EnsureCapacity() method and check what is happening there.
In this method we check if our array’s size is less than the passed parameter, and this condition is of course true for us (because we have a 0-size array and we passed “1” as the parameter to the EnsureCapacity method).
// Ensures that the capacity of this list is at least the given minimum
// value. If the currect capacity of the list is less than min, the
// capacity is increased to twice the current capacity or to min,
// whichever is larger.
private void EnsureCapacity(int min) {
    if (_items.Length < min) {
        int newCapacity = _items.Length == 0? _defaultCapacity : _items.Length * 2;
        // Allow the list to grow to maximum possible capacity (~2G elements) before encountering overflow.
        // Note that this check works even when _items.Length overflowed thanks to the (uint) cast
        if ((uint)newCapacity > Array.MaxArrayLength) newCapacity = Array.MaxArrayLength;
        if (newCapacity < min) newCapacity = min;
        Capacity = newCapacity;
    }
}
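What happens then (in our scenario) is that newCapacity becomes _defaultCapacity, because _items.Length is 0, and that value is assigned to the Capacity property. In the reference source _defaultCapacity is 4, despite the “16” mentioned in the constructor comment above. Well, ok, we still don’t know how it is possible that when we add an element to one list, the other one doesn’t get it, although both of them start out pointing at the same static array…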
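The growth pattern can be observed through the public Capacity property. The sketch below is only an observation of current behaviour (0, then the default capacity of 4, then doubling); these numbers are implementation details rather than a documented contract, so treat them as such.

```csharp
using System;
using System.Collections.Generic;

var list = new List<int>();
int initial = list.Capacity;      // empty list: no backing storage yet

list.Add(1);
int afterFirst = list.Capacity;   // first Add grows to the default capacity

for (int i = 2; i <= 5; i++)
{
    list.Add(i);                  // the fifth Add no longer fits, so the capacity doubles
}
int afterFifth = list.Capacity;

Console.WriteLine(initial);     // 0
Console.WriteLine(afterFirst);  // 4
Console.WriteLine(afterFifth);  // 8
```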
So let’s have a peek at the Capacity property, as a last credit given to Microsoft… Oh, just look at the comment:
the internal array used to hold items. When set, the internal array of the list is reallocated to the given capacity.
Seems promising, doesn’t it? 😛
// Gets and sets the capacity of this list. The capacity is the size of
// the internal array used to hold items. When set, the internal
// array of the list is reallocated to the given capacity.
public int Capacity {
    //…
    set {
        if (value < _size) {
            ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.value, ExceptionResource.ArgumentOutOfRange_SmallCapacity);
        }
        Contract.EndContractBlock();
        if (value != _items.Length) {
            if (value > 0) {
                T[] newItems = new T[value];
                if (_size > 0) {
                    Array.Copy(_items, 0, newItems, 0, _size);
                }
                _items = newItems;
            }
            else {
                _items = _emptyArray;
            }
        }
    }
}
Aaaand, what a big surprise: when we add a new item and the backing array is already full, the list allocates a brand new array, copies all the existing items into it and points _items at that new array. The shared static array is never written to, so the other list doesn’t notice anything.
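The whole mechanism fits in a few lines. This is a simplified sketch, not the real List<T> code: `sharedEmpty` plays the role of the static _emptyArray, each "list" is modeled as an array plus a size, and the hypothetical local function `Add` squashes Add(), EnsureCapacity() and the Capacity setter into one step (with the default capacity hard-coded to 4).

```csharp
using System;

// Stand-in for List<T>._emptyArray: one shared zero-length array.
int[] sharedEmpty = new int[0];

// Two "lists", each modeled as a backing array plus a size.
int[] itemsA = sharedEmpty; int sizeA = 0;
int[] itemsB = sharedEmpty; int sizeB = 0;

// Hypothetical helper: grow by allocating a new array, then store the item.
void Add(ref int[] items, ref int size, int item)
{
    if (size == items.Length)
    {
        int newCapacity = items.Length == 0 ? 4 : items.Length * 2;
        int[] newItems = new int[newCapacity];
        Array.Copy(items, 0, newItems, 0, size);
        items = newItems;  // only this "list" is re-pointed; the shared array is untouched
    }
    items[size++] = item;
}

bool sharedBefore = object.ReferenceEquals(itemsA, itemsB);
Add(ref itemsA, ref sizeA, 10);
bool sharedAfter = object.ReferenceEquals(itemsA, itemsB);

Console.WriteLine(sharedBefore);  // True
Console.WriteLine(sharedAfter);   // False
Console.WriteLine(sizeB);         // 0
```

The second "list" still points at the shared empty array after the first one has grown, which is why the two collections never see each other's elements.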
So, now it seems reasonable.
So is it better to call .Empty() instead of creating a new empty collection?
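Seems like… yes. With the Empty() method we reuse a cached array instead of creating a new one. That can slightly help performance, because our code allocates less and bothers the Garbage Collector less often (as long as we use the returned sequence directly; calling .ToList() on it still allocates a new list).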
Another meaningful advantage of using .Empty() is the fact that everybody reading your code will immediately see that you wanted to assign, or pass to a method, an empty collection.
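A typical place where both advantages show up is a method that sometimes has nothing to return. The sketch below uses a hypothetical helper called `FindMatches` (not something from the framework) that returns Enumerable.Empty<string>() instead of null or a freshly allocated list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns the cached empty sequence when there is nothing to search.
IEnumerable<string> FindMatches(IEnumerable<string> source, string prefix)
{
    if (source == null || string.IsNullOrEmpty(prefix))
    {
        return Enumerable.Empty<string>();  // no allocation, and the intent is obvious
    }
    return source.Where(s => s.StartsWith(prefix, StringComparison.Ordinal));
}

var none = FindMatches(null, "a");
var some = FindMatches(new[] { "apple", "banana" }, "a");

Console.WriteLine(none.Any());    // False
Console.WriteLine(some.Count());  // 1
```

Callers can enumerate the result without a null check in either case.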
So in my opinion – it’s worth using.