TheDeveloperBlog.com


C# Dictionary Order

Dictionary order. The order you add keys to a Dictionary is important. It affects the performance of accessing those keys. Because the Dictionary uses a chaining algorithm, the keys that were added last are often faster to locate.

Note: You have a key to add to your Dictionary. This key may collide with a non-equal key in the Dictionary.

Note 2: If you add it first, it will be slower to search for it. If you add it last, it will be faster to search for it.


Example. First, this program is a complete benchmark harness that adds ten million and two random path strings to a Dictionary collection in the C# language. A reference to the first and last keys added are stored as variables.

Because each key is unique, no exceptions are thrown. The two tight loops then show the performance difference between looking up entries in the Dictionary that were added first, and last.

C# program that tests dictionary performance and order

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

class Program
{
    const int _max = 10000000;
    static void Main()
    {
	double msA = 0, msB = 0;
	//
	// Ten tests of dictionary order.
	//
	for (int y = 0; y < 10; y++)
	{
	    var dictionary = new Dictionary<string, int>(StringComparer.Ordinal);
	    string first = Path.GetRandomFileName();
	    dictionary.Add(first, 1);
	    for (int i = 0; i < 1000000; i++)
	    {
		dictionary.Add(Path.GetRandomFileName(), 1);
	    }
	    string last = Path.GetRandomFileName();
	    dictionary.Add(last, 1);
	    //
	    // Test the dictionaries.
	    //
	    var s1 = Stopwatch.StartNew();
	    for (int i = 0; i < _max; i++)
	    {
		int v;
		dictionary.TryGetValue(first, out v);
	    }
	    s1.Stop();
	    var s2 = Stopwatch.StartNew();
	    for (int i = 0; i < _max; i++)
	    {
		int v;
		dictionary.TryGetValue(last, out v);
	    }
	    s2.Stop();
	    msA += s1.Elapsed.TotalMilliseconds;
	    msB += s2.Elapsed.TotalMilliseconds;
	    Console.Write(".");
	    //
	    // Collect the garbage.
	    //
	    GC.Collect();
	}
	Console.WriteLine();
	Console.WriteLine(msA.ToString("0.00") + " ms");
	Console.WriteLine(msB.ToString("0.00") + " ms");
	Console.Read();
    }
}

Output
    (Second number is the last key search time elapsed.)

..........
3539.57 ms
3203.58 ms [Faster]

This program is a benchmark harness for testing the performance of looking up a key that was added first to the Dictionary, and another key that was added last to it. All the keys added to the Dictionary instance are random strings.

And: A reference is retained to the first and last keys added. This involves two string variables.

I varied the order of the two tests. This did not affect the performance of the lookups. I added the forced garbage collection call. Otherwise the runtime might start a garbage collection during one of the two tests.

And: The program uses completely random path strings, so results will vary each pass through the loop.

The last key added to the Dictionary was faster to access than the first key that was added. Usually the performance gain was around 10% faster for a Dictionary with ten million separate keys.


Discussion. This section explains why the last key can be accessed faster than the first key. The Dictionary type internally uses a chaining algorithm. Each key you look up is converted to an index through the hash code computation.

Note: Each hash code is not unique. And each index is even less likely to be unique. Collisions are stored in a logical linked list.

Steps. When you add a key to the Dictionary and another key was added at this index, the new key is put in the first place in this logical linked list. And the old key is demoted to the next slot.

Then: When you look up that key, the search algorithm loops through those entries in order, with the last key checked first.


Performance. The performance impact of changing the ordering of the keys you add will likely not be large. For this reason, I emphasize the theory and implementation issues rather than offering a recommendation or suggestion about performance.

Generally: The Dictionary lookup performance is good regardless of ordering if the hash code computation is well-distributed.

However: For certain in-memory database programs, adding keys in order of ascending hit frequency could improve performance.


Summary. We saw a performance test for Dictionary that showed that the first key added to a Dictionary instance is slower to access than the last key. We noted that this is because of the internal chaining algorithm used in the Dictionary.