High-level considerations, and factors external to the code, are often most important. But there exist many low-level performance optimizations. We explore these.
Intro. We first describe the general considerations when optimizing your C# code. The C# language is compiled, and with the .NET Framework, we attain performance close to languages such as C or C++.
Based on: .NET 3.5, .NET 4, .NET 4.5
Generally, using the simplest features of the language provides the best performance. For example, using the for-loop and avoiding parameters and return values is typically fastest. We balance these performance goals with readability.
Further: It is best to focus on the "hot paths" in your program for optimizations. Don't emphasize code that is rarely run.
Benchmark. At all levels of performance optimization, you should be taking measurements on the changes you make to methods. You can do this with the .NET Framework methods available in the Stopwatch type.
Tip: It often pays to create console programs where the methods are benchmarked repeatedly on data as it changes.
Note: You should always avoid regressing performance unless there is a clear reason to do so.
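Example: A minimal sketch of a Stopwatch benchmark, assuming a hypothetical helper named AverageNanoseconds (not part of the .NET Framework). It runs the method once as a warm-up, so that JIT compilation is not timed, and then averages many calls.

```csharp
using System;
using System.Diagnostics;

static class BenchmarkSketch
{
    // Hypothetical helper: returns the average time per call in nanoseconds.
    public static double AverageNanoseconds(Action action, int iterations)
    {
        action(); // Warm-up call so JIT compilation is not measured.

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();

        // Convert total milliseconds to nanoseconds per call.
        return stopwatch.Elapsed.TotalMilliseconds * 1000000.0 / iterations;
    }
}
```

You might call it as BenchmarkSketch.AverageNanoseconds(() => MethodUnderTest(), 1000000), where MethodUnderTest stands in for your own code. Results vary between runs and machines, so compare several runs in a Release build.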
Static methods. Non-inlined instance methods are typically slower than non-inlined static methods. To call an instance method, the instance reference must first be loaded and checked and, for virtual methods, used to determine which method to call. Static methods do not use an instance reference.
If you look at the intermediate language, you will see that static methods can be invoked with fewer instructions. You can see an experiment based on the callvirt and call instructions on this site.
Arguments. When you call any method that was not inlined, the runtime copies the values you pass as arguments into the formal parameter slots of the called method. This causes stack memory operations.
Therefore: It is faster to minimize arguments, and even to use constants in the called methods instead of passing those values as arguments.
Locals. When you call a method in your C# program, the runtime allocates a separate memory region to store all the local variable slots. This memory is allocated on the stack even if you do not access the variables in the function call.
Tip: You can call methods faster if they have fewer variables in them. Sometimes we can reuse the same variable.
One way to do this is to isolate rarely used parts of methods in separate methods. This makes the fast path in the called method more efficient, which can have a significant performance gain.
Also: Often you can copy a field into a local variable. Then you can avoid accessing the field repeatedly by changing only the local and storing the result back once.
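Example: A small sketch of the field-to-local idea, using a hypothetical FieldCounter class. Whether AddLocal is actually faster depends on the JIT and should be measured. Note also that the two methods are not equivalent if another thread reads the field while the loop runs.

```csharp
class FieldCounter
{
    int _total; // Instance field.

    public int Total { get { return _total; } }

    // Reads and writes the field on every iteration.
    public void AddField(int[] values)
    {
        for (int i = 0; i < values.Length; i++)
        {
            _total += values[i];
        }
    }

    // Copies the field into a local, changes only the local,
    // then stores the result back into the field once.
    public void AddLocal(int[] values)
    {
        int total = _total;
        for (int i = 0; i < values.Length; i++)
        {
            total += values[i];
        }
        _total = total;
    }
}
```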
Constants are not assigned a memory region, but are instead considered values. Therefore, you can never assign to a constant, but loading a constant is more efficient. It is injected directly into the instruction stream.
And: This eliminates any memory accesses outside of the instruction stream, improving locality of reference.
Static fields are faster than instance fields, for the same reason that static methods are faster than instance methods. When you load a static field into memory, you do not need the runtime to resolve the instance expression.
Loading an instance field must have the object instance first resolved. Even in an object instance, loading a static field is faster because no instance expression instruction is ever used.
Inline. Until .NET 4.5, the C# language had no way to suggest that a method be inlined at its call sites. And the JIT compiler in the .NET Framework is often conservative. It will not inline medium-sized or large methods.
However: You can manually paste a method body into its call site. Typically, this improves performance in micro-benchmarks.
Also: It is easy to do. But it will make code harder to modify. It is only suggested for a few critical spots in programs.
Tip: In .NET 4.5, the MethodImplOptions.AggressiveInlining enum value was introduced. Applied through the MethodImpl attribute, it tells the JIT compiler to inline the method if possible.
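Example: A short sketch of the attribute syntax, using hypothetical Square and SumOfSquares methods. The attribute is only a hint. The JIT compiler may still decline to inline, and a method this small would probably be inlined without it.

```csharp
using System.Runtime.CompilerServices;

static class InlineSketch
{
    // Hint to the JIT compiler that this method should be inlined if possible.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int Square(int value)
    {
        return value * value;
    }

    // Calls Square in a loop, the kind of hot path where inlining can matter.
    public static int SumOfSquares(int count)
    {
        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            sum += Square(i);
        }
        return sum;
    }
}
```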
Switch. You will find that the switch statement compiles in a different way than if-statements typically do. For example, if you use a switch on an int, you will often get a jump table, which is similar to a computed goto mechanism.
Note: Using jump tables makes switches much faster than some if-statements. Also, using a switch on a char taken from a string is fast.
Flatten arrays. Using two-dimensional arrays is relatively slow. You can explicitly create a one-dimensional array and access it through arithmetic that supposes it is a two-dimensional array. This is sometimes called flattening an array.
Then: You must use multiplication and addition to acquire the correct element address. This optimization often improves performance.
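Example: A minimal sketch of a flattened array, using a hypothetical FlatGrid class. It assumes row-major order, so the element at (row, column) is stored at index row * width + column.

```csharp
class FlatGrid
{
    readonly int[] _cells; // One-dimensional storage for all rows.
    readonly int _width;   // Number of columns in each logical row.

    public FlatGrid(int height, int width)
    {
        _width = width;
        _cells = new int[height * width];
    }

    public int Length { get { return _cells.Length; } }

    // Multiplication and addition compute the element index.
    public int Get(int row, int column)
    {
        return _cells[row * _width + column];
    }

    public void Set(int row, int column, int value)
    {
        _cells[row * _width + column] = value;
    }
}
```

This sketch does not validate the column value. A column beyond the width silently maps into a following row, so callers must pass valid coordinates.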
Jagged arrays. Flattened arrays are typically most efficient. But they can be impractical. You can use jagged arrays to improve the lookup performance. The .NET Framework enables faster accesses to jagged arrays than to 2D arrays.
Caution: Jagged arrays may cause slower garbage collections—each jagged array element is treated separately by the garbage collector.
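Example: A small sketch of creating and reading a jagged array, with hypothetical Create and Sum helpers. Each row is a separate one-dimensional array, which is why the garbage collector sees one object per row.

```csharp
static class JaggedSketch
{
    // Allocates an outer array and one inner array per row.
    public static int[][] Create(int rows, int columns)
    {
        int[][] result = new int[rows][];
        for (int i = 0; i < rows; i++)
        {
            result[i] = new int[columns];
        }
        return result;
    }

    // Fetches each row once, then loops over it as a plain array.
    public static int Sum(int[][] jagged)
    {
        int sum = 0;
        for (int i = 0; i < jagged.Length; i++)
        {
            int[] row = jagged[i];
            for (int j = 0; j < row.Length; j++)
            {
                sum += row[j];
            }
        }
        return sum;
    }
}
```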
StringBuilder. If you are doing significant appending of strings using the C# language, the StringBuilder type can improve performance. This is because the string type is immutable and cannot be changed without allocating an entirely new object.
Sometimes, using strings instead of StringBuilder for concatenations is faster. This is typically the case when using small strings or doing infrequent appends. With most loops, StringBuilder is the better choice.
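Example: A minimal sketch of appending in a loop with StringBuilder, using a hypothetical JoinNumbers method. The same loop written with the string += operator would allocate a new string on every iteration.

```csharp
using System.Text;

static class AppendSketch
{
    // Builds a string such as "0,1,2," with a single StringBuilder.
    public static string JoinNumbers(int count)
    {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            builder.Append(i).Append(',');
        }
        return builder.ToString();
    }
}
```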
Char arrays. Using char arrays is sometimes the fastest way to build up a string. Typically, we combine char arrays with for-loops and character testing expressions. This logic is more painful to develop, but the time savings can be significant.
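Example: A sketch of building a string through a char array, with a hypothetical KeepLettersAndDigits method. It assumes the output can never be longer than the input, so one buffer of the input's length is enough.

```csharp
static class CharArraySketch
{
    // Copies only letters and digits into a buffer, then creates one string.
    public static string KeepLettersAndDigits(string input)
    {
        char[] buffer = new char[input.Length];
        int length = 0;
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (char.IsLetterOrDigit(c))
            {
                buffer[length++] = c;
            }
        }
        // Use only the part of the buffer that was filled.
        return new string(buffer, 0, length);
    }
}
```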
Byte arrays. In the C# language, the smallest unit of addressable storage is the byte type. You can store ASCII characters in a single byte, as well as small numbers. If you can store your data in an array of bytes, this allows you to save memory.
For example, an array of characters or a string uses two bytes per character. An array of bytes can represent that data in one byte per character. This results in about half the total memory usage.
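Example: A short sketch of storing ASCII text in a byte array with the Encoding.ASCII type, using hypothetical wrapper methods. It assumes the text contains only ASCII characters. Other characters are replaced with a question mark by this encoding, so the conversion is lossy for them.

```csharp
using System.Text;

static class AsciiSketch
{
    // One byte per character, instead of two bytes per char.
    public static byte[] ToAsciiBytes(string text)
    {
        return Encoding.ASCII.GetBytes(text);
    }

    // Converts the bytes back into a string when one is needed.
    public static string FromAsciiBytes(byte[] data)
    {
        return Encoding.ASCII.GetString(data);
    }
}
```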
Arrays. We have many options for collections, such as the List or ArrayList. These types are convenient and should be used when necessary. But it is usually more efficient to use a simple array if this is possible.
Note: Complex collections such as List and Dictionary are actually composed of internal arrays.
And: They add logic to avoid the burden of managing the array size on each use. But if you do not need this logic, an array is faster.
Capacity. For collections, you can use an optional capacity argument to influence the initial buffer sizes. It is best to pass a reasonable parameter when creating a Dictionary or List. This avoids many allocations when adding elements.
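Example: A minimal sketch of the capacity constructors on List and Dictionary, with hypothetical BuildList and BuildDictionary methods. It assumes the final element count is known, or can be estimated, before the collection is created.

```csharp
using System.Collections.Generic;

static class CapacitySketch
{
    // The internal array is sized once, so Add does not need to resize it.
    public static List<int> BuildList(int count)
    {
        List<int> list = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }
        return list;
    }

    // The same idea applies to the Dictionary constructor.
    public static Dictionary<int, string> BuildDictionary(int count)
    {
        Dictionary<int, string> dictionary = new Dictionary<int, string>(count);
        for (int i = 0; i < count; i++)
        {
            dictionary.Add(i, "value");
        }
        return dictionary;
    }
}
```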
Rewrite loops. You can rewrite loops to improve performance. The foreach-loop has good performance in many cases. But it is best to use the for-loop in all performance-critical sections when possible.
Note: For-loops sometimes have better raw performance. And you can often reuse the index variable to optimize elsewhere.
Typically, the while-loop, the for-loop and the do-while loop have the best performance. Also, it is sometimes beneficial, and sometimes harmful, to "hoist" the loop's upper bound into a local variable outside of the for-loop statement.
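Example: A sketch of the same summation written three ways, with hypothetical method names. Which version is fastest is an assumption to test. With arrays, the JIT compiler may be able to remove bounds checks when the loop condition uses values.Length directly, which is one reason hoisting can hurt.

```csharp
static class LoopSketch
{
    // The foreach-loop: simple and often fast enough.
    public static int SumForeach(int[] values)
    {
        int sum = 0;
        foreach (int value in values)
        {
            sum += value;
        }
        return sum;
    }

    // The for-loop with the Length property in the condition.
    public static int SumFor(int[] values)
    {
        int sum = 0;
        for (int i = 0; i < values.Length; i++)
        {
            sum += values[i];
        }
        return sum;
    }

    // The for-loop with the upper bound hoisted into a local.
    public static int SumForHoisted(int[] values)
    {
        int sum = 0;
        int length = values.Length;
        for (int i = 0; i < length; i++)
        {
            sum += values[i];
        }
        return sum;
    }
}
```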
Structs. It is typically best to entirely avoid structs. If you use structs, you should avoid passing them by value as parameters to methods. Otherwise, performance may degrade to worse than using a class type.
Note: In the .NET Framework, structs are copied in their entirety each time they are passed by value to a method or returned from one.
Structs can improve the performance of the garbage collector by reducing the number of distinct objects. Also, you can sometimes use separate arrays instead of arrays of structs, which can improve performance further.
Lookup tables. While switch statements or hashtables can provide good performance, using a lookup table is frequently the optimal choice. Instead of testing each character using logic to lowercase, you can translate it through a lookup table.
Also, the lookup table can be implemented as a character array. Another example is that you can implement the ROT13 algorithm with a lookup table, improving performance by more than two times.
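Example: A minimal sketch of a lowercasing lookup table implemented as a character array, with a hypothetical ToLowerAscii method. It assumes only ASCII letters need converting. Characters outside the table are returned unchanged.

```csharp
static class LowerTable
{
    // Built once. Index is the character code, value is the lowercase form.
    static readonly char[] _lookup = CreateTable();

    static char[] CreateTable()
    {
        char[] table = new char[128];
        for (int i = 0; i < table.Length; i++)
        {
            char c = (char)i;
            table[i] = (c >= 'A' && c <= 'Z') ? (char)(c + 32) : c;
        }
        return table;
    }

    // One range test and one array load replace the lowercasing logic.
    public static char ToLowerAscii(char c)
    {
        return c < 128 ? _lookup[c] : c;
    }
}
```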
Char argument. Often, you may need to pass a single character to a method as an argument. For example, the StringBuilder type allows you to append a single char. The Response.Write method also allows you to write a single char.
Note: It is more efficient to pass a char instead of a single-char string. We benchmark and show this optimization.
Tip: The char is a value type, and is represented by two bytes, while a string is a reference type and requires over 20 bytes.
ToString. It is poor programming to use the ToString method when it is not needed. Sometimes, developers will call ToString on a character in a string, and then test it against a single-character string literal. This is inefficient.
Instead: Use a character testing expression with two chars. Please reference the specific article on this topic for more details.
Caution: This mistake sometimes results in code that is ten times slower than the correct approach.
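Example: A short sketch contrasting the two approaches, with hypothetical method names. The slow version allocates a one-character string for the comparison. The fast version compares two chars directly.

```csharp
static class CharTestSketch
{
    // Inefficient: ToString allocates a new string before comparing.
    public static bool StartsWithSlashSlow(string value)
    {
        return value.Length > 0 && value[0].ToString() == "/";
    }

    // Better: a character testing expression with two chars.
    public static bool StartsWithSlashFast(string value)
    {
        return value.Length > 0 && value[0] == '/';
    }
}
```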
Int string cache. Many C# programs use the ToString method on integer values frequently. This requires an allocation on the managed heap for the new string. This will cause the next garbage collection to become slower.
Tip: You can use a lookup table cache to optimize common cases for the integer ToString operation.
And: This site demonstrates how this lookup table can make the ToString method thirty times faster.
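Example: A minimal sketch of an integer string cache, with a hypothetical IntStringCache class. The cache size of 256 is an assumption. Choose a range that matches the values your program converts most often.

```csharp
using System.Globalization;

static class IntStringCache
{
    const int CacheSize = 256; // Assumed range of common values.

    static readonly string[] _cache = CreateCache();

    static string[] CreateCache()
    {
        string[] cache = new string[CacheSize];
        for (int i = 0; i < cache.Length; i++)
        {
            cache[i] = i.ToString(CultureInfo.InvariantCulture);
        }
        return cache;
    }

    // Returns a cached string for common values and allocates otherwise.
    public static string Get(int value)
    {
        if (value >= 0 && value < CacheSize)
        {
            return _cache[value];
        }
        return value.ToString(CultureInfo.InvariantCulture);
    }
}
```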
IL Disassembler. For .NET development, open your methods with the IL Disassembler tool provided by Microsoft. This is a free tool and it provides an interface to view the MSIL (Microsoft Intermediate Language) output of compiled Release executables.
Tip: It is sometimes useful to save copies of the intermediate language as you make changes, or to even count instructions.
Avoid sorting. Often, you can avoid performing a sort operation on an array or string by testing whether the input string or array is already sorted. Sometimes this makes a big performance improvement. In other cases, this slows down programs.
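Example: A sketch of testing for sorted input before sorting, with hypothetical IsSorted and SortIfNeeded methods. The check costs one extra pass over the data, which is why it slows down programs whose input is rarely sorted.

```csharp
using System;

static class SortSketch
{
    // Returns true if the elements are already in ascending order.
    public static bool IsSorted(int[] values)
    {
        for (int i = 1; i < values.Length; i++)
        {
            if (values[i - 1] > values[i])
            {
                return false;
            }
        }
        return true;
    }

    // Sorts in place only when the check fails.
    public static void SortIfNeeded(int[] values)
    {
        if (!IsSorted(values))
        {
            Array.Sort(values);
        }
    }
}
```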
String conversions. You can actually avoid many string-based conversions. For example, you may need to ensure that a string is lowercased. If the string is already lowercase, you can avoid allocating a new string.
However: The Framework ToLower method will not avoid this. You must manually test to see if no lowercasing is necessary.
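Example: A minimal sketch of the manual test, with a hypothetical ToLowerIfNeeded method. It assumes invariant-culture lowercasing is wanted and checks one char at a time, so characters stored as surrogate pairs are not detected by this check.

```csharp
static class LowerSketch
{
    // Returns the same string instance when no character would change.
    public static string ToLowerIfNeeded(string value)
    {
        for (int i = 0; i < value.Length; i++)
        {
            char c = value[i];
            if (char.ToLowerInvariant(c) != c)
            {
                // At least one character changes, so allocate a new string.
                return value.ToLowerInvariant();
            }
        }
        return value;
    }
}
```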
Avoid Path. The Path methods in the System.IO namespace are somewhat slow for many applications. Sometimes they can cause unnecessary allocations to occur, copying strings more than once. We can avoid them with character-based algorithms.
List Clear. Avoiding allocations is sometimes (but not always) faster. In some programs, you can call Clear on a Dictionary to avoid re-creating a new Dictionary. But in my testing, calling Clear on a List is slow.
Instead, just re-creating the List is faster. This is because the garbage collector is more optimized for eliminating old data than the Clear method is. This is not always the case. Experimentation and benchmarking are needed.
Hash. It is important that you use hashtables in your programs when appropriate. The Dictionary collection in the .NET Framework is not optimal in many cases, but provides good performance in many different situations.
Tip: Knowing every detail of the hashtable type, in whatever language you are using, is nearly always a performance advantage.
Learn. The site you are reading contains a multitude of optimization experiments, often proven with benchmarks that provide times in nanoseconds per method call. Resources such as this site are invaluable for certain tasks in programming.
Note: Before Dot Net Perls came about, no site had this information on optimization. We should learn from each other.
Compiler theory. Experimentation such as benchmarking and analyzing instructions generated can result in excellent program performance. But without understanding the core theories of compilers you may be lacking knowledge about program performance.
However: Compiler theory involves advanced mathematics and can be dense to start with.
My observation is that few application developers have a significant knowledge of compiler theory. This topic may be more suitable to academic computer scientists and not rapid application development programmers.
Tip: You too can slay the dragon with syntax directed translation. The compiler is mightier than the sword.
Temporal locality. Another way you can optimize a program significantly is by rearranging it to increase temporal locality. This means that methods that act on a certain part of memory or storage (such as the hard disk) are run all at once.
Undecidable. The term "optimization" is a misnomer in computer science. A program can never be truly optimized. Because many questions in compiler theory are undecidable, a program can never be proven to be optimally efficient. Perhaps another approach is faster.
However: One way we improve the situation is by extensively testing. Processors change. Theory only gets us so far.
Secrets. There are many pages on this website that are focused on optimization tips. These pages are listed below. Most of them rewrite a certain pattern of code to something arguably more efficient.
Array Optimization, Char Lowercase, Decrement Optimization, Dictionary Optimization, Exception Optimization, int.Parse Optimization, Integer Append, Mask Optimization, Parameter Optimization, Replace Optimization, ToString Formats
Tip: Please be aware some of these optimizations result in code that is less maintainable. Not all secrets are useful ones.
Research. I found many optimization tips (for various languages) in Code Complete. I recommend this book for some tips. Some of the strategies may not be useful in managed languages like C#, but many are still relevant.
Use a high-quality design. Make the program right. Make it modular and easily modifiable so that it's easy to work on later. When it's complete and correct, check the performance. If the program lumbers, make it fast and small. Don't optimize until you know you need to.
Summary. There are many optimizations you can make at the level of statements and methods. External performance factors (including hardware) are more significant for many programs. But these tips can help in a significant way.