TheDeveloperBlog.com

Home | Contact Us

C-Sharp | Java | Python | Swift | GO | WPF | Ruby | Scala | F# | JavaScript | SQL | PHP | Angular | HTML

<< Back to C-SHARP

C# Intermediate Language (IL)

Understand the .NET Framework intermediate language. The IL is composed of instructions.
Intermediate language. Code exists on many levels. Abstraction is a ladder. At the highest levels of abstraction, we have C# code. At a lower level, we have an intermediate language.
Instructions. The IL has instructions (opcodes) that manipulate the stack. These instructions, called intermediate language opcodes, reside in a relational database called the metadata.
Intro. Compiler theory has 2 parts. In analysis, source code is parsed and a symbol table is derived. Synthesis is where the object code is generated and written to an executable.

Analysis: Source code is parsed and a symbol table is derived. Analysis comes before synthesis.

Synthesis: Object code is generated and written to an executable. Analysis and synthesis result in an intermediate form.

Info: On execution of the intermediate form, another compiler, the JIT compiler, will be invoked and perform even more optimizations.

Example program: C# using System; class Program { static void Main() { int i = 0; while (i < 10) { Console.WriteLine(i); i++; } } } Intermediate language for Main method: IL .method private hidebysig static void Main() cil managed { .entrypoint .maxstack 2 .locals init ( [0] int32 num) L_0000: ldc.i4.0 L_0001: stloc.0 L_0002: br.s L_000e L_0004: ldloc.0 L_0005: call void [mscorlib]System.Console::WriteLine(int32) L_000a: ldloc.0 L_000b: ldc.i4.1 L_000c: add L_000d: stloc.0 L_000e: ldloc.0 L_000f: ldc.i4.s 10 L_0011: blt.s L_0004 L_0013: ret }
IL for Main method. In the second part of the text, you can see the intermediate form of the Main method. This is generated from the C# code.

And: At L_0000, you can see that the constant number zero is pushed onto the evaluation stack through the instruction ldc.

const
Branch. The instruction "br" indicates a branch. This is a conditional instruction that will change what instruction is executed next based on the condition.

Tip: In the C# language, things such as goto, while, for, break, and return may be implemented with variants of the branch instruction.

Call method. Before Console.WriteLine is called, the location of the local variable we are writing is pushed to the evaluation stack. This is used as an argument.Console
Increment. In IL, an increment is implemented by pushing a constant one to the evaluation stack, and then using the add arithmetic instruction to increment.
Loop instructions. The line L_0011 implements another instruction that returns control to the beginning of the loop if the condition is met. This is another branch.

Note: Branch instructions use labels in the intermediate language. Branches are basically goto statements at the lower level.

IL Disassembler. To gain an understanding of the C# language, I recommend that you inspect the intermediate language for all the code you write.IL Disassembler

Tip: IL Disassembler is provided with Visual Studio. We use it to look at the intermediate language.

Bne. If-statements are translated into branch instructions. One such intermediate language instruction, bne, stands for branch if not equal.

First: The important method here is called A(). If its argument is equal to 1, it writes something to the console.

Here: The constant 1 is pushed with ldc. And the bne instruction is used. It branches based on the constant.

Finally: If the two values on the top of the stack (the constant 1 and the argument int) are not equal, we jump forward to line IL_000f.

C# program that uses bne instruction using System; class Program { static void Main() { A(0); A(1); A(2); } static void A(int value) { if (value == 1) { Console.WriteLine("Hi"); } else { Console.WriteLine("Bye"); } } } Output Bye Hi Bye A method: IL .method private hidebysig static void A(int32 'value') cil managed { // Code size 26 (0x1a) .maxstack 8 IL_0000: ldarg.0 IL_0001: ldc.i4.1 IL_0002: bne.un.s IL_000f IL_0004: ldstr "Hi" IL_0009: call void [mscorlib]System.Console::WriteLine(string) IL_000e: ret IL_000f: ldstr "Bye" IL_0014: call void [mscorlib]System.Console::WriteLine(string) IL_0019: ret } // end of method Program::A
Call. A call instruction invokes a static or Shared method. We learn about the call instruction in the .NET Framework's intermediate language.

Example: In the Main method, we create an instance of Program and call its instance method A. Finally we call the static method B.

C# program that uses instance and static methods class Program { void A() { } static void B() { } static void Main() { Program p = new Program(); p.A(); Program.B(); } } IL .method private hidebysig static void Main() cil managed { .entrypoint // Code size 18 (0x12) .maxstack 1 .locals init ([0] class Program p) IL_0000: newobj instance void Program::.ctor() IL_0005: stloc.0 IL_0006: ldloc.0 IL_0007: callvirt instance void Program::A() IL_000c: call void Program::B() IL_0011: ret } // end of method Program::Main
Callvirt. This is an intermediate language instruction. It is used on instance methods. The call instruction meanwhile is used on static methods.
Code that has an instance method: C# class Test { public int M() { return 0; } } class Program { static void Main() { Test t = new Test(); int i = t.M(); } } IL of Main method .method private hidebysig static void Main() cil managed { .entrypoint .maxstack 1 .locals init ( [0] class Test t) L_0000: newobj instance void Test::.ctor() L_0005: stloc.0 L_0006: ldloc.0 L_0007: callvirt instance int32 Test::M() L_000c: pop L_000d: ret } Code that has a static method: C# class Test { static public int M() { return 0; } } class Program { static void Main() { int i = Test.M(); } } IL of Main method .method private hidebysig static void Main() cil managed { .entrypoint .maxstack 8 L_0000: call int32 Test::M() L_0005: pop L_0006: ret }
Callvirt benchmark. I was working on a program where performance is essential and decided to replace two instance method calls with static method calls.

Result: I found a 1.23% slowdown caused by the callvirt instruction being used.

Callvirt instruction performance data Instance method and data: 44554 ms Static method and data: 44008 ms [faster]
Ldarg. This instruction is common. It places an argument onto the evaluation stack. There are several common variants of ldarg in the .NET Framework.

Note: In the IL section of the code example, we see the ldarg.0 and ldarg.1 instructions.

Positions: The 0 and 1 refer to the position of the argument list (formal parameter list).

And: This means ldarg.0 pushes an int32 onto the evaluation stack, and ldarg.1 pushes a string onto the evaluation stack.

Method that has two parameters: C# static int Something(int a, string b) { a += 1; int y = b.Length + a; return y; } Intermediate Language: IL .method private hidebysig static int32 Something(int32 a, string b) cil managed { // Code size 16 (0x10) .maxstack 2 .locals init ([0] int32 y) IL_0000: ldarg.0 IL_0001: ldc.i4.1 IL_0002: add IL_0003: starg.s a IL_0005: ldarg.1 IL_0006: callvirt instance int32 [mscorlib]System.String::get_Length() IL_000b: ldarg.0 IL_000c: add IL_000d: stloc.0 IL_000e: ldloc.0 IL_000f: ret } // end of method Program::Something
Ldc. With the ldc instruction, the .NET Framework loads constant values (such as integers) onto the evaluation stack, making many program constructs possible.

Tip: This program uses six constant values. We pass each constant to Console.WriteLine.

Tip 2: For the values 0 and 8, we find the instructions ldc.i4.0 and ldc.i4.8 are used.

Result: These load a 4-byte integer with value 0 and 8 to the top of the stack.

C# program that uses ldc instructions using System; class Program { static void Main() { Console.WriteLine(0); Console.WriteLine(8); Console.WriteLine(9); Console.WriteLine(-1); Console.WriteLine('a'); Console.WriteLine(1.23); } } IL .method private hidebysig static void Main() cil managed { .entrypoint // Code size 47 (0x2f) .maxstack 8 IL_0000: ldc.i4.0 IL_0001: call void [mscorlib]System.Console::WriteLine(int32) IL_0006: ldc.i4.8 IL_0007: call void [mscorlib]System.Console::WriteLine(int32) IL_000c: ldc.i4.s 9 IL_000e: call void [mscorlib]System.Console::WriteLine(int32) IL_0013: ldc.i4.m1 IL_0014: call void [mscorlib]System.Console::WriteLine(int32) IL_0019: ldc.i4.s 97 IL_001b: call void [mscorlib]System.Console::WriteLine(char) IL_0020: ldc.r8 1.23 IL_0029: call void [mscorlib]System.Console::WriteLine(float64) IL_002e: ret } // end of method Program::Main
Ldloc. How are local variables used in the intermediate language? In IL locals are allocated separately and accessed by indexes with the ldloc instruction.

Note: In the intermediate language, the ".locals" section indicates that four locals are allocated each time the method is called.

Info: The four local variables are indexed by the numbers 0, 1, 2 and 3. These indexes are used later in the IL.

Method that introduces local variables: C# static int Example() { int a = 1; string y = "cat"; bool b = false; int result = a + y.Length + (b ? 1 : 0); return result; } Intermediate language: IL .method private hidebysig static int32 Example() cil managed { // Code size 29 (0x1d) .maxstack 3 .locals init ([0] int32 a, [1] string y, [2] bool b, [3] int32 result) IL_0000: ldc.i4.1 IL_0001: stloc.0 IL_0002: ldstr "cat" IL_0007: stloc.1 IL_0008: ldc.i4.0 IL_0009: stloc.2 IL_000a: ldloc.0 IL_000b: ldloc.1 IL_000c: callvirt instance int32 [mscorlib]System.String::get_Length() IL_0011: add IL_0012: ldloc.2 IL_0013: brtrue.s IL_0018 IL_0015: ldc.i4.0 IL_0016: br.s IL_0019 IL_0018: ldc.i4.1 IL_0019: add IL_001a: stloc.3 IL_001b: ldloc.3 IL_001c: ret } // end of method Program::Example
Ldsfld. In addition to storing values in static fields, there exists an intermediate language instruction to load values from them.

Field: This program assigns the static field _a to 3. This is done with stsfld.

Next: The ldsfld pushes the value stored in the location of _a onto the top of the evaluation stack.

Then: We load the constant 4 onto the top of the evaluation stack with the ldc.i4.4 instruction.

C# program that has ldsfld instruction using System; class Program { static int _a; static void Main() { Program._a = 3; if (Program._a == 4) { Console.WriteLine(true); } } } IL .method private hidebysig static void Main() cil managed { .entrypoint // Code size 21 (0x15) .maxstack 8 IL_0000: ldc.i4.3 IL_0001: stsfld int32 Program::_a IL_0006: ldsfld int32 Program::_a IL_000b: ldc.i4.4 IL_000c: bne.un.s IL_0014 IL_000e: ldc.i4.1 IL_000f: call void [mscorlib]System.Console::WriteLine(bool) IL_0014: ret } // end of method Program::Main
Newarr. The newarr instruction creates a vector. This is a one-dimensional array. Vectors have special support in the .NET Framework.

Quote: One-dimensional arrays that aren't zero-based, and multi-dimensional arrays, are created using newobj rather than newarr (The CLI Annotated Standard).

Next: This program creates an int array of ten elements. The first element is initialized so the compiler does not eliminate any code.

IL: The constant 10 is loaded onto the evaluation stack before newarr is encountered. This causes the array to have ten elements.

C# program that creates new array using System; class Program { static void Main() { int[] array = new int[10]; array[0] = 1; } } Intermediate representation: IL .method private hidebysig static void Main() cil managed { .entrypoint // Code size 13 (0xd) .maxstack 3 .locals init ([0] int32[] 'array') IL_0000: ldc.i4.s 10 IL_0002: newarr [mscorlib]System.Int32 IL_0007: stloc.0 IL_0008: ldloc.0 IL_0009: ldc.i4.0 IL_000a: ldc.i4.1 IL_000b: stelem.i4 IL_000c: ret } // end of method Program::Main
Ret. Every method executed has a ret instruction. This returns control from the method to the calling method. The instruction sometimes returns a value.

Next: Let's look at a simple C# method that returns an integer value. This method adds two to the argument.

Tip: You can see the argument is accessed with ldarg.0. Finally, the ret instruction appears at the end of the method.

Stack: The evaluation stack at the time ret is executed has the result of the add instruction on it. This is returned.

Method that returns integer: C# static int Method(int value) { return 2 + value; } Intermediate language .method private hidebysig static int32 Method(int32 'value') cil managed { // Code size 4 (0x4) .maxstack 8 IL_0000: ldc.i4.2 IL_0001: ldarg.0 IL_0002: add IL_0003: ret } // end of method Program::Method
Stelem. This stores a value inside an array element. Its performance depends on the type of the element and array. We test stelem in some C# programs.

Newarr: When you assign elements in an array, the array itself is pushed onto the stack (newarr).

Next: The index you are using in the array is pushed onto the stack (ldc), and the value you want to put into the array is pushed (ldloc).

Program 1 in C# class Program { static void Main() { char[] c = new char[100]; c[0] = 'f'; } } Program 1 in IL .method private hidebysig static void Main() cil managed { .entrypoint .maxstack 3 .locals init ( [0] char[] c) L_0000: ldc.i4.s 100 L_0002: newarr char L_0007: stloc.0 L_0008: ldloc.0 L_0009: ldc.i4.0 L_000a: ldc.i4.s 0x66 L_000c: stelem.i2 L_000d: ret } Program 2 in C# class Program { static void Main() { int[] i = new int[100]; i[0] = 5; } } Program 2 in IL .method private hidebysig static void Main() cil managed { .entrypoint .maxstack 3 .locals init ( [0] int32[] i) L_0000: ldc.i4.s 100 L_0002: newarr int32 L_0007: stloc.0 L_0008: ldloc.0 L_0009: ldc.i4.0 L_000a: ldc.i4.5 L_000b: stelem.i4 L_000c: ret }
Stelem benchmark. Could certain value type elements be faster than others? My previous research in C indicated that the unsigned integer data type is most efficient.

Result: Uint32 was the fastest element type to set in an array. Char was slowest of the three tests.

Benchmark results char array: 5208 ms int32 array: 4321 ms uint32 array: 4125 ms
Stsfld. What is the stsfld instruction? This stores the value on top of the evaluation stack into a static field. We demonstrate this instruction.

However: In the intermediate language, we load a constant value 3 (ldc.i4.3) onto the top of the evaluation stack.

Then: We have a stsfld with the argument being the address of _a. So the top value (3) is stored into the location pointed to by _a.

C# program that has stsfld instruction using System; class Program { static int _a; static void Main() { Program._a = 3; if (Program._a == 4) { Console.WriteLine(true); } } } IL .method private hidebysig static void Main() cil managed { .entrypoint // Code size 21 (0x15) .maxstack 8 IL_0000: ldc.i4.3 IL_0001: stsfld int32 Program::_a IL_0006: ldsfld int32 Program::_a IL_000b: ldc.i4.4 IL_000c: bne.un.s IL_0014 IL_000e: ldc.i4.1 IL_000f: call void [mscorlib]System.Console::WriteLine(bool) IL_0014: ret } // end of method Program::Main
Switch. The switch instruction is implemented with a jump table. The switch keyword—in the C# language—is typically compiled to a switch instruction.

IL: In the intermediate language, we see the instruction switch (IL_0020, IL_0027, IL_002e).

Note: Those three identifiers point to lines, which are underlined. So the switch jumps to locations.

C# program that uses switch statement using System; class Program { static void Main() { int i = int.Parse("1"); switch (i) { case 0: Console.WriteLine(0); break; case 1: Console.WriteLine(1); break; case 2: Console.WriteLine(2); break; } } } Intermediate language .method private hidebysig static void Main() cil managed { .entrypoint // Code size 53 (0x35) .maxstack 1 .locals init ([0] int32 i, [1] int32 CS$0$0000) IL_0000: ldstr "1" IL_0005: call int32 [mscorlib]System.Int32::Parse(string) IL_000a: stloc.0 IL_000b: ldloc.0 IL_000c: stloc.1 IL_000d: ldloc.1 IL_000e: switch ( IL_0020, IL_0027, IL_002e) IL_001f: ret IL_0020: ldc.i4.0 IL_0021: call void [mscorlib]System.Console::WriteLine(int32) IL_0026: ret IL_0027: ldc.i4.1 IL_0028: call void [mscorlib]System.Console::WriteLine(int32) IL_002d: ret IL_002e: ldc.i4.2 IL_002f: call void [mscorlib]System.Console::WriteLine(int32) IL_0034: ret } // end of method Program::Main
A summary. The intermediate language is not needed every day. It usually goes without notice. But it opens up a new way of understanding what C# code does.
© TheDeveloperBlog.com
The Dev Codes

Related Links:


Related Links

Adjectives Ado Ai Android Angular Antonyms Apache Articles Asp Autocad Automata Aws Azure Basic Binary Bitcoin Blockchain C Cassandra Change Coa Computer Control Cpp Create Creating C-Sharp Cyber Daa Data Dbms Deletion Devops Difference Discrete Es6 Ethical Examples Features Firebase Flutter Fs Git Go Hbase History Hive Hiveql How Html Idioms Insertion Installing Ios Java Joomla Js Kafka Kali Laravel Logical Machine Matlab Matrix Mongodb Mysql One Opencv Oracle Ordering Os Pandas Php Pig Pl Postgresql Powershell Prepositions Program Python React Ruby Scala Selecting Selenium Sentence Seo Sharepoint Software Spellings Spotting Spring Sql Sqlite Sqoop Svn Swift Synonyms Talend Testng Types Uml Unity Vbnet Verbal Webdriver What Wpf