Intro

Let's speak about code optimization. It is a real fun making the code run faster. Just imagine: you take a cup of tea, sit down in front of a code block and speed it up.

How many things in this world can be more satisfying?

Recently I spent a good bit of time optimizing our customer's code. In that code we found a simple loop:

xFilters.ToList().ForEach(n => { XmlNodeList nodesToDelete = xml.SelectNodes(BuildXPath(n.EntityName)); foreach (XmlNode node in nodesToDelete) node.ParentNode.RemoveChild(node); });

My colleague told me something like:

Let's change .ForEach() to regular foreach. Not a big optimization but foreach should be faster.

Surprisingly the method began to work slower after our "optimization"!

So after a couple minutes of thinking I've built a little sample console application to test this behavior in depth.

Stop! Show me the code!

First of all lets create a simple test class to create objects that we will store in our collections.

public class User { public string Name { get; set; } public Tuple<int, string> Info { get; set; } public User(string name, int age, string hobby) { this.Name = name; this.Info = new Tuple<int, string>(age, hobby); } }

static double ArrayForeach(User[] array) { int sum = 0; var watch = Stopwatch.StartNew(); foreach (var item in array) { sum += item.Info.Item1; } watch.Stop(); return watch.Elapsed.TotalMilliseconds; } static double ArrayFor(User[] array) { int sum = 0; var watch = Stopwatch.StartNew(); int arrLength = array.Length; for (int i = 0; i < arrLength; i++) { sum += array[i].Info.Item1; } watch.Stop(); return watch.Elapsed.TotalMilliseconds; } static double ArrayForEach(User[] array) { int sum = 0; var watch = Stopwatch.StartNew(); Array.ForEach(array, i => { sum += i.Info.Item1; }); watch.Stop(); return watch.Elapsed.TotalMilliseconds; } static double ListForeach(List<User> list) { int sum = 0; var watch = Stopwatch.StartNew(); foreach (var item in list) { sum += item.Info.Item1; } watch.Stop(); return watch.Elapsed.TotalMilliseconds; } static double ListForEach(List<User> list) { int sum = 0; var watch = Stopwatch.StartNew(); list.ForEach(i => { sum += i.Info.Item1; }); watch.Stop(); return watch.Elapsed.TotalMilliseconds; } static double ListFor(List<User> list) { int sum = 0; var watch = Stopwatch.StartNew(); int length = list.Count; for (int i = 0; i < length; i++) { sum += list[i].Info.Item1; } watch.Stop(); return watch.Elapsed.TotalMilliseconds; }

So we need to make effective time calculation. That's why we should call all methods in cycles, calculate average and print the result after.

int cycleCount = 1000; double sum = 0.0; for (int i = 0; i < cycleCount; i++) { sum += ArrayForeach(arr); } arrayAverageForeachTime = sum / cycleCount;

And same code for all other methods.

Result

Lets make two charts. One for array and one for list.

Array

List

Array's for became faster only on really big amount of elements. But for list we can see very interesting result.

For looping is faster than other variants as expected. But it was really surprise for me that instance .ForEach is much faster than usual foreach loop!

Ok. But why?

Let's look at IL code of foreach loop in ListForEach method.

IL_0009: callvirt instance valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<!0> class [mscorlib]System.Collections.Generic.List`1<class ConsoleApplication1.User>::GetEnumerator() IL_000e: stloc.3 .try { IL_000f: br.s IL_0027 IL_0011: ldloca.s CS$5$0000 IL_0013: call instance !0 valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<class ConsoleApplication1.User>::get_Current() IL_0018: stloc.2 IL_0019: ldloc.0 IL_001a: ldloc.2 IL_001b: callvirt instance class [mscorlib]System.Tuple`2<int32,string> ConsoleApplication1.User::get_Info() IL_0020: callvirt instance !0 class [mscorlib]System.Tuple`2<int32,string>::get_Item1() IL_0025: add IL_0026: stloc.0 IL_0027: ldloca.s CS$5$0000 IL_0029: call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<class ConsoleApplication1.User>::MoveNext() IL_002e: brtrue.s IL_0011 IL_0030: leave.s IL_0040 } // end .try finally { IL_0032: ldloca.s CS$5$0000 IL_0034: constrained. valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<class ConsoleApplication1.User> IL_003a: callvirt instance void [mscorlib]System.IDisposable::Dispose() IL_003f: endfinally } // end handler

As we can see on each iteration we have 2 method calls: Enumerator<class ConsoleApplication1.User>::get_Current( ) and Enumerator<class ConsoleApplication1.User>::MoveNext()

Now lets look how list's ForEach work:

.method public hidebysig instance void ForEach(class System.Action`1<!T> action) cil managed { .custom instance void __DynamicallyInvokableAttribute::.ctor() = ( 01 00 00 00 ) // Code size 91 (0x5b) .maxstack 3 .locals init (int32 V_0, int32 V_1) IL_0000: ldarg.1 IL_0001: brtrue.s IL_0009 IL_0003: ldc.i4.8 IL_0004: call void System.ThrowHelper::ThrowArgumentNullException(valuetype System.ExceptionArgument) IL_0009: ldarg.0 IL_000a: ldfld int32 class System.Collections.Generic.List`1<!T>::_version IL_000f: stloc.0 IL_0010: ldc.i4.0 IL_0011: stloc.1 IL_0012: br.s IL_003a IL_0014: ldloc.0 IL_0015: ldarg.0 IL_0016: ldfld int32 class System.Collections.Generic.List`1<!T>::_version IL_001b: beq.s IL_0024 IL_001d: call bool System.Runtime.Versioning.BinaryCompatibility::get_TargetsAtLeast_Desktop_V4_5() IL_0022: brtrue.s IL_0043 IL_0024: ldarg.1 IL_0025: ldarg.0 IL_0026: ldfld !0[] class System.Collections.Generic.List`1<!T>::_items IL_002b: ldloc.1 IL_002c: ldelem !T IL_0031: callvirt instance void class System.Action`1<!T>::Invoke(!0) IL_0036: ldloc.1 IL_0037: ldc.i4.1 IL_0038: add IL_0039: stloc.1 IL_003a: ldloc.1 IL_003b: ldarg.0 IL_003c: ldfld int32 class System.Collections.Generic.List`1<!T>::_size IL_0041: blt.s IL_0014 IL_0043: ldloc.0 IL_0044: ldarg.0 IL_0045: ldfld int32 class System.Collections.Generic.List`1<!T>::_version IL_004a: beq.s IL_005a IL_004c: call bool System.Runtime.Versioning.BinaryCompatibility::get_TargetsAtLeast_Desktop_V4_5() IL_0051: brfalse.s IL_005a IL_0053: ldc.i4.s 32 IL_0055: call void System.ThrowHelper::ThrowInvalidOperationException(valuetype System.ExceptionResource) IL_005a: ret } // end of method List`1::ForEach

Also decompiled code of ForEach (used Jetbrains DotPeek 2016.2.2):

public void ForEach(Action<T> action) { if( action == null) { ThrowHelper.ThrowArgumentNullException(ExceptionArgument.match); } Contract.EndContractBlock(); int version = _version; for(int i = 0 ; i < _size; i++) { if (version != _version && BinaryCompatibility.TargetsAtLeast_Desktop_V4_5) { break; } action(_items[i]); } if (version != _version && BinaryCompatibility.TargetsAtLeast_Desktop_V4_5) ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumFailedVersion); }

As we can see, there is only one method call: instance void class System.Action1<!T>::Invoke(!0) or action(_items[i]); from decompiled code. It is just regular for loop with one call of our lambda method.

P.S

Micro-optimization is really good way to understand how your code and platform work. Performance is important, and you should always test your apps and looking for the ways to improve it. But keep in mind that in most cases code readability and maintainability is more important then 50ms speed improvements. Don’t sacrifice application maintainability for performance when you don’t actually need it. And if you see that your colleague doesn't use LINQ or foreach because they are a little bit slower, and writes a big piece of code instead of them, please stop them.