This is a benchmark of method call dispatch under .NET Native and the JIT-ed runtime.

This benchmark was created to measure reflection speed in .NET Native and the JIT-ed runtime, and to compare method call speed across different builds. It is common knowledge that reflection is much slower than direct calls, but this benchmark was created to measure the exact numbers. It was compiled with the latest .NET Core runtime and the latest .NET Native toolchain.

A few words about the tested scenarios.

The baseline scenario is a simple direct method call:

/// <summary>
/// Simple call scenario.
/// </summary>
public sealed class SimpleCallScenario : IScenario
{
    public string ScenarioName => "Direct call";

    public async Task<BenchmarkResult> DoBenchmark()
    {
        var task = Task.Factory.StartNew(RunBenchmark);
        return await task;
    }

    private BenchmarkResult RunBenchmark()
    {
        var callable = new CallableClass();
        var ticks1 = Environment.TickCount;
        for (var i = 0; i < Consts.RunCount * 100; i++)
        {
            callable.Run();
            callable.RunWithArgs(i, "");
        }
        var ticks2 = Environment.TickCount;
        return new BenchmarkResult { RunCount = Consts.RunCount * 100, Milliseconds = ticks2 - ticks1 };
    }
}

The next scenario is a call through an interface obtained by casting an object at runtime:

/// <summary>
/// Interface cast call scenario.
/// </summary>
public sealed class InterfaceCallScenario : IScenario
{
    public string ScenarioName => "Interface cast call";

    public async Task<BenchmarkResult> DoBenchmark()
    {
        var task = Task.Factory.StartNew(RunBenchmark);
        return await task;
    }

    private BenchmarkResult RunBenchmark()
    {
        ICallableInterface callable = new CallableClass();
        var ticks1 = Environment.TickCount;
        for (var i = 0; i < Consts.RunCount * 100; i++)
        {
            CallInterface(callable, i);
        }
        var ticks2 = Environment.TickCount;
        return new BenchmarkResult { RunCount = Consts.RunCount * 100, Milliseconds = ticks2 - ticks1 };
    }

    private void CallInterface(object callableObj, int i)
    {
        var callable = callableObj as ICallableInterface;
        callable?.Run();
        callable?.RunWithArgs(i, "");
    }
}

The next scenario is a plain Invoke call through the reflection API:

/// <summary>
/// Reflection call scenario.
/// </summary>
public sealed class ReflectionCallScenario : IScenario
{
    private static readonly TypeInfo InterfaceTypeInfo = typeof(ICallableInterface).GetTypeInfo();
    private static readonly MethodInfo RunTypeInfo = InterfaceTypeInfo.GetDeclaredMethod(nameof(ICallableInterface.Run));
    private static readonly MethodInfo RunWithArgsTypeInfo = InterfaceTypeInfo.GetDeclaredMethod(nameof(ICallableInterface.RunWithArgs));

    public string ScenarioName => "Reflection invoke call";

    public async Task<BenchmarkResult> DoBenchmark()
    {
        var task = Task.Factory.StartNew(RunBenchmark);
        return await task;
    }

    private BenchmarkResult RunBenchmark()
    {
        ICallableInterface callable = new CallableClass();
        var ticks1 = Environment.TickCount;
        for (var i = 0; i < Consts.RunCount; i++)
        {
            CallReflection(callable, i);
        }
        var ticks2 = Environment.TickCount;
        return new BenchmarkResult { RunCount = Consts.RunCount, Milliseconds = ticks2 - ticks1 };
    }

    private void CallReflection(object callableObj, int i)
    {
        RunTypeInfo.Invoke(callableObj, new object[0]);
        RunWithArgsTypeInfo.Invoke(callableObj, new object[] { i, "" });
    }
}

The next scenario is a cached reflection call. Instead of a plain Invoke, CreateDelegate is used to build a delegate once, and that cached delegate is then called:

/// <summary>
/// Reflection delegate call scenario.
/// </summary>
public sealed class ReflectionDelegateCallScenario : IScenario
{
    private static readonly TypeInfo InterfaceTypeInfo = typeof(ICallableInterface).GetTypeInfo();
    private static readonly MethodInfo RunTypeInfo = InterfaceTypeInfo.GetDeclaredMethod(nameof(ICallableInterface.Run));
    private static readonly MethodInfo RunWithArgsTypeInfo = InterfaceTypeInfo.GetDeclaredMethod(nameof(ICallableInterface.RunWithArgs));

    public string ScenarioName => "Reflection cached delegate invoke call";

    public async Task<BenchmarkResult> DoBenchmark()
    {
        var task = Task.Factory.StartNew(RunBenchmark);
        return await task;
    }

    private BenchmarkResult RunBenchmark()
    {
        ICallableInterface callable = new CallableClass();
        var ticks1 = Environment.TickCount;
        Func<int> cache1 = null;
        Func<int, string, int> cache2 = null;
        for (var i = 0; i < Consts.RunCount; i++)
        {
            CallReflection(callable, i, ref cache1, ref cache2);
        }
        var ticks2 = Environment.TickCount;
        return new BenchmarkResult { RunCount = Consts.RunCount, Milliseconds = ticks2 - ticks1 };
    }

    private void CallReflection(object callableObj, int i, ref Func<int> cache1, ref Func<int, string, int> cache2)
    {
        if (cache1 == null)
        {
            cache1 = RunTypeInfo.CreateDelegate(typeof(Func<int>), callableObj) as Func<int>;
        }
        if (cache2 == null)
        {
            cache2 = RunWithArgsTypeInfo.CreateDelegate(typeof(Func<int, string, int>), callableObj) as Func<int, string, int>;
        }
        cache1();
        cache2(i, "");
    }
}

The next scenario invokes methods using the C# dynamic keyword:

/// <summary>
/// Dynamic call scenario.
/// </summary>
public sealed class DynamicCallScenario : IScenario
{
    public string ScenarioName => "Dynamic invoke call";

    public async Task<BenchmarkResult> DoBenchmark()
    {
        var task = Task.Factory.StartNew(RunBenchmark);
        return await task;
    }

    private BenchmarkResult RunBenchmark()
    {
        ICallableInterface callable = new CallableClass();
        var ticks1 = Environment.TickCount;
        for (var i = 0; i < Consts.RunCount; i++)
        {
            CallDynamic(callable, i);
        }
        var ticks2 = Environment.TickCount;
        return new BenchmarkResult { RunCount = Consts.RunCount, Milliseconds = ticks2 - ticks1 };
    }

    private void CallDynamic(dynamic callableObj, int i)
    {
        callableObj.Run();
        callableObj.RunWithArgs(i, "");
    }
}

The next two scenarios simulate custom-generated proxies used for method dispatch. They were made to measure the cost of proxies that a tool would generate statically in order to avoid reflection. Both share a common base class:

/// <summary>
/// Dispatch call scenario base.
/// </summary>
public abstract class DispatchCallScenarioBase : IScenario
{
    public abstract string ScenarioName { get; }

    public async Task<BenchmarkResult> DoBenchmark()
    {
        var task = Task.Factory.StartNew(RunBenchmark);
        return await task;
    }

    /// <summary>
    /// Create dispatch proxy.
    /// </summary>
    /// <param name="callable">Callable.</param>
    /// <returns>Proxy.</returns>
    protected abstract IDispatcherProxy CreateProxy(ICallableInterface callable);

    private BenchmarkResult RunBenchmark()
    {
        var proxy = CreateProxy(new CallableClass());
        var ticks1 = Environment.TickCount;
        for (var i = 0; i < Consts.RunCount; i++)
        {
            CallProxy(proxy, i);
        }
        var ticks2 = Environment.TickCount;
        return new BenchmarkResult { RunCount = Consts.RunCount, Milliseconds = ticks2 - ticks1 };
    }

    private void CallProxy(IDispatcherProxy proxy, int i)
    {
        proxy.DispatchCall(nameof(ICallableInterface.Run));
        proxy.DispatchCall(nameof(ICallableInterface.RunWithArgs), i, "");
    }
}

The first proxy uses a switch statement inside:

/// <summary>
/// Callable dispatcher using switch statement.
/// </summary>
public sealed class CallableCaseDispatcher : IDispatcherProxy
{
    private readonly ICallableInterface _callable;

    /// <summary>
    /// Constructor.
    /// </summary>
    /// <param name="callable">Callable interface.</param>
    public CallableCaseDispatcher(ICallableInterface callable)
    {
        if (callable == null)
        {
            throw new ArgumentNullException(nameof(callable));
        }
        _callable = callable;
    }

    /// <summary>
    /// Dynamic dispatch call.
    /// </summary>
    /// <param name="methodName">Method name.</param>
    /// <param name="arguments">Call arguments.</param>
    /// <returns>Call result.</returns>
    public object DispatchCall(string methodName, params object[] arguments)
    {
        switch (methodName)
        {
            case nameof(ICallableInterface.Run):
                if (arguments != null && arguments.Length > 0)
                {
                    throw new ArgumentException();
                }
                return _callable.Run();
            case nameof(ICallableInterface.RunWithArgs):
                if (arguments == null || arguments.Length != 2)
                {
                    throw new ArgumentException();
                }
                return _callable.RunWithArgs((int)arguments[0], (string)arguments[1]);
            default:
                throw new NotImplementedException();
        }
    }
}

...and the second proxy type uses an internal Dictionary to choose the method at runtime:

/// <summary>
/// Callable dispatcher using dictionary.
/// </summary>
public class CallableDictionaryDispatcher : IDispatcherProxy
{
    private readonly ICallableInterface _callable;
    private readonly Dictionary<string, Func<object[], object>> _methods = new Dictionary<string, Func<object[], object>>();

    /// <summary>
    /// Constructor.
    /// </summary>
    /// <param name="callable">Callable interface.</param>
    public CallableDictionaryDispatcher(ICallableInterface callable)
    {
        if (callable == null)
        {
            throw new ArgumentNullException(nameof(callable));
        }
        _callable = callable;
        _methods[nameof(ICallableInterface.Run)] = RunDispatch;
        _methods[nameof(ICallableInterface.RunWithArgs)] = RunWithArgsDispatch;
    }

    private object RunDispatch(object[] arguments)
    {
        if (arguments != null && arguments.Length > 0)
        {
            throw new ArgumentException();
        }
        return _callable.Run();
    }

    private object RunWithArgsDispatch(object[] arguments)
    {
        if (arguments == null || arguments.Length != 2)
        {
            throw new ArgumentException();
        }
        return _callable.RunWithArgs((int)arguments[0], (string)arguments[1]);
    }

    /// <summary>
    /// Dynamic dispatch call.
    /// </summary>
    /// <param name="methodName">Method name.</param>
    /// <param name="arguments">Call arguments.</param>
    /// <returns>Call result.</returns>
    public object DispatchCall(string methodName, params object[] arguments)
    {
        Func<object[], object> method;
        if (_methods.TryGetValue(methodName, out method))
        {
            return method(arguments);
        }
        throw new NotImplementedException();
    }
}

Now it's time to show some benchmark results.

"Release" are measured builds compiled with .NET Native

"Debug" are measured builds compiled without .NET Native, JIT-ed builds

******* x64 Release *******

Scenario set "Dynamic calls"

=======================================================

Direct call: run count = 100000000, totalTime(ms) = 203, time per run(microsec) = 0.0020

Interface cast call: run count = 100000000, totalTime(ms) = 344, time per run(microsec) = 0.0034, % to baseline = 169.46%

Reflection invoke call: run count = 1000000, totalTime(ms) = 297, time per run(microsec) = 0.2970, % to baseline = 14630.54%

Reflection cached delegate invoke call: run count = 1000000, totalTime(ms) = 15, time per run(microsec) = 0.0150, % to baseline = 738.92%

Dynamic invoke call: run count = 1000000, totalTime(ms) = 6125, time per run(microsec) = 6.1250, % to baseline = 301724.14%

Switch dispatch call: run count = 1000000, totalTime(ms) = 63, time per run(microsec) = 0.0630, % to baseline = 3103.45%

Dictionary dispatch call: run count = 1000000, totalTime(ms) = 125, time per run(microsec) = 0.1250, % to baseline = 6157.64%

******* x86 Release *******

Scenario set "Dynamic calls"

=======================================================

Direct call: run count = 100000000, totalTime(ms) = 343, time per run(microsec) = 0.0034

Interface cast call: run count = 100000000, totalTime(ms) = 375, time per run(microsec) = 0.0038, % to baseline = 109.33%

Reflection invoke call: run count = 1000000, totalTime(ms) = 282, time per run(microsec) = 0.2820, % to baseline = 8221.57%

Reflection cached delegate invoke call: run count = 1000000, totalTime(ms) = 15, time per run(microsec) = 0.0150, % to baseline = 437.32%

Dynamic invoke call: run count = 1000000, totalTime(ms) = 5328, time per run(microsec) = 5.3280, % to baseline = 155335.28%

Switch dispatch call: run count = 1000000, totalTime(ms) = 47, time per run(microsec) = 0.0470, % to baseline = 1370.26%

Dictionary dispatch call: run count = 1000000, totalTime(ms) = 125, time per run(microsec) = 0.1250, % to baseline = 3644.31%

******* x64 Debug *******

Scenario set "Dynamic calls"

=======================================================

Direct call: run count = 100000000, totalTime(ms) = 828, time per run(microsec) = 0.0083

Interface cast call: run count = 100000000, totalTime(ms) = 2156, time per run(microsec) = 0.0216, % to baseline = 260.39%

Reflection invoke call: run count = 1000000, totalTime(ms) = 1234, time per run(microsec) = 1.2340, % to baseline = 14903.38%

Reflection cached delegate invoke call: run count = 1000000, totalTime(ms) = 31, time per run(microsec) = 0.0310, % to baseline = 374.40%

Dynamic invoke call: run count = 1000000, totalTime(ms) = 2782, time per run(microsec) = 2.7820, % to baseline = 33599.03%

Switch dispatch call: run count = 1000000, totalTime(ms) = 93, time per run(microsec) = 0.0930, % to baseline = 1123.19%

Dictionary dispatch call: run count = 1000000, totalTime(ms) = 250, time per run(microsec) = 0.2500, % to baseline = 3019.32%

******* x86 Debug *******

Scenario set "Dynamic calls"

=======================================================

Direct call: run count = 100000000, totalTime(ms) = 813, time per run(microsec) = 0.0081

Interface cast call: run count = 100000000, totalTime(ms) = 1407, time per run(microsec) = 0.0141, % to baseline = 173.06%

Reflection invoke call: run count = 1000000, totalTime(ms) = 1375, time per run(microsec) = 1.3750, % to baseline = 16912.67%

Reflection cached delegate invoke call: run count = 1000000, totalTime(ms) = 15, time per run(microsec) = 0.0150, % to baseline = 184.50%

Dynamic invoke call: run count = 1000000, totalTime(ms) = 2734, time per run(microsec) = 2.7340, % to baseline = 33628.54%

Switch dispatch call: run count = 1000000, totalTime(ms) = 94, time per run(microsec) = 0.0940, % to baseline = 1156.21%

Dictionary dispatch call: run count = 1000000, totalTime(ms) = 218, time per run(microsec) = 0.2180, % to baseline = 2681.43%

So, let's draw some conclusions.

A direct method call takes 0.0020 microseconds in a release build and 0.0083 microseconds in a debug (JIT-ed) build, so the .NET Native toolchain makes even simple calls about 4x faster. Casting an object to an interface is almost free in .NET Native builds (1.7x the baseline on x64 release) and somewhat more expensive in JIT-ed builds (2.6x on x64 debug).

Reflection calls take 146x longer than direct calls in the x64 release build and 82x longer in x86 release; in debug (JIT-ed) builds they take 149x and 169x longer. Comparing absolute numbers, however, a reflection call takes 0.2970 microseconds in x64 release versus 1.2340 microseconds in x64 debug. So the .NET Native toolchain provides much better reflection speed than the JIT-ed .NET Core runtime! The claim that "reflection in .NET Native performs worse than in the .NET runtime" is a myth and absolutely false: reflection in .NET Native compiled builds is indeed much faster than in JIT-ed debug builds.

Reflection invoke cached to a delegate costs 7.4x the baseline on x64 and 4.4x on x86. In other words, caching the reflected method to a delegate improves reflection speed roughly 20x, and a cached reflection invoke is quite comparable to a direct method invoke. The "cached to delegate" reflection scheme is safe to use and adds no noticeable performance impact.

Dynamic invoke (using the C# dynamic keyword) is the absolute outsider: calls through C# dynamic perform 3017x worse than the baseline on x64 release! The dynamic keyword should be avoided at all costs in .NET Native builds; it is a complete performance disaster there. In JIT-ed builds, however, dynamic performs comparably to non-cached reflection invoke: on x64 debug, a regular reflection invoke costs 149x the baseline and a dynamic invoke costs 336x.

The switch- and dictionary-based "generated proxy" patterns do gain over plain reflection invoke, but not over cached reflection invoke. On the x64 release build, switch dispatch costs 31x the baseline, dictionary dispatch 61x, plain reflection invoke 146x, and cached reflection invoke 7.4x. There is no reason to have tools generate custom proxies, since cached reflection invoke already provides much better invoke speed.

The major conclusion: cached reflection invoke is the best way to invoke class or interface methods dynamically. There is no reason to fear reflection in .NET Native compiled code. Reflection itself performs much better under .NET Native, and when reflection invokes are cached to delegates, they are almost free.
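As a takeaway, the recommended "cached to delegate" pattern can be sketched in isolation as follows. This is a minimal illustration, not part of the benchmark code; the Greeter class and its method are hypothetical stand-ins for any reflected target.

```csharp
using System;
using System.Reflection;

public class Greeter
{
    // Hypothetical target method, resolved via reflection below.
    public string Greet(string name) => "Hello, " + name;
}

public static class Program
{
    public static void Main()
    {
        var target = new Greeter();
        MethodInfo method = typeof(Greeter).GetTypeInfo().GetDeclaredMethod(nameof(Greeter.Greet));

        // Slow path: plain reflection Invoke boxes the arguments on every call.
        object slow = method.Invoke(target, new object[] { "world" });

        // Fast path: bind the MethodInfo to the instance once, then call
        // through the strongly typed delegate with no per-call boxing.
        var fast = (Func<string, string>)method.CreateDelegate(typeof(Func<string, string>), target);
        string result = fast("world");

        Console.WriteLine(slow);   // both paths return the same value,
        Console.WriteLine(result); // but the delegate call is ~20x cheaper per the numbers above
    }
}
```

The one-time CreateDelegate cost is amortized over all subsequent calls, which is exactly what the ReflectionDelegateCallScenario above measures.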