How the .NET Runtime loads a Type

It is something we take for granted every time we run a .NET program, but it turns out that loading a Type or class is a fairly complex process.

So how does the .NET Runtime (CLR) actually load a Type?

If you want the tl;dr: it’s done carefully, cautiously and step-by-step.

Ensuring Type Safety

One of the key requirements of a ‘Managed Runtime’ is providing Type Safety, but what does that actually mean? From the MSDN page on Type Safety and Security:

Type-safe code accesses only the memory locations it is authorized to access. (For this discussion, type safety specifically refers to memory type safety and should not be confused with type safety in a broader respect.) For example, type-safe code cannot read values from another object’s private fields. It accesses types only in well-defined, allowable ways.

So in effect, the CLR has to ensure your Types/Classes are well-behaved and follow the rules.

Compiler prevents you from creating an ‘abstract’ class

But let’s look at a more concrete example, using the C# code below:

    public abstract class AbstractClass
    {
        public AbstractClass() { }
    }

    public class NormalClass : AbstractClass
    {
        public NormalClass() { }
    }

    public static void Main(string[] args)
    {
        var test = new AbstractClass();
    }

The compiler quite rightly refuses to compile this and gives the following error, because you can’t create an instance of an abstract class; you can only inherit from it.

error CS0144: Cannot create an instance of the abstract class or interface 'ConsoleApplication.AbstractClass'

So that’s all well and good, but the CLR can’t rely on all code being created via a well-behaved compiler, or in fact via a compiler at all. So it has to check for and prevent any attempt to create an instance of an abstract class.

Writing IL code by hand

One way to circumvent the compiler is to write IL code by hand using the IL Assembler tool (ILAsm), which does almost no checks on the validity of the IL you give it.

For instance the IL below is the equivalent of writing var test = new AbstractClass(); (if the C# compiler would let us):

    .method public hidebysig static void Main(string[] args) cil managed
    {
        .entrypoint
        .maxstack 1
        .locals init (
            [0] class ConsoleApplication.NormalClass class2)

        // System.InvalidOperationException: Instances of abstract classes cannot be created.
        newobj instance void ConsoleApplication.AbstractClass::.ctor()
        stloc.0
        ldloc.0
        callvirt instance class [mscorlib]System.Type [mscorlib]System.Object::GetType()
        callvirt instance string [mscorlib]System.Reflection.MemberInfo::get_Name()
        call void [mscorlib]Internal.Console::WriteLine(string)
        ret
    }

Fortunately the CLR has got this covered and will throw an InvalidOperationException when you execute the code. This is due to a check in the runtime that is hit when the JIT compiles the newobj IL instruction.

Creating Types at run-time

One other way that you can attempt to create an abstract class is at run-time, using reflection (thanks to this blog post for giving me some tips on other ways of creating Types).

This is shown in the code below:

    var abstractType = Type.GetType("ConsoleApplication.AbstractClass");
    Console.WriteLine(abstractType.FullName);

    // System.MissingMethodException: Cannot create an abstract class.
    var abstractInstance = Activator.CreateInstance(abstractType);

The compiler is completely happy with this: it doesn’t do anything to prevent or warn you, and nor should it. However, when you run the code it throws an exception, strangely enough a MissingMethodException this time, but it does the job!

The call stack is below:

One final way (unless I’ve missed some out?) is to use GetUninitializedObject(..) in the FormatterServices class like so:

    using System;
    using System.Runtime.Serialization; // FormatterServices

    public static object CreateInstance(Type type)
    {
        var constructor = type.GetConstructor(new Type[0]);
        if (constructor == null && !type.IsValueType)
        {
            throw new NotSupportedException(
                "Type '" + type.FullName + "' doesn't have a parameterless constructor");
        }

        // Allocates the object without running any constructor
        var emptyInstance = FormatterServices.GetUninitializedObject(type);

        if (constructor == null)
            return null;

        // Invoking a constructor on an existing instance returns null, hence the fallback
        return constructor.Invoke(emptyInstance, new object[0]) ?? emptyInstance;
    }

    var abstractType = Type.GetType("ConsoleApplication.AbstractClass");
    Console.WriteLine(abstractType.FullName);

    // System.MemberAccessException: Cannot create an abstract class.
    var abstractInstance = CreateInstance(abstractType);

Again the run-time stops you from doing this, although this time it decides to throw a MemberAccessException?

This happens via the following call stack:
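One detail makes the two exception types look a little less arbitrary: MissingMethodException derives from MissingMemberException, which in turn derives from MemberAccessException, so a single catch block covers both code paths. The sketch below is a minimal illustration rather than anything taken from the runtime itself. AbstractCreationDemo and ThrowsMemberAccess are made-up names, and it assumes FormatterServices is still usable on your target framework (newer .NET versions mark it obsolete, hence the pragma).

```csharp
using System;
using System.Runtime.Serialization;

public abstract class AbstractClass
{
    public AbstractClass() { }
}

// Hypothetical helper class, only here to compare the two run-time checks.
public static class AbstractCreationDemo
{
    // Returns true if 'create' fails with a MemberAccessException
    // (or a subclass of it, such as MissingMethodException).
    public static bool ThrowsMemberAccess(Func<object> create)
    {
        try
        {
            create();
            return false;
        }
        catch (MemberAccessException ex)
        {
            // Print the concrete type so you can see which one your runtime picks.
            Console.WriteLine(ex.GetType().Name + ": " + ex.Message);
            return true;
        }
    }

    public static void Main()
    {
        Type abstractType = typeof(AbstractClass);

        // Route 1: reflection via Activator
        Console.WriteLine(ThrowsMemberAccess(
            () => Activator.CreateInstance(abstractType)));

#pragma warning disable SYSLIB0050 // FormatterServices is obsolete on newer .NET
        // Route 2: allocate without running a constructor
        Console.WriteLine(ThrowsMemberAccess(
            () => FormatterServices.GetUninitializedObject(abstractType)));
#pragma warning restore SYSLIB0050
    }
}
```

Catching the base type is a deliberate choice here: it keeps the example independent of exactly which exception a given runtime version throws for each route.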

Further Type-Safety Checks

These checks are just one example of what the runtime has to validate when creating types; there are many more things it has to deal with. For instance you can’t:

Loading Types ‘step-by-step’

So we’ve seen that the CLR has to do multiple checks when it’s loading types, but why does it have to load them ‘step-by-step’?

Well, in a nutshell, it’s because of circular references and recursion, particularly when dealing with generic types. Take the code below from section ‘2.1 Load Levels’ in Type Loader Design (BotR):

    class A<T> : C<B<T>> { }
    class B<T> : C<A<T>> { }
    class C<T> { }

These are valid types, and class A depends on class B and vice versa. So we can’t load A until we know that B is valid, but we can’t load B until we’re sure that A is valid, a classic deadlock!!
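A quick way to convince yourself that these types really are valid is to ask the runtime to load one of them and then walk the base-type chain with reflection. This is only a small sketch: CircularGenericsDemo and RoundTrip are names invented for the example, and int is an arbitrary choice of type argument.

```csharp
using System;

class A<T> : C<B<T>> { }
class B<T> : C<A<T>> { }
class C<T> { }

// Hypothetical demo class, not part of the BotR example.
public static class CircularGenericsDemo
{
    // Follows the cycle once: A<X> -> C<B<X>> -> B<X> -> C<A<X>> -> A<X>.
    public static Type RoundTrip(Type start)
    {
        // The base type of A<X> is C<B<X>>; its type argument is B<X>.
        Type b = start.BaseType.GetGenericArguments()[0];
        // The base type of B<X> is C<A<X>>; its type argument is A<X> again.
        return b.BaseType.GetGenericArguments()[0];
    }

    public static void Main()
    {
        Type a = typeof(A<int>);

        // Shows the runtime's name for the base type, C<B<int>>
        Console.WriteLine(a.BaseType);

        // After one lap round the cycle we are back at the same Type object
        Console.WriteLine(RoundTrip(a) == a);
    }
}
```

The fact that this runs at all shows the loader has found some way to break the cycle between A and B.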

How does the run-time get round this? Well, from the same BotR page:

The loader initially creates the structure(s) representing the type and initializes them with data that can be obtained without loading other types. When this “no-dependencies” work is done, the structure(s) can be referred from other places, usually by sticking pointers to them into another structures. After that the loader progresses in incremental steps and fills the structure(s) with more and more information until it finally arrives at a fully loaded type. In the above example, the base types of A and B will be approximated by something that does not include the other type, and substituted by the real thing later.

(there is also some more info here)

So it loads types in stages, step-by-step, ensuring each dependent type has reached the same stage before continuing. These ‘Class Load’ stages are shown in the image below and explained in detail in this very helpful source-code comment (Yay for Open-Sourcing the CoreCLR!!)

The different levels are handled in the ClassLoader::DoIncrementalLoad(..) method, which contains the switch statement that deals with them all in turn.
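To make the idea of staged loading a bit more tangible, here is a toy model of it. It is emphatically not how the CLR is implemented: the three LoadLevel values are a simplified, hypothetical subset, ToyLoader is an invented name, and "dependencies" are just strings. What it does show is the core trick, which is that every reachable type first gets a placeholder entry, and a type only moves up a level once everything it refers to has reached the level below, so a cycle between A and B never requires either of them to be fully loaded first.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified levels; the real runtime has more of them.
public enum LoadLevel { Begin, ApproxParents, Loaded }

// Toy model only: not the CLR's actual algorithm.
public static class ToyLoader
{
    public static List<string> Load(Dictionary<string, string[]> dependencies, string root)
    {
        var log = new List<string>();
        var levels = new Dictionary<string, LoadLevel>();
        var order = new List<string>();

        // Step 1: create a placeholder entry for every reachable type.
        // Nothing else is loaded here, so cycles simply stop at an existing entry.
        var pending = new Stack<string>();
        pending.Push(root);
        while (pending.Count > 0)
        {
            string name = pending.Pop();
            if (levels.ContainsKey(name)) continue;
            levels[name] = LoadLevel.Begin;
            order.Add(name);
            log.Add(name + " -> " + LoadLevel.Begin);
            foreach (string dep in dependencies[name]) pending.Push(dep);
        }

        // Step 2: raise all entries one level at a time.
        LoadLevel previous = LoadLevel.Begin;
        foreach (LoadLevel next in new[] { LoadLevel.ApproxParents, LoadLevel.Loaded })
        {
            foreach (string name in order)
            {
                // Invariant: dependencies only need to be at the previous level,
                // never fully loaded, which is what breaks the deadlock.
                foreach (string dep in dependencies[name])
                {
                    if (levels[dep] < previous)
                        throw new InvalidOperationException(dep + " is lagging behind");
                }
                levels[name] = next;
                log.Add(name + " -> " + next);
            }
            previous = next;
        }
        return log;
    }

    public static void Main()
    {
        // Mirrors the A/B/C example: A and B refer to each other and to C.
        var deps = new Dictionary<string, string[]>
        {
            { "A", new[] { "C", "B" } },
            { "B", new[] { "C", "A" } },
            { "C", new string[0] },
        };

        foreach (string line in Load(deps, "A")) Console.WriteLine(line);
    }
}
```

Running it prints one line per type per level, with all three types reaching each level before any of them moves on to the next.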

However, this is part of a bigger process, which controls loading an entire file, also known as a Module or Assembly in .NET terminology. The entire process for that is handled by another dispatch loop (switch statement) that works with the FileLoadLevel enum (definition). So in reality the whole process for loading an Assembly looks like this (the loading of one or more Types happens as sub-steps once the Module has reached the FILE_LOADED stage):

We can see this in action if we build a Debug version of the CoreCLR and enable the relevant configuration knobs. For a simple ‘Hello World’ program we get the log output shown below, where LOADER: messages correspond to FILE_LOAD_XXX stages and PHASEDLOAD: messages indicate which CLASS_LOAD_XXX step we are on.

You can also see some of the other events that happen at the same time. These include the creation of static variables (STATICS:), thread-statics (THREAD STATICS:) and PreStubWorker, which indicates methods being prepared for the JITter.

    -------------------------------------------------------------------------------------------------------
    This is NOT the full output, it's only the parts that reference 'Program.exe' and its modules/classes
    -------------------------------------------------------------------------------------------------------

    PEImage: Opened HMODULE C:\coreclr\bin\Product\Windows_NT.x64.Debug\Program.exe
    StoreFile: Add cached entry (000007FE65174540) with PEFile 000000000040D6E0
    Assembly C:\coreclr\bin\Product\Windows_NT.x64.Debug\Program.exe: bits=0x2
    LOADER: 439e30:***Program* >>>Load initiated, LOADED/LOADED
    LOADER: 0000000000439E30:***Program* loading at level BEGIN
    LOADER: 0000000000439E30:***Program* loading at level FIND_NATIVE_IMAGE
    LOADER: 0000000000439E30:***Program* loading at level VERIFY_NATIVE_IMAGE_DEPENDENCIES
    LOADER: 0000000000439E30:***Program* loading at level ALLOCATE
    STATICS: Allocating statics for module Program
    Loaded pModule: "C:\coreclr\bin\Product\Windows_NT.x64.Debug\Program.exe".
    Module Program: bits=0x2
    STATICS: Allocating 72 bytes for precomputed statics in module C:\coreclr\bin\Product\Windows_NT.x64.Debug\Program.exe in LoaderAllocator 000000000043AA18
    StoreFile (StoreAssembly): Add cached entry (000007FE65174F28) with PEFile 000000000040D6E0
    Completed Load Level ALLOCATE for DomainFile 000000000040D8C0 in AD 1 - success = 1
    LOADER: 0000000000439E30:***Program* loading at level ADD_DEPENDENCIES
    Completed Load Level ADD_DEPENDENCIES for DomainFile 000000000040D8C0 in AD 1 - success = 1
    LOADER: 0000000000439E30:***Program* loading at level PRE_LOADLIBRARY
    LOADER: 0000000000439E30:***Program* loading at level LOADLIBRARY
    LOADER: 0000000000439E30:***Program* loading at level POST_LOADLIBRARY
    LOADER: 0000000000439E30:***Program* loading at level EAGER_FIXUPS
    LOADER: 0000000000439E30:***Program* loading at level VTABLE FIXUPS
    LOADER: 0000000000439E30:***Program* loading at level DELIVER_EVENTS
    DRCT::IsReady - wait(0x100)=258, GetLastError() = 42424
    DRCT::IsReady - wait(0x100)=258, GetLastError() = 42424
    D::LA: Load Assembly Asy:0x000000000040D8C0 AD:0x0000000000439E30 which:C:\coreclr\bin\Product\Windows_NT.x64.Debug\Program.exe
    Completed Load Level DELIVER_EVENTS for DomainFile 000000000040D8C0 in AD 1 - success = 1
    LOADER: 0000000000439E30:***Program* loading at level LOADED
    Completed Load Level LOADED for DomainFile 000000000040D8C0 in AD 1 - success = 1
    LOADER: 439e30:***Program* <<<Load completed, LOADED
    In PreStubWorker for System.Environment::SetCommandLineArgs
    Prestubworker: method 000007FEC2AE1160M
    DoRunClassInit: Request to init 000007FEC3BACCF8T in appdomain 0000000000439E30
    RunClassInit: Calling class contructor for type 000007FEC3BACCF8T
    In PreStubWorker for System.Environment::.cctor
    Prestubworker: method 000007FEC2AE1B10M
    DoRunClassInit: Request to init 000007FEC3BACCF8T in appdomain 0000000000439E30
    DoRunClassInit: returning SUCCESS for init 000007FEC3BACCF8T in appdomain 0000000000439E30
    RunClassInit: Returned Successfully from class contructor for type 000007FEC3BACCF8T
    DoRunClassInit: returning SUCCESS for init 000007FEC3BACCF8T in appdomain 0000000000439E30
    PHASEDLOAD: LoadTypeHandleForTypeKey for type ConsoleApplication.Program to level LOADED
    PHASEDLOAD: table contains:
    LoadTypeHandle: Loading Class from Module 000007FE65174718 token 2000002
    PHASEDLOAD: Creating loading entry for type ConsoleApplication.Program
    PHASEDLOAD: About to do incremental load of type ConsoleApplication.Program (0000000000000000) from level BEGIN
    Looking up System.Object by name.
    Loading class "ConsoleApplication.Program" from module "C:\coreclr\bin\Product\Windows_NT.x64.Debug\Program.exe" in domain 0x0000000000439E30
    SD: MT::MethodIterator created for System.Object.
    EEC::IMD: pNewMD:0x65175178 for tok:0x6000001 (ConsoleApplication.Program::.cctor)
    EEC::IMD: pNewMD:0x651751a8 for tok:0x6000002 (ConsoleApplication.Program::.ctor)
    EEC::IMD: pNewMD:0x651751d8 for tok:0x6000003 (ConsoleApplication.Program::Main)
    STATICS: Placing statics for ConsoleApplication.Program
    STATICS: Field placed at non GC offset 0x38
    Offset of staticCounter1: 56
    STATICS: Field placed at non GC offset 0x40
    Offset of staticCounter2: 64
    STATICS: Static field bytes needed (0 is normal for non dynamic case)0
    STATICS: Placing ThreadStatics for ConsoleApplication.Program
    THREAD STATICS: Field placed at non GC offset 0x20
    Offset of threadStaticCounter1: 32
    THREAD STATICS: Field placed at non GC offset 0x28
    Offset of threadStaticCounter2: 40
    STATICS: ThreadStatic field bytes needed (0 is normal for non dynamic case)0
    CLASSLOADER: AppDomainAgileAttribute for ConsoleApplication.Program is 0
    MethodTableBuilder: finished method table for module 000007FE65174718 token 2000002 = 000007FE65175230T
    PHASEDLOAD: About to do incremental load of type ConsoleApplication.Program (000007FE65175230) from level APPROXPARENTS
    Notify: 000007FE65175230 ConsoleApplication.Program
    Successfully loaded class ConsoleApplication.Program
    PHASEDLOAD: Completed full dependency load of type (000007FE65175230)+ConsoleApplication.Program
    PHASEDLOAD: Completed full dependency load of type (000007FE65175230)+ConsoleApplication.Program
    LOADER: 439e30:***Program* >>>Load initiated, ACTIVE/ACTIVE
    LOADER: 0000000000439E30:***Program* loading at level VERIFY_EXECUTION
    LOADER: 0000000000439E30:***Program* loading at level ACTIVE
    Completed Load Level ACTIVE for DomainFile 000000000040D8C0 in AD 1 - success = 1
    LOADER: 439e30:***Program* <<<Load completed, ACTIVE
    In PreStubWorker for ConsoleApplication.Program::Main
    Prestubworker: method 000007FE651751D8M
    In PreStubWorker, calling MakeJitWorker
    CallCompileMethodWithSEHWrapper called...
    D::gV: cVars=0, extendOthers=1
    Looking up System.Console by name.
    SD: MT::MethodIterator created for System.Console.
    JitComplete completed successfully
    Got through CallCompile MethodWithSEHWrapper
    MethodDesc::MakeJitWorker finished. Stub is 000007fe`652d0480
    DoRunClassInit: Request to init 000007FE65175230T in appdomain 0000000000439E30
    RunClassInit: Calling class contructor for type 000007FE65175230T
    In PreStubWorker for ConsoleApplication.Program::.cctor
    Prestubworker: method 000007FE65175178M
    In PreStubWorker, calling MakeJitWorker
    CallCompileMethodWithSEHWrapper called...
    D::gV: cVars=0, extendOthers=1
    JitComplete completed successfully
    Got through CallCompile MethodWithSEHWrapper
    MethodDesc::MakeJitWorker finished. Stub is 000007fe`652d04c0

So there you have it, the CLR loads your classes/Types carefully, cautiously and step-by-step!!


As always, here’s some more links if you’d like to find out further information: