Introduction

During the last few years, I've been asked many times why I bother exercising my x86 and x64 assembly language skills, and especially why I find assembly language important enough to teach at courses, conferences, and one-off sessions. After all, .NET developers are light years away from the actual assembly code generated for their applications, and surely the need to write assembly code by hand never arises.

I agree entirely with the sentiment that you won't often have to write assembly code by hand, unless you are working on a very low-level optimization; also, there would be no way to invoke your assembly code directly from a .NET program. However, I believe that all .NET developers should be able to read assembly code, mostly for debugging purposes but also for profiling and performance optimization.

In this article, I will show you some examples of where understanding assembly code and general stack structure -- details usually shielded from .NET developers -- will help debug otherwise intractable problems, without any "advanced" tools and even without Visual Studio. However, I will have to make some assumptions. This article presupposes basic familiarity with x86 assembly language, stack structure, calling conventions, and also some understanding of WinDbg and SOS commands. There are some excellent resources on the web that you can use to catch up on these topics.

Analyze a Corrupted or Incomplete Call Stack

It doesn't happen often, but even managed applications experience a stack corruption from time to time. Here are some possible causes for a stack corruption:

Stack overflow -- because of an infinite recursion or a large repeated stack allocation

P/Invoke stack imbalance -- mismatch of the managed and unmanaged function signatures

Random memory corruption -- usually caused by an unmanaged component in the process

When a stack corruption occurs, it's often very difficult to determine the culprit because... the stack is corrupted! Any trace of what the application was doing at the time of the corruption might have been overwritten with garbage on the stack. In fact, even debugger commands -- such as !CLRStack -- might not work properly. What can you do when a stack corruption occurs? Naturally, the only thing remaining is to walk the stack manually.

First, let's assume that the stack pointer ( ESP ) has not been corrupted (whereas the EBP register may have been corrupted). In that case, we know where the stack begins, and can start scanning backwards for execution residue. Namely, most frames on the stack preserve the EBP register, making it possible to retrace execution by finding a pair {EBP, return address} and following the linked list of frames starting from EBP . Below is an analysis that follows these steps to reconstruct the stack:

    0:000> !CLRStack
    OS Thread Id: 0x3318 (0)
    Child SP IP       Call Site
    00233000 00450818 FileExplorer.MainForm.RecursivelyFillTreeview(System.Windows.Forms.TreeNode, System.String)

There's just one frame on the stack, and even though it looks valid, clearly the stack did not begin at that method and we are missing more frames. It's time to try reconstructing the stack manually from ESP using the dds command, which dumps memory and tries to resolve symbols. Unfortunately, because the code is managed, we will not have any valid symbols on the stack without help from an SOS command, such as !U .

    0:000> dds esp
    00233000  00000000
    00233004  00000000
    00233008  00000000
    0023300c  00000000
    00233010  00000000
    00233014  00000000
    00233018  00000000
    0023301c  00000000
    00233020  0220e1ec
    00233024  021e364c
    00233028  00000000
    0023302c  00000000
    00233030  021e364c
    00233034  00233058
    00233038  0023307c
    0023303c  00450826
    00233040  021e513c
    00233044  00000000
    00233048  00000000
    0023304c  00000000
    00233050  00000000
    00233054  00000000
    00233058  00000000
    0023305c  00000000
    00233060  00000000
    00233064  0220e1ec
    00233068  021e364c
    0023306c  00000000
    00233070  00000000
    00233074  021e364c
    00233078  0023309c
    0023307c  002330c0

The words at 00233038 and 0023303c look like an {EBP, return address} pair. Why am I saying this? Because the first value (0023307c) is sufficiently close to the value of ESP, which makes me confident that it points to the stack -- as EBP should -- and the second value (00450826) is sufficiently far away from the stack -- indeed, it should be an executable code address. To verify that it's a code address, let's use the !U command:

    0:000> !u 00450826
    Normal JIT generated code
    FileExplorer.MainForm.RecursivelyFillTreeview(System.Windows.Forms.TreeNode, System.String)
    Begin 004507d0, size f6
    ...snipped...

Indeed, this looks like a valid method, and we can continue guessing. If our guess for EBP was right, it should point to another saved EBP , which should be followed by another return address, enabling us to retrace the stack in full:

    0:000> dds 0023307c L2
    0023307c  002330c0
    00233080  00450826

Sure enough, the first value again looks like a valid saved EBP , and the second value is the exact same address as earlier, making it seem like a recursive function gone wild. We can repeat this procedure until we reach the top of the stack to obtain the entire call stack, which in this case would span hundreds of screens.
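To make the manual procedure concrete, here is a minimal Python sketch of the same two steps: scan the dumped words for a plausible {saved EBP, return address} pair, then follow the saved EBP links. The words are copied from the dds output above. The 1 MB STACK_WINDOW threshold and the helper names are my own assumptions for illustration, not something WinDbg or SOS provides, and every candidate still has to be confirmed with !U, because a heap pointer would pass the same test.

```python
ESP = 0x00233000
STACK_WINDOW = 0x100000  # assumption: values within 1 MB above ESP count as stack addresses

# Words printed by `dds esp`, lowest address first.
WORDS = [
    0x00000000, 0x00000000, 0x00000000, 0x00000000,
    0x00000000, 0x00000000, 0x00000000, 0x00000000,
    0x0220E1EC, 0x021E364C, 0x00000000, 0x00000000,
    0x021E364C, 0x00233058, 0x0023307C, 0x00450826,
    0x021E513C, 0x00000000, 0x00000000, 0x00000000,
    0x00000000, 0x00000000, 0x00000000, 0x00000000,
    0x00000000, 0x0220E1EC, 0x021E364C, 0x00000000,
    0x00000000, 0x021E364C, 0x0023309C, 0x002330C0,
]

# Address -> value map; the extra word comes from the `dds 0023307c L2` dump.
MEMORY = {ESP + 4 * i: word for i, word in enumerate(WORDS)}
MEMORY[0x00233080] = 0x00450826


def looks_like_stack(value):
    """Heuristic only: is the value close enough to ESP to be a stack address?"""
    return ESP <= value < ESP + STACK_WINDOW


def candidate_pairs(memory):
    """Yield slots holding a stack-like word followed by a non-stack, non-zero word."""
    for slot in sorted(memory):
        saved_ebp, ret = memory[slot], memory.get(slot + 4, 0)
        # A saved EBP points above the slot that stores it.
        if looks_like_stack(saved_ebp) and saved_ebp > slot and ret and not looks_like_stack(ret):
            yield slot


def walk_frames(memory, frame, limit=1000):
    """Follow saved-EBP links; return [(frame address, return address), ...]."""
    frames = []
    while frame in memory and frame + 4 in memory and len(frames) < limit:
        frames.append((frame, memory[frame + 4]))
        next_frame = memory[frame]
        if next_frame <= frame:  # the chain must move toward higher addresses
            break
        frame = next_frame
    return frames


first = next(candidate_pairs(MEMORY))
for frame, ret in walk_frames(MEMORY, first):
    print(f"frame {frame:08x} returns to {ret:08x}")
```

The walk stops as soon as a link leaves the dumped memory, which mirrors what happens in the debugger: each further frame needs another dds on the next saved EBP.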

Another variation of stack corruptions worth mentioning is the situation where the ESP register is corrupt as well, and we can't trust it to point to the actual stack. This is less frequent in simple stack overflow scenarios, but might happen due to a buffer overflow, a random memory corruption, or a wild stack imbalance. In that case, we have to obtain the top of the stack by other means. Fortunately, every Windows thread has a data structure called Thread Environment Block (TEB) which contains the range of its stack, and the !teb debugger command can dump the current thread's TEB conveniently. Armed with this information, we can start walking the stack looking for {EBP, return address} pairs.

    0:000> dt ntdll!_NT_TIB
       +0x000 ExceptionList        : Ptr32 _EXCEPTION_REGISTRATION_RECORD
       +0x004 StackBase            : Ptr32 Void
       +0x008 StackLimit           : Ptr32 Void
       +0x00c SubSystemTib         : Ptr32 Void
       +0x010 FiberData            : Ptr32 Void
       +0x010 Version              : Uint4B
       +0x014 ArbitraryUserPointer : Ptr32 Void
       +0x018 Self                 : Ptr32 _NT_TIB
    0:000> !teb
    ...snipped...
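As a rough illustration of how the stack range narrows the search, the Python sketch below reads StackBase and StackLimit at the offsets shown in the dt output (+0x004 and +0x008) and uses them to decide whether a value can be a saved EBP. The TIB contents here are made-up values, since the !teb output above is snipped, and the function names are hypothetical.

```python
import struct

# Hypothetical raw bytes for the first three NT_TIB fields of a 32-bit thread:
# ExceptionList, StackBase, StackLimit. The values are invented for illustration.
tib_bytes = struct.pack("<III", 0x0037F4A0, 0x00380000, 0x00232000)


def stack_range(tib):
    """Return (StackLimit, StackBase), read at offsets +0x008 and +0x004."""
    (stack_base,) = struct.unpack_from("<I", tib, 0x004)
    (stack_limit,) = struct.unpack_from("<I", tib, 0x008)
    return stack_limit, stack_base


def on_stack(address, tib):
    """True if the address lies inside the thread's stack range."""
    limit, base = stack_range(tib)
    return limit <= address < base


print(on_stack(0x0023307C, tib_bytes))  # a saved-EBP candidate
print(on_stack(0x00450826, tib_bytes))  # a code address
```

With the real range in hand, the scan starts at StackLimit instead of ESP, and any candidate saved EBP that falls outside the range can be discarded immediately.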

Correlate Crash Location to Source Code Line

Often, you are facing a crash dump with a relatively simple exception in it, and want to trace the root cause to a specific line of code. Commands such as !CLRStack are notorious for not reporting source line information accurately, and if your method has hundreds of lines, finding the line of code that crashed might be akin to finding the proverbial needle in a haystack.

In cases like these, reading a little disassembly might be just the right thing to do. With help from the SOS !U command, you will have hints in the generated disassembly pointing you to various .NET methods or CLR helpers your code is using. Isolating the offending instruction and correlating it to a specific line of code will usually be quite simple. Let's tackle an example -- we have the following exception call stack:

    0:005> !PrintException
    Exception object: 02c0fff0
    Exception type: System.NullReferenceException
    Message: Object reference not set to an instance of an object.
    InnerException: <none>
    StackTrace (generated):
        SP       IP       Function
        0530F370 00380A8A fileexplorer!FileExplorer.MainForm+<>c__DisplayClass1.<treeView1_AfterSelect>b__0(System.Object)+0x4a
        0530F3AC 67A3C958 mscorlib_ni!System.Threading.QueueUserWorkItemCallback.WaitCallback_Context(System.Object)+0x3c
        0530F3B4 67A20846 mscorlib_ni!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)+0xe6
        0530F3D8 67A3D872 mscorlib_ni!System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()+0x5a
        0530F3EC 67A3D0A7 mscorlib_ni!System.Threading.ThreadPoolWorkQueue.Dispatch()+0x13f
    StackTraceString: <none>
    HResult: 80004003

The exception occurred in a strangely named function, <>c__DisplayClass1.<treeView1_AfterSelect>b__0. If you've had some experience with ILDASM, you might know that this is the kind of name the C# compiler gives anonymous methods (or lambdas). Specifically, treeView1_AfterSelect is the method that contains the lambda we are looking at. But where inside the lambda did we crash? Source information is not available (perhaps we don't even have symbols for that frame), but we can inspect the disassembly at the faulting address:

    0:005> !u 00380A8A
    Normal JIT generated code
    FileExplorer.MainForm+<>c__DisplayClass1.<treeView1_AfterSelect>b__0(System.Object)
    Begin 00380a40, size e1
    ...snipped...
        00380a6c 33d2            xor edx,edx
        00380a6e 8955f0          mov dword ptr [ebp-10h],edx
        00380a71 33d2            xor edx,edx
        00380a73 8955dc          mov dword ptr [ebp-24h],edx
        00380a76 33d2            xor edx,edx
        00380a78 8955e0          mov dword ptr [ebp-20h],edx
        00380a7b c745ec00000000  mov dword ptr [ebp-14h],0
        00380a82 90              nop
        00380a83 90              nop
        00380a84 8b45e4          mov eax,dword ptr [ebp-1Ch]
        00380a87 8b4804          mov ecx,dword ptr [eax+4]
    >>> 00380a8a 3909            cmp dword ptr [ecx],ecx
        00380a8c e82fa8637a      call System_Windows_Forms_ni+0x15b2c0 (7a9bb2c0) (System.Windows.Forms.TreeNode.get_Name(), mdToken: 06004a49)
        00380a91 8945d8          mov dword ptr [ebp-28h],eax
        00380a94 8b4dd8          mov ecx,dword ptr [ebp-28h]
        00380a97 e8acee6767      call mscorlib_ni+0x28f948 (679ff948) (System.IO.Directory.GetFiles(System.String), mdToken: 06004245)
        00380a9c 8945d4          mov dword ptr [ebp-2Ch],eax
        00380a9f 8b45d4          mov eax,dword ptr [ebp-2Ch]
        00380aa2 8945dc          mov dword ptr [ebp-24h],eax
        00380aa5 33d2            xor edx,edx
        00380aa7 8955f0          mov dword ptr [ebp-10h],edx
        00380aaa 90              nop
    ...snipped...

Looking at the disassembled code, we are now able to conclude what exactly caused the null reference, and where we are in the function's code. Specifically, we crashed right before calling the TreeNode.get_Name() method, which is the getter for the TreeNode.Name property. The only thing that could have gone wrong immediately before the call is that the TreeNode object was null (indeed, the cmp instruction we see is there for the sole reason of making sure the receiver of the call is not null ). Furthermore, we know that the result of the TreeNode.get_Name() method call is then transferred into the ECX register and passed to the Directory.GetFiles method. This should be enough to identify the offending line of code in the source file:

    ThreadPool.QueueUserWorkItem(_ =>
    {
        foreach (string file in Directory.GetFiles(node.Name))
        {
            listBox1.Items.Add(Path.GetFileName(file));
        }
    });
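The reasoning in this section boils down to two small steps: check that the faulting IP minus the method's start address matches the +0x4a offset in the exception stack trace, and then find the first call at or after the faulting instruction. Here is a minimal Python sketch of both, using addresses from the !u listing above. The instruction strings are abbreviated by me (only the resolved method names are kept), and next_call is a hypothetical helper, not an SOS feature.

```python
# (address, instruction) pairs around the fault, abbreviated from the !u listing.
LISTING = [
    (0x00380A84, "mov eax,dword ptr [ebp-1Ch]"),
    (0x00380A87, "mov ecx,dword ptr [eax+4]"),
    (0x00380A8A, "cmp dword ptr [ecx],ecx"),
    (0x00380A8C, "call System.Windows.Forms.TreeNode.get_Name()"),
    (0x00380A91, "mov dword ptr [ebp-28h],eax"),
    (0x00380A94, "mov ecx,dword ptr [ebp-28h]"),
    (0x00380A97, "call System.IO.Directory.GetFiles(System.String)"),
]
METHOD_BEGIN = 0x00380A40  # "Begin 00380a40" in the !u header
FAULT_IP = 0x00380A8A      # IP column of the first frame in !PrintException


def next_call(listing, ip):
    """Return the first call instruction at or after the given address, or None."""
    for address, text in listing:
        if address >= ip and text.startswith("call "):
            return address, text
    return None


offset = FAULT_IP - METHOD_BEGIN
target = next_call(LISTING, FAULT_IP)
print(f"fault at method+{offset:#x}; next call: {target[1]}")
```

The offset agreeing with the stack trace is a cheap sanity check that you are looking at the right instruction before you start mapping it back to source.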

Determine Function Arguments

Another thing that might happen to you often is that you have a crash dump or a live debuggee, but are unable to retrieve function arguments from the stack. There are many commands that attempt to do so -- !CLRStack -p is the managed option, kb attempts to do the job for unmanaged frames, and the excellent SOSEX extension offers the !mk command. Nonetheless, because of the variety of x86 calling conventions, and especially because the JIT uses a custom fastcall-resembling calling convention, at times none of these commands will actually work.

For example, consider the following call stack, in which your thread is clearly waiting for a .NET monitor, in the Monitor.Enter call:

    0:000> !CLRStack
    OS Thread Id: 0x2a88 (0)
    ESP      EIP
    0037e8a8 76f2013d [GCFrame: 0037e8a8]
    0037e978 76f2013d [HelperMethodFrame_1OBJ: 0037e978] System.Threading.Monitor.Enter(System.Object)
    0037e9d0 003f0b68 FileExplorer.MainForm.listBox1_DoubleClick(System.Object, System.EventArgs)
    0037ea34 5933407c System.Windows.Forms.Control.OnDoubleClick(System.EventArgs)
    0037ea4c 59666146 System.Windows.Forms.ListBox.WndProc(System.Windows.Forms.Message ByRef)
    0037eaf8 58e086a0 System.Windows.Forms.Control+ControlNativeWindow.OnMessage(System.Windows.Forms.Message ByRef)
    0037eb00 58e08621 System.Windows.Forms.Control+ControlNativeWindow.WndProc(System.Windows.Forms.Message ByRef)
    0037eb14 58e084fa System.Windows.Forms.NativeWindow.Callback(IntPtr, Int32, IntPtr, IntPtr)
    0037ecb8 007c09e4 [NDirectMethodFrameStandalone: 0037ecb8] System.Windows.Forms.UnsafeNativeMethods.DispatchMessageW(MSG ByRef)
    0037ecc8 58e18cee System.Windows.Forms.Application+ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32, Int32, Int32)
    0037ed64 58e18957 System.Windows.Forms.Application+ThreadContext.RunMessageLoopInner(Int32, System.Windows.Forms.ApplicationContext)
    0037edb8 58e187a1 System.Windows.Forms.Application+ThreadContext.RunMessageLoop(Int32, System.Windows.Forms.ApplicationContext)
    0037ede8 58dd5911 System.Windows.Forms.Application.Run(System.Windows.Forms.Form)
    0037edfc 003f00ae FileExplorer.Program.Main()
    0037f020 727b1b4c [GCFrame: 0037f020]

Well, one obvious thing to find out is which synchronization object your thread is locking, i.e., what was the argument passed to the Monitor.Enter method. Trying !CLRStack -a does not help:

    0:000> !clrstack -a
    OS Thread Id: 0x2a88 (0)
    ESP      EIP
    0037e8a8 76f2013d [GCFrame: 0037e8a8]
    0037e978 76f2013d [HelperMethodFrame_1OBJ: 0037e978] System.Threading.Monitor.Enter(System.Object)
    0037e9d0 003f0b68 FileExplorer.MainForm.listBox1_DoubleClick(System.Object, System.EventArgs)
        PARAMETERS:
            this = 0x02708308
            sender = 0x0271c4d4
            e = 0x02c8f400
        LOCALS:
            0x0037e9f4 = 0x02c8f470
            0x0037e9f0 = 0x02c8f4e8
            0x0037ea00 = 0x00000001
            0x0037e9ec = 0x02708990
            0x0037e9e8 = 0x027089b4
    ...snipped...

As you see, SOS was not able to report the argument to Monitor.Enter . Perhaps the unmanaged call stack will help?

    0:000> kb
    ChildEBP RetAddr  Args to Child
    0037e4ac 76600bdd 00000002 0037e4fc 00000001 ntdll!ZwWaitForMultipleObjects+0x15
    0037e548 75541a2c 0037e4fc 0037e570 00000000 KERNELBASE!WaitForMultipleObjectsEx+0x100
    0037e590 7545086a 00000002 7efde000 00000000 KERNEL32!WaitForMultipleObjectsExImplementation+0xe0
    0037e5e4 764b2bf1 00000054 004d61e8 ffffffff USER32!RealMsgWaitForMultipleObjectsEx+0x14d
    0037e610 764a202d 004d61e8 ffffffff 0037e638 ole32!CCliModalLoop::BlockFn+0xa1
    0037e690 7285d245 00000002 ffffffff 00000001 ole32!CoWaitForMultipleHandles+0xcd
    0037e6b0 7285d1a6 00000000 ffffffff 00000001 mscorwks!NT5WaitRoutine+0x51
    0037e71c 7285d10a 00000001 004d61e8 00000000 mscorwks!MsgWaitHelper+0xa5
    0037e73c 729142c8 00000001 004d61e8 00000000 mscorwks!Thread::DoAppropriateAptStateWait+0x28
    0037e7c0 7291435d 00000001 004d61e8 00000000 mscorwks!Thread::DoAppropriateWaitWorker+0x13c
    0037e810 729144e1 00000001 004d61e8 00000000 mscorwks!Thread::DoAppropriateWait+0x40
    0037e86c 727b5422 ffffffff 00000001 00000000 mscorwks!CLREvent::WaitEx+0xf7
    0037e880 728e98e2 ffffffff 00000001 00000000 mscorwks!CLREvent::Wait+0x17
    0037e90c 729136e0 00497728 ffffffff 00497728 mscorwks!AwareLock::EnterEpilog+0x8c
    0037e928 72913664 e6620e7e 0037ea18 02708308 mscorwks!AwareLock::Enter+0x61
    0037e9c8 003f0b68 02c8f4e8 02c8f4c8 02c8f470 mscorwks!JIT_MonEnterWorker_Portable+0xb3
    WARNING: Frame IP not in any known module. Following frames may be wrong.
    0037ea28 5933407c 02c8f400 0271c6bc 0271c4d4 0x3f0b68
    0037ea44 59666146 00000000 00000000 0037ea84 System_Windows_Forms_ni+0x72407c
    ...snipped...

Notice that the JIT_MonEnterWorker_Portable frame corresponds to the Monitor.Enter method call. How do I know this? By inspecting the return address: the unmanaged frame's return address is 003f0b68 , which is also the EIP value for the listBox1_DoubleClick method in the managed stack trace.
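The same matching trick works for any frame the unmanaged stack trace could not name. As a small Python sketch (the data is copied from the kb and !CLRStack outputs above; the dictionary layout and the callers function are my own illustration), pairing each RetAddr with the managed frame that has the same EIP looks like this:

```python
# (native frame, RetAddr) rows from the kb output.
NATIVE = [
    ("mscorwks!AwareLock::Enter+0x61", 0x72913664),
    ("mscorwks!JIT_MonEnterWorker_Portable+0xb3", 0x003F0B68),
    ("0x3f0b68", 0x5933407C),
    ("System_Windows_Forms_ni+0x72407c", 0x59666146),
]

# EIP -> managed method, from the !CLRStack output.
MANAGED = {
    0x003F0B68: "FileExplorer.MainForm.listBox1_DoubleClick",
    0x5933407C: "System.Windows.Forms.Control.OnDoubleClick",
    0x59666146: "System.Windows.Forms.ListBox.WndProc",
}


def callers(native, managed):
    """Map each native frame to the managed method it returns into, if any."""
    return {name: managed[ret] for name, ret in native if ret in managed}


for frame, method in callers(NATIVE, MANAGED).items():
    print(f"{frame} returns into {method}")
```

A frame whose return address has no managed counterpart (AwareLock::Enter here) simply returns into unmanaged code and drops out of the result.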

Now we can expect to find the first three arguments passed to Monitor.Enter displayed in the unmanaged stack trace. Unfortunately, kb reports correct argument information only when the arguments are passed through the stack -- it does not distinguish between the standard C and Win32 calling conventions, and the custom calling conventions used by the CLR JIT. In fact, in this case, if we were to continue down that path, we might have diagnosed the problem incorrectly!

Where do we find the argument, then? There's hardly anything left but to inspect the disassembly of the calling method and try to determine how the argument is passed to Monitor.Enter :

    0:000> !u 0x3f0b68
    Normal JIT generated code
    FileExplorer.MainForm.listBox1_DoubleClick(System.Object, System.EventArgs)
    Begin 003f0a10, size 1ca
    ...snipped...
        003f0b3b 8b55cc          mov edx,dword ptr [ebp-34h]
        003f0b3e 8b4dc8          mov ecx,dword ptr [ebp-38h]
        003f0b41 3909            cmp dword ptr [ecx],ecx
        003f0b43 e8f8cff471      call mscorlib_ni+0x68db40 (7233db40) (System.Threading.Thread.Start(System.Object), mdToken: 060012b3)
        003f0b48 90              nop
        003f0b49 b9c8000000      mov ecx,0C8h
        003f0b4e e82d87a971      call mscorlib_ni+0x1d9280 (71e89280) (System.Threading.Thread.Sleep(Int32), mdToken: 060012d6)
        003f0b53 90              nop
        003f0b54 8b45d0          mov eax,dword ptr [ebp-30h]
        003f0b57 8b8050010000    mov eax,dword ptr [eax+150h]
        003f0b5d 8945c0          mov dword ptr [ebp-40h],eax
        003f0b60 8b4dc0          mov ecx,dword ptr [ebp-40h]
        003f0b63 e83d203c72      call mscorwks!JIT_MonEnterWorker (727b2ba5)
    >>> 003f0b68 90              nop
        003f0b69 90              nop
        003f0b6a 8b4dc8          mov ecx,dword ptr [ebp-38h]
        003f0b6d 3909            cmp dword ptr [ecx],ecx
        003f0b6f e8dccdf471      call mscorlib_ni+0x68d950 (7233d950) (System.Threading.Thread.Join(), mdToken: 060012d1)
    ...snipped...

Somewhere in the five instructions ending with the call to JIT_MonEnterWorker (003f0b54 through 003f0b63), we have the argument passing process, but it does not go through the stack. Note that only two registers are used -- EAX and ECX -- and both end up holding the same value (found at the address EBP-40h). Excellent -- all that's left is to obtain the value of either of these registers, and we're done!

...Not so fast, though. x86 registers are scarce, and are very likely to be reused across function calls. It stands to reason that both registers have been overwritten with other values, making it impossible to find what they contained previously. Indeed, their current values don't make sense:

    0:000> r eax
    eax=00000054
    0:000> r ecx
    ecx=00000000

Fortunately, we have EBP to the rescue! Recall that to reconstruct the stack earlier, we had access to the entire EBP chain that connects all the frames on the stack. This means we always have the EBP value for any frame, and the k command conveniently reports it for us:

    0:000> k
    ChildEBP RetAddr
    0037e4ac 76600bdd ntdll!ZwWaitForMultipleObjects+0x15
    0037e548 75541a2c KERNELBASE!WaitForMultipleObjectsEx+0x100
    0037e590 7545086a KERNEL32!WaitForMultipleObjectsExImplementation+0xe0
    0037e5e4 764b2bf1 USER32!RealMsgWaitForMultipleObjectsEx+0x14d
    0037e610 764a202d ole32!CCliModalLoop::BlockFn+0xa1
    0037e690 7285d245 ole32!CoWaitForMultipleHandles+0xcd
    0037e6b0 7285d1a6 mscorwks!NT5WaitRoutine+0x51
    0037e71c 7285d10a mscorwks!MsgWaitHelper+0xa5
    0037e73c 729142c8 mscorwks!Thread::DoAppropriateAptStateWait+0x28
    0037e7c0 7291435d mscorwks!Thread::DoAppropriateWaitWorker+0x13c
    0037e810 729144e1 mscorwks!Thread::DoAppropriateWait+0x40
    0037e86c 727b5422 mscorwks!CLREvent::WaitEx+0xf7
    0037e880 728e98e2 mscorwks!CLREvent::Wait+0x17
    0037e90c 729136e0 mscorwks!AwareLock::EnterEpilog+0x8c
    0037e928 72913664 mscorwks!AwareLock::Enter+0x61
    0037e9c8 003f0b68 mscorwks!JIT_MonEnterWorker_Portable+0xb3
    WARNING: Frame IP not in any known module. Following frames may be wrong.
    0037ea28 5933407c 0x3f0b68
    0037ea44 59666146 System_Windows_Forms_ni+0x72407c
    0037eaf0 58e086a0 System_Windows_Forms_ni+0xa56146
    0037eaf8 58e08621 System_Windows_Forms_ni+0x1f86a0

Life is easy now. All we need to do is subtract 0x40 from this value and find the argument passed to Monitor.Enter at that address:

    0:000> dd 0037ea28-40 L1
    0037e9e8  027089b4
    0:000> !do -nofields 027089b4
    Name: System.String
    MethodTable: 71f20b70
    EEClass: 71cdd66c
    Size: 44(0x2c) bytes
     (C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
    String: SecondaryLock

Now we have something that definitely looks like an object, and we can verify that this object is indeed used for synchronization by inspecting the process' sync blocks with the !SyncBlk command:

    0:000> !syncblk
    Index SyncBlock MonitorHeld Recursion Owning Thread Info  SyncBlock Owner
       16 004d61a4            3         1 00497728  2a88   0   02708990 System.String
       17 004d61d4            3         1 004eb4d8  2504   5   027089b4 System.String
    ...snipped...

And there it is: not only do we find the object our thread is waiting for, but we also have its owning thread, which allows further reconstruction of the application's wait chain.

Notice that the approach above worked because the method calling Monitor.Enter passed to it a local variable, which was available on the stack. A similar approach would work for a method argument, but there might be more complex cases in which the argument would not be readily available on the calling method's stack frame. Still, armed with the knowledge we have about Monitor.Enter , namely the fact it receives its argument through the ECX register, we can inspect the disassembly of Monitor.Enter :

    0:000> u mscorwks!JIT_MonEnterWorker_Portable
    mscorwks!JIT_MonEnterWorker_Portable:
    729135bc 6a7c            push 7Ch
    729135be b8cc13ca72      mov eax,offset mscorwks! ?? ::FNODOBFM::`string'+0x1ccc4 (72ca13cc)
    729135c3 e8c1eae9ff      call mscorwks!_EH_prolog3_catch (727b2089)
    729135c8 894dec          mov dword ptr [ebp-14h],ecx
    729135cb 33db            xor ebx,ebx
    729135cd 8d8d78ffffff    lea ecx,[ebp-88h]
    ...snipped...

Very early on, Monitor.Enter stores the parameter on the stack (this is often called "parameter spilling"), and we can expect to be able to retrieve it from there. Indeed, the EBP value for the JIT_MonEnterWorker_Portable frame was 0037e9c8, and the argument is stored at offset -0x14 from that location:

    0:000> dd 0037e9c8-14 L1
    0037e9b4  027089b4
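Both lookups in this section are the same piece of arithmetic: take the frame's EBP from the k output and add the displacement that the disassembly uses. A tiny Python sketch (local_slot is a hypothetical helper name) makes the two calculations explicit:

```python
def local_slot(frame_ebp, displacement):
    """Address of [ebp+displacement] for the frame whose EBP is frame_ebp (32-bit wraparound)."""
    return (frame_ebp + displacement) & 0xFFFFFFFF


# listBox1_DoubleClick keeps the lock object at [ebp-40h]; its EBP is 0037ea28.
caller_slot = local_slot(0x0037EA28, -0x40)

# JIT_MonEnterWorker_Portable spills ECX to [ebp-14h]; its EBP is 0037e9c8.
callee_slot = local_slot(0x0037E9C8, -0x14)

print(f"{caller_slot:08x} {callee_slot:08x}")
```

The two addresses differ, but both hold 027089b4 in the dumps above, which is a reassuring cross-check that the caller's local and the callee's spilled parameter are the same object.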

Find the Static Root that References Your Object

A typical memory leak analysis session conducted using SOS involves identifying a bunch of objects that are being leaked (not freed) and then identifying the chain of references from some GC root that points to them. This is a fairly tedious process (profilers are much better at this), and it's even worse because at times the actual root information would not be available. One such case is when the root is a static variable.

A typical root reference chain for a managed object that is retained by a static GC root would have a pinned object array appear as the rooted object. Below is a typical reference chain. (Note that I am using an x64 example here -- it makes the memory search stage more interesting, and also gives some heterogeneity to the examples.)

    0:010> !gcroot 0000000002bcaf58
    ...snipped...
    DOMAIN(0000000000C1C5F0):HANDLE(Pinned):5017f8:Root:0000000012761018(System.Object[])->
    00000000039b3c30(System.EventHandler)->
    0000000002bcab38(System.Object[])->
    0000000002bcf8d8(System.EventHandler)->
    0000000002bcaf58(FileExplorer.MainForm+FileInformation)

This object array is ubiquitous; it would seem that all static root references stem from it. Indeed (and this is a CLR implementation detail), static fields are stored in this array, and as far as the GC is concerned they are retained through it. This also makes it difficult to determine which static field of which class is responsible for the static reference. For example, in the reference chain above, it is apparent that there is a static EventHandler-typed field (which is likely an event) that retains the FileInformation instance -- but it's very desirable to find the details of that static field.

More than six years ago, Doug Stewart wrote a short blog post outlining the general process in cases like these. This process generally works, but requires some adaptation in the 64-bit era, so here goes. First, let's take a look at that rooted array:

    0:010> !do 0000000012761018
    Name: System.Object[]
    MethodTable: 000007fef68858f8
    EEClass: 000007fef649eb78
    Size: 8192(0x2000) bytes
    Array: Rank 1, Number of elements 1020, Type CLASS
    Element Type: System.Object
    Fields:
    None

OK, so it's an array with 1020 elements, and one of these elements must be our event handler. Is it the case? Let's search its memory and make sure:

    0:010> s -q 0000000012761018 L2000 00000000039b3c30
    00000000`12762e10  00000000`039b3c30 00000000`0278b380

Sure enough, our event handler is one of the array elements, at the address 00000000`12762e10 . Now there are two key observations:

The EventHandler instance ended up in the array somehow. Maybe if we can find other references to this array address, we can find who put it there and then determine whose static field it is.

There is a reference from that EventHandler instance to one of our application's objects (eventually). Then there should be additional references to this array address, which shape the chain of references to our application's object.

Frankly, both of these are long shots, because it might be the case that the address is calculated dynamically, but let's give it a spin. Doug's original guidance at this point is to launch a memory search for any references to the array location, which would complete in a few seconds for a 32-bit address space; not so much for a 64-bit address space!

However, we are looking for references in managed code only, so no need to traverse the entire address space. It suffices to look at the address ranges of modules in the current AppDomain:

    0:010> !dumpdomain
    ...snipped...
    --------------------------------------
    Domain 1: 0000000000c1c5f0
    LowFrequencyHeap: 0000000000c1c638
    HighFrequencyHeap: 0000000000c1c6c8
    StubHeap: 0000000000c1c758
    Stage: OPEN
    SecurityDescriptor: 0000000000c1de90
    Name: FileExplorer.exe
    Assembly: 0000000000c3cd80 [C:\Windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll]
    ClassLoader: 0000000000c3ce40
    SecurityDescriptor: 0000000000c3cc40
      Module Name
    000007fef6461000 C:\Windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll
    000007ff000f2568 C:\Windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\sortkey.nlp
    000007ff000f2020 C:\Windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\sorttbls.nlp
    Assembly: 0000000000c57480 [D:\courses\NET Debugging\Exercises\4_MemoryLeak\Binaries\FileExplorer.exe]
    ClassLoader: 0000000000c57540
    SecurityDescriptor: 0000000000c57390
      Module Name
    000007ff000433d0 D:\courses\NET Debugging\Exercises\4_MemoryLeak\Binaries\FileExplorer.exe
    ...many more of these guys...

Now we have a couple of module addresses and can constrain our memory search. It seems safe to start at 7ff`00000000 and go through a few hundred megabytes looking for our address. Generally speaking, the proper WinDbg command here would be:

0:010> s -q 000007ff`00000000 L?00000000`40000000 00000000`12762e10

(...recall that we are looking for a full QWORD.) The problem is that we might miss unaligned references to that address, which may occur if it is hard-coded into some instruction (e.g. a MOV). So instead, we should search for the individual bytes, remembering that we are on a little-endian architecture, where the least significant byte comes first:

    0:010> s -b 000007ff`00000000 L?00000000`40000000 10 2e 76 12
    000007ff`001913d3  10 2e 76 12 00 00 00 00-48 8b 00 48 89 44 24 60  ..v.....H..H.D$`
    000007ff`00191440  10 2e 76 12 00 00 00 00-48 8b d0 e8 60 c1 87 f7  ..v.....H...`...

Voila! Two references to the array location, and now let's take a look at them with the !U command to see if they are code:

    0:010> !u 000007ff`001913d3
    Normal JIT generated code
    FileExplorer.MainForm+FileInformation..ctor(System.String)
    Begin 000007ff001912d0, size 18d
    ...snipped...
    000007ff`001913d0 90                    nop
    000007ff`001913d1 48b8102e761200000000  mov rax,12762E10h
    ...snipped...
    000007ff`0019143e 48b9102e761200000000  mov rcx,12762E10h
    000007ff`00191448 488bd0                mov rdx,rax
    ...snipped...

They are both a match inside FileInformation 's constructor, which gives us an excellent clue where to look. Indeed, here's the source code showing the event registration sequence:

    public FileInformation(string fullPath)
    {
        Path = fullPath;
        Name = System.IO.Path.GetFileName(Path);
        FirstFewLines = File.ReadAllLines(Path).Take(100).ToArray();
        FileInformationNeedsRefresh += FileInformation_FileInformationNeedsRefresh;
    }
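The byte-wise search used earlier in this section is easy to reproduce outside the debugger. The Python sketch below packs the array slot address as a little-endian QWORD and looks for it at every byte offset in a buffer. The buffer holds the nop and the mov rax instruction bytes from the !u listing; find_all is a hypothetical helper of mine, not a WinDbg facility.

```python
import struct

TARGET = 0x0000000012762E10
pattern = struct.pack("<Q", TARGET)  # little-endian: 10 2e 76 12 00 00 00 00

# Bytes starting at 000007ff`001913d0:
#   90                      nop
#   48 b8 <8-byte imm>      mov rax,12762E10h
BASE = 0x000007FF001913D0
code = bytes.fromhex("90" "48b8102e761200000000")


def find_all(buffer, needle, base):
    """Return the address of every occurrence of needle, aligned or not."""
    hits, start = [], buffer.find(needle)
    while start != -1:
        hits.append(base + start)
        start = buffer.find(needle, start + 1)
    return hits


hits = find_all(code, pattern, BASE)
print([f"{hit:#x}" for hit in hits], "aligned:", [hit % 8 == 0 for hit in hits])
```

The single hit lands three bytes past an 8-byte boundary, inside the instruction's immediate operand, which is the kind of reference a search restricted to aligned QWORDs would skip.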

Conclusion

Hopefully, you are now more convinced that basic assembly reading skills, understanding of calling conventions, and familiarity with the stack structure can provide actual benefits when debugging your .NET applications or analyzing crash dumps.

Assembly reading skills do not come automatically; you must practice them frequently. The best approach would be to compile a set of examples similar to the above and go through them periodically. If the agile guys are advocating code katas to practice TDD, why can't we have disassembly katas to practice our assembly reading skills?

Further Reading