Just a reminder: Release builds are not Debug builds. Seems obvious, but it's worth saying again. Release builds are optimized for speed and debug builds are optimized for, well, debug-ability. However, most of the optimizations in managed code are done by the JIT compiler rather than the language compiler.

However, sometimes in our zealous attempts to make Release builds easier to debug we can inadvertently (or totally "vertently") remove a lot of the optimizations that make Release builds faster.

Introduction

I was talking to Jeremy and he had come upon some code that was inserting a try{} catch() { throw; } block in every method. The rationale was that "they wanted the complete stack trace in their logging after an exception was thrown." That's a noble goal, but their technique of adding these guards is what's called an "anti-pattern" or as Egon would say "it would be bad." An anti-pattern is "a pattern that tells how to go from a problem to a bad solution."

First, let's start by exploring why they aren't getting "a complete stack trace" before we tackle why they believe they want/need a complete one.

Here's a simple program:

using System; class NormalProgram { static void Main( string [] args) { try { methodA(); } catch (System.Exception e) { Console.WriteLine(e.ToString()); } } static void methodA() { methodB(); } static void methodB() { methodC(); } static void methodC() { badMethod(); } static void badMethod() { throw new ApplicationException( "generic bad thing" ); } }

In this program, Main calls A, then A calls B, which calls C, which calls badMethod which throws an exception. That exception is caught up at Main. So far, makes sense right?

We'll compile two versions, one in Debug mode and one in Release and run them on a regular 32-bit machine. Notice we're saying /o for enable optimizations for the Release build. We are making debug symbols for Release via /debug:pdbonly, but it doesn't affect performance in a significant way.

"%FXROOT%\csc.exe" /t:exe /out:NormalRelease.exe /debug:pdbonly /o NormalProgram.cs

"%FXROOT%\csc.exe" /t:exe /out:NormalDebug.exe /debug NormalProgram.cs

We run Debug on 32-bit and we see:

System.ApplicationException: generic bad thing

at NormalProgram.badMethod() in NormalProgram.cs:line 24

at NormalProgram.methodC() in NormalProgram.cs:line 23

at NormalProgram.methodB() in NormalProgram.cs:line 21

at NormalProgram.methodA() in NormalProgram.cs:line 19

at NormalProgram.Main(String[] args) in NormalProgram.cs:line 11

Looks good, makes sense. Now we run Release on 32-bit:

System.ApplicationException: generic bad thing

at NormalProgram.badMethod() in NormalProgram.cs:line 24

at NormalProgram.Main(String[] args) in NormalProgram.cs:line 11

Our call stack has been "collapsed"? Does it make it harder to debug? Not really, because we knew where we ended up and the path between the two is clear (otherwise this "inlining" wouldn't have happened.)

However, it is confusing and there's a perception that "information is missing." It's important to note that information isn't missing but rather this stack trace is showing the runtime reality. What you write (the Programmer's Intent) isn't exactly what runs, especially when things are optimized. As programmers we are doing our best to tell the machine what to do and it is doing its best to do that fast and correct - let it do what it does best. When we see a stack like this our first reaction is that information is there, but has been hidden. In fact, the code we wrote ran, but not every method got their own local stack frame as they were optimized into fewer methods.

Inlining

Disclaimer: I'm showing you this for informational purposes. Doing this without thinking or having a good reason is usually a bad idea and could confuse other issues. The only reason you might want to do this would be if you were doing some kind of funky thing that made your method require its own stack space. DON'T ADD THIS ATTRIBUTE WILLY-NILLY. Don't Program By Coincidence. Remember the point of this post is that Release and Debug are different. Use them differently.

What if I tell the runtime JIT-ter to not inline? I'll add one line and one using:

using System; using System.Runtime.CompilerServices; class NormalProgram { static void Main( string [] args) { try { methodA(); } catch (System.Exception e) { Console.WriteLine(e.ToString()); } } [MethodImpl(MethodImplOptions.NoInlining)] static void methodA() { methodB(); } static void methodB() { methodC(); } static void methodC() { badMethod(); } static void badMethod() { throw new ApplicationException( "generic bad thing" ); } }

And run on 32-bit:

System.ApplicationException: generic bad thing

at NormalProgram.badMethod() in NormalProgram.cs:line 24

at NormalProgram.methodA() in NormalProgram.cs:line 19

at NormalProgram.Main(String[] args) in NormalProgram.cs:line 11

Well, that stopped the inlining of method A, but of course B and C are still inlined. You can see how this is going to get icky.

ASIDE: We as programmers have all reached this moment. This is the moment when we decide whether or not to slap an attribute on every method in our program. (This attribute is just an example, but you know this moment...there's still time to actually stop and try to understand the problem...)

...but we have to ship. So, let's try this:

using System; using System.Runtime.CompilerServices; class NormalProgram { [MethodImpl(MethodImplOptions.NoInlining)] static void Main( string [] args) { try { methodA(); } catch (System.Exception e) { Console.WriteLine(e.ToString()); } } [MethodImpl(MethodImplOptions.NoInlining)] static void methodA() { methodB(); } [MethodImpl(MethodImplOptions.NoInlining)]

static void methodB() { methodC(); } [MethodImpl(MethodImplOptions.NoInlining)] static void methodC() { badMethod(); } static void badMethod() { throw new ApplicationException( "generic bad thing" ); } }

And compile and run as Release on 32-bit...

System.ApplicationException: generic bad thing

at NormalProgram.badMethod() in NormalProgram.cs:line 24

at NormalProgram.methodC() in NormalProgram.cs:line 23

at NormalProgram.methodB() in NormalProgram.cs:line 21

at NormalProgram.methodA() in NormalProgram.cs:line 19

at NormalProgram.Main(String[] args) in NormalProgram.cs:line 11

Now we're back where we were before, right? Well, not really. We've got the full stack again, but we've neutered the build such that the JITter can't really do it's job.

Plus, we haven't tried 64-bit yet. Let's run the Release build on 64-bit:

System.ApplicationException: generic bad thing

at NormalProgram.methodC() in NormalProgram.cs:line 23

at NormalProgram.Main(String[] args) in NormalProgram.cs:line 11

Weird. One conclusion to draw would be that 64-bit CLR ignores method inlining attributes. That's not necessarily the case. Another would be that 64-CLR is more aggressive about inlining. The deal is that there are MANY optimizations that the JITter can to. MANY. They wouldn't give me a number because it's Special Sauce, but le's just say it's not one or two. It's more than that. ;)

Inlining isn't the only thing that can "collapse" a call stack like this. This also isn't exactly true. In this case, remember that x64 is a whole new processor architecture while x86 has been around for, let's say, a while. Turns out that x64 really likes to optimize and this is an example of a "Tail Call Optimization" rather than explicit Method Inlining.

Be sure to read David Broman's detailed blog post on exactly x64 tail-call optimizes and when it doesn't.

A good explanation from The Original Wiki at the Portland Pattern Wiki:

"Tail-call optimization (or tail-call merging or tail-call elimination) is a generalization of TailRecursion: If the last thing a routine does before it returns is call another routine, rather than doing a jump-and-add-stack-frame immediately followed by a pop-stack-frame-and-return-to-caller, it should be safe to simply jump to the start of the second routine, letting it re-use the first routine's stack frame (environment)...." "However, TailCallOptimization has some drawbacks. The runtime environment will have a confusing stack during execution of tail-called routines, which can make debugging difficult ("How did I get here? baz() never even calls foo()!")."

The wiki explanation was written with C in mind, but the concept is universal and it's the "confusing stack during execution of tail-called routines" that's got me writing this blog post today.

There's some controversy on whether Tail Call Optimizations are a subset of Method Inlining. That may never be resolved, but for now they are discrete and different. Eric Gunnerson has a good article on some of the heuristics that the JIT uses in its inlining decision making process.

For some hardcore details, be sure to read David Notario's excellent series on x86 JIT inlining from 2004, both Part 1 and Part 2. Perhaps start by reading his "The CLR x86 JIT, an overview" as he explains seven basic stages the x86JIT.

An Anti-Pattern

Now, back to our customer from the beginning who was inserting a try{} catch() { throw; } block in every method. Let's try that (don't do this at home):

using System; class UglyProgram { static void Main( string [] args) { try { methodA(); } catch (Exception e) { Console.WriteLine(e); } } static void methodA() { try { methodB(); } catch { throw ; } } static void methodB() { try { methodC(); } catch { throw ; } } static void methodC() { try { badMethod(); } catch { throw ; } } static void badMethod(){ throw new ApplicationException( "generic bad thing" );} }

...and compile both Debug and Release and this produces the same output on both 32-bit and 64-bit:

System.ApplicationException: generic bad thing

at UglyProgram.badMethod() in UglyProgram.cs:line 20

at UglyProgram.methodC() in UglyProgram.cs:line 19

at UglyProgram.methodB() in UglyProgram.cs:line 18

at UglyProgram.methodA() in UglyProgram.cs:line 17

at UglyProgram.Main(String[] args) in UglyProgram.cs:line 10

Why? Because by setting up the try 's you're introducing the potential for other ways out of each method so the JIT'ter can't optimize anything that would collapse the call stack like we saw earlier. This isn't a good pattern as it doesn't deal with the underlying issue in a simple way. The customer wants to be able to debug Release builds easier.

Options

Here's some options we could try:

Try/catch every method - Not a good idea. Not only are you cluttering your code, but you're paying a cost all the time when you only need to debug in Production/Release occasionally.

- Not a good idea. Not only are you cluttering your code, but you're paying a cost all the time when you only need to debug in Production/Release occasionally. Put [MethodImpl(MethodImplOptions.NoInlining)] on every method - Also not a good idea, not only because of the operative "every method" violating the DRY Principle but also because it only deals with one specific optimization out of many.

- Also not a good idea, not only because of the operative "every method" violating the DRY Principle but also because it only deals with one specific optimization out of many. Turn off Optimizations in the compiler - This particular kind of inlining appears to be a JITter thing, not a language compiler thing so turns out that doesn't work in this instance.

Ok, so the customer wants a fuller call stack. I would propose that they in fact don't want this. One of my old bosses used to say:

"Customer walks in with a cell phone and says 'this thing needs a bigger antenna.' We have to ask ourselves does he want a bigger antenna or better cell phone reception" - Mark Klein

This is a great pithy analogy. Everyone comes in with both Problems and Solutions. This customer wants to be able to debug Release builds. They surely don't want full call stacks in Release Mode if it means turning off all optimizations.

Best Solution

One thing the customer could do that would give them the best of both worlds is something Mark Pearce told me about a while back, the [.NET Framework Debugging Control] section of an .ini file you've probably never used:

This JIT configuration has two aspects: You can request the JIT-compiler to generate tracking information. This makes it possible for the debugger to match up a chain of MSIL with its machine code counterpart, and to track where local variables and function arguments are stored.

You can request the JIT-compiler to not optimize the resulting machine code.

If you have a file Foo.exe you can create a Foo.ini with these contents:

[.NET Framework Debugging Control]

GenerateTrackingInfo=1

AllowOptimize=0

Implementation Detail: Why is it an INI file? Well, the .NET runtime hasn't started up yet, so we don't have all that XML-parsey goodness lying around. Instead they need to use the Win32 method GetPrivateProfileString() in order to retrieve those.

This solution assumes you compiled with /debug:pdbonly in Release mode in order to generate the full stack when needed.

DICSLAIMER: This isn't something you want to put into production! You'd only use this for debugging. It's something you'd enable on a case-by-case basis.