Update: some colleagues and ex-colleagues from the .NET framework team showed up on HN to comment about this issue. It’s worth reading.

Prelude

C# supports kind-of-macros via the very neat Expression Tree API. The gist is:

You build a tree that represents some C# expression.

When you want to execute that expression, C# basically treats it like a small DLL. i.e., it compiles the expression tree to a IL (the CLR’s bytecode), and then links to it like it would any other DLL.



Neat, eh? Take a look at this code sample.

var exp = ( Expression < Func < int , int >>)( x => x + 2 ); var result = exp . Compile (). Invoke ( 1 );

In the first line, we generate an Expression object that represents the lambda (x => x+2) .

In the second line, we compile this expression to IL and then call Invoke, which will link to the new IL “library” and run that code. THe code should produce 3 as a result.

The problem

Today, I built, compiled, and attempted to run a seemingly normal Expression object.

I was greeted with the following mysterious error, buried many method calls deep into the System.Linq.Expressions namespace, part of the core .NET code:

Instance field 'Value' is not defined for type 'EETypeRva:0x01588388(System.Linq.Expressions.Interpreter.ExplicitBox)'"

This exception was totally undocumented in the API. Normally this means (1) they were slacking off with documentation, or (2) there is a bug in the library, and this exception was never meant to be thrown.

In either case, I found the error to be completely baffling. I showed it to my awesome colleague Bart De Smet. He too was puzzled.

The investigation

We decided that it seemed to be a bug in the library.

Since the error happened in the System.Linq.Expressions.Interpreter assembly, we started by decompiling the whole DLL, using ILSpy.

From here, we had a look at the implementation of this System.Linq.Interpreter.ExplicitBox class that’s causing us problems. I’ve copied it below, highlighting the offending Value member for your benefit.

Notice in particular that it’s a property, yet the error above claims it’s a field.

Hmm. That’s suspicious!

We decided to look at where this member is supposed to be invoked. It turned out to be in System.Linq.Expressions.Interpreter.QuoteInstruction.ExpressionQuoter , specifically the VisitParameter method. I’ve decompiled it with ILSpy, and highlighted the offending attempt to access the Value property.

Look at that! It’s clearly trying to access a field called Value , but we know from the last code snippet that Value is unambiguously a property. So this code is completely wrong, and it’s been wrong since probably about 2008.

Further proof this has been a bug for a long time is that the Value property actually comes from the IStrongBox interface (i.e., ExplicitBox inherits from IStrongBox , which you can see in the first code snippet) — and since interfaces can’t have fields at all, we know that Value has always been a property.

QED. Bug found. We can all go home now, right?

Manually patching the DLL

Normal people would have just reported this to the .NET team and been done with it.

So, of course, that’s what we did, right? HAHAHAHAHA of course not. After Bart finished grumbling and accepted that I was not going to let him leave until we resolved the issue, we sat down and we fucking patched the binary by hand. Fuck the police.

We started by grabbing System.Linq.Expressions.dll (the DLL where all this code is) and decompiling it to IL using this command:

ildasm /out=System.Linq.Expressions.il System.Linq.Expressions.dll

This produced a big-ass IL file. We opened it and ctrl-F’d for the string "Value" (including quotes), which we imagined would lead us to the code that does this awful Expression.Field thing we saw above.

It takes us to line # 88,200:

IL_003c: ldstr "Value" IL_0041: call class System.Linq.Expressions.MemberExpression \ System.Linq.Expressions.Expression::Field( \ class System.Linq.Expressions.Expression, \ string)

Bingo. Now the real fun begins.

(Bart grumbled and moaned some more about as any sane person should have, but this was no time to be sane. This was a time to fix the problem.)

We basically knew how to do what we wanted in IL, but we ended up spending a lot of time compiling sample applications to get the namespaces and assemblies right, so that we could actually write the newly-hacked IL code. We also spent some time fixing errors in the IL code that ilasm reported to us.

At some point, though, we settled on the following IL code. We ultimately swapped out the lines above for this code. (NB, the IL addresses are optional.)

ldtoken method instance object \ [System.Runtime]System.Runtime.CompilerServices.IStrongBox::get_Value() call class [System.Reflection]System.Reflection.MethodBase \ [System.Reflection]System.Reflection.MethodBase::GetMethodFromHandle(valuetype [System.Runtime]System.RuntimeMethodHandle) castclass [System.Reflection]System.Reflection.MethodInfo IL_0041: call class System.Linq.Expressions.MemberExpression \ System.Linq.Expressions.Expression::Property( \ class System.Linq.Expressions.Expression, \ class [System.Reflection]System.Reflection.MethodInfo)

Finally, we re-generated the DLL from this IL with the following command:

ilasm /dll System.Linq.Expressions.il

We re-decompile this DLL with ILSpy to make sure that the everything looks like it should (notice the highlighted part points at a call to Property instead):

Great! That’s about what we’d expect.

So, finally, we recompiled our original project to, so that it fetched the new patched binary from the GAC (i.e., the Global Assembly Cache).

Then we ran our code, and magically, it worked.

We followed this up by sending the world’s best bug report to the .NET native team.

We retire for the night, winners.

Please enable JavaScript to view the comments powered by Disqus.

Disqus