SIGSEGV as control flow - How the JVM optimizes your null checks

If you’ve ever written Java, you’ve almost certainly written null checks. For better or for worse, if (variable == null) shows up everywhere - Hadoop alone has over 6000 of them [1]. In many cases these are purely defensive - a null isn’t really expected in the normal flow of the code. In this post I’m going to run through a fun little trick that the JVM uses to optimize such cases.

To see this in practice, I’ll look at a slightly smarter version of the code in my last post (which you should read if you are interested in looking at assembly output on your own machine).

    import java.util.Random;

    public class Test {
        static Random random = new Random();

        public static int getLen(String s) {
            if (s == null) {
                return -1;
            } else {
                return s.length();
            }
        }

        public static void main(String[] args) {
            long res = 0;
            for (int i = 0; i < 50000000; i++) {
                res += getLen(Integer.toString(random.nextInt(1000)));
            }
            res += getLen(null);
            System.out.println(res);
        }
    }

We’ve decided to be a little smarter and cover the null case. How does that affect the assembly that Hotspot generates? Let’s find out:

    jackson@serv nullcheck $ java \
        -XX:+UnlockDiagnosticVMOptions -XX:CompileCommand=print,Test.getLen \
        -XX:-UseCompressedOops Test > asm.s

1. CompressedOops forced off to clean up the resulting assembly

Interestingly, the optimized (c2) version looks more or less identical to the previous one:

    # {method} 'getLen' '(Ljava/lang/String;)I' in 'Test'
    # parm0:    rsi:rsi   = 'java/lang/String'
    #           [sp+0x30]  (sp of caller)
    mov    %eax,-0x14000(%rsp)
    push   %rbp
    sub    $0x20,%rsp          #*synchronization entry
                               # - Test::getLen@-1 (line 9)
    mov    0x10(%rsi),%r10     #*getfield value
                               # - java.lang.String::length@1 (line 623)
                               # - Test::getLen@7 (line 12)
                               # implicit exception: dispatches to 0x00007ff0f48275b1
    mov    0x10(%r10),%eax     # implicit exception: dispatches to 0x00007ff0f48275a0
    add    $0x20,%rsp
    pop    %rbp
    test   %eax,0xa5b1a61(%rip)        # 0x00007ff0fedd9000
                                       # {poll_return}
    retq

What happened to our null check? Hotspot can’t magically prove that I never call this with null - after all, there is a call passing null right there in our main function! The hint is the comment implicit exception: dispatches to 0x00007ff0f48275b1, which tells us where execution will resume in the event of a signalled exception (in this case a SIGSEGV). Instead of actively checking for null, the compiled code just lets the load segfault and recovers from there - a cute, but potentially slow, trick. Let’s briefly go through how it works.
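As a rough Java-level analogy (not what Hotspot literally emits), the two strategies can be put side by side. The class and method names below are made up for illustration, and the try/catch only stands in for the signal-based recovery, which happens in machine code rather than in Java.

```java
public class ImplicitNullSketch {
    // The version from the post: test for null on every call.
    public static int getLenExplicit(String s) {
        if (s == null) {
            return -1;
        }
        return s.length();
    }

    // Analogy only: dereference unconditionally and deal with the rare
    // failure out of line, the way the compiled code leans on SIGSEGV.
    public static int getLenImplicit(String s) {
        try {
            return s.length();
        } catch (NullPointerException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(getLenExplicit("abc") + " " + getLenImplicit("abc"));
        System.out.println(getLenExplicit(null) + " " + getLenImplicit(null));
    }
}
```

Both versions return the same results; the difference is only in who pays, and when. The explicit form pays a small cost on every call, the implicit form pays nothing until a null actually shows up.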

Diving in

Hotspot’s Linux x86 signal handler, in os_linux_x86.cpp, looks something like this - I’ve cut it down to the relevant sections for brevity:

    extern "C" JNIEXPORT int
    JVM_handle_linux_signal(int sig, siginfo_t* info, void* ucVoid, int abort_if_unrecognized) {
      ucontext_t* uc = (ucontext_t*) ucVoid;
      ...
      pc = (address) os::Linux::ucontext_get_pc(uc);
      ...
      } else if (sig == SIGSEGV &&
                 !MacroAssembler::needs_explicit_null_check((intptr_t)info->si_addr)) {
        // Determination of interpreter/vtable stub/compiled code null exception
        stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::IMPLICIT_NULL);
      }
      ...
      if (stub != NULL) {
        // save all thread context in case we need to restore it
        if (thread != NULL) thread->set_saved_exception_pc(pc);
        uc->uc_mcontext.gregs[REG_PC] = (greg_t)stub;
        return true;
      }

The handler uses the mcontext_t struct, which holds the register state at the point of the segfault, to find the address of the faulting instruction (pc). SharedRuntime::continuation_for_implicit_exception then finds the metadata for the function that was running and checks its exception table for an entry matching that address. Knowing where to resume execution, the handler sets the instruction pointer in mcontext to that continuation and returns.
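To make the lookup concrete, here is a toy model of that decision in Java. Everything in it is hypothetical: the class, the method names and the instruction addresses are invented, and the rule that only faults inside the first page (assumed to be 4096 bytes) count as null dereferences is my assumption about what needs_explicit_null_check guards against. Only the two continuation addresses come from the assembly listing.

```java
import java.util.HashMap;
import java.util.Map;

public class ImplicitExceptionTableSketch {
    // Assumption: only faults inside the first page are treated as null dereferences.
    static final long ASSUMED_PAGE_SIZE = 4096;

    // Hypothetical table: address of a faulting instruction -> address to resume at.
    private final Map<Long, Long> continuations = new HashMap<>();

    public void register(long faultingPc, long continuationPc) {
        continuations.put(faultingPc, continuationPc);
    }

    // Models the handler's decision. Returns the new instruction pointer,
    // or 0 if this fault is not a planned implicit null check.
    public long handleSegv(long faultingPc, long faultAddress) {
        boolean looksLikeNull = faultAddress >= 0 && faultAddress < ASSUMED_PAGE_SIZE;
        if (!looksLikeNull) {
            return 0;
        }
        return continuations.getOrDefault(faultingPc, 0L);
    }

    public static void main(String[] args) {
        ImplicitExceptionTableSketch table = new ImplicitExceptionTableSketch();
        // Made-up instruction addresses; the targets are the ones from the listing.
        table.register(0x1000L, 0x00007ff0f48275b1L);
        table.register(0x1004L, 0x00007ff0f48275a0L);

        // Planned fault: null base plus a small field offset.
        System.out.println(Long.toHexString(table.handleSegv(0x1000L, 0x10L)));
        // Same fault address, but from an instruction with no table entry.
        System.out.println(Long.toHexString(table.handleSegv(0x2000L, 0x10L)));
        // Known instruction, but the address is nowhere near null.
        System.out.println(Long.toHexString(table.handleSegv(0x1000L, 0x7fff0000L)));
    }
}
```

A return value of 0 in this sketch plays the role of stub == NULL in the real handler: the fault was not one the compiler planned for, so something else has to deal with it.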

Back in compiled code, our function lands in what’s known as an Uncommon Trap, which is JVM-speak for a jump back into the interpreter. The trap invalidates our compiled function (since our assumption about s never being null is apparently wrong), so the method runs in the interpreter and is eventually recompiled.

For those curious, the actual code that generates this case is PhaseCFG::implicit_null_check in lcm.cpp which looks at a given candidate for null check elimination. Unfortunately it is quite gnarly, as is most of the c2 compiler code. It is worth noting, though, that the JIT will not attempt this trick if the null branch was profiled to be taken more than 0.01% of the time.
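The profiling cutoff is easy to express on its own. The sketch below is not the code from lcm.cpp; the class, the method and the choice to refuse the trick when there is no profile data are all assumptions of mine. Only the 0.01% figure comes from the paragraph above.

```java
public class NullCheckProfileSketch {
    // 0.01% expressed as a probability.
    static final double MAX_NULL_PROBABILITY = 1e-4;

    // Hypothetical decision: may the explicit check be replaced by a trap?
    public static boolean mayUseImplicitCheck(long nullBranchTaken, long totalCalls) {
        if (totalCalls == 0) {
            return false; // assumption: no profile data, keep the explicit check
        }
        return (double) nullBranchTaken / totalCalls <= MAX_NULL_PROBABILITY;
    }

    public static void main(String[] args) {
        // Our example: 50 million non-null calls seen before any null.
        System.out.println(mayUseImplicitCheck(0, 50_000_000L));
        // A method that sees null once every thousand calls.
        System.out.println(mayUseImplicitCheck(1, 1_000L));
    }
}
```

In our test program the single null call only happens after the loop, so by the time getLen is compiled the profile has never seen the null branch taken.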

In action

We can observe this happening in our example using everyone’s favorite Linux debugging tool, strace. For those less familiar, running a program with strace (or attaching to an existing one with -p <pid>) lets us observe all system calls the program makes and signals it receives. To run our example via strace, we can do something like the following:

strace -f -o outfile java Test

1. -f tells strace to trace child processes, which java makes. This took me an unreasonably long time to figure out.

Sure enough, you’ll see a SIGSEGV. Possibly several, in fact - Hotspot uses segfaults as part of normal execution for several things (I recommend reading through the full signal handler if you are curious). The formatting seems to depend on your distro, but Ubuntu will helpfully tell us si_addr, the address whose access caused the fault.

25048 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x10} ---

Faulted trying to access 0x10 - the offset in the string we were trying to read from :)
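If the trace gets long, pulling the fault addresses out by hand gets old quickly. Here is a small sketch that extracts si_addr from a line in the format shown above. The class name and regex are my own, and it assumes the Ubuntu-style output with a hexadecimal si_addr; other strace versions may format the line differently.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StraceSegvSketch {
    // Matches a SIGSEGV signal line and captures the hex digits of si_addr.
    private static final Pattern SEGV =
            Pattern.compile("SIGSEGV.*si_addr=0x([0-9a-fA-F]+)");

    // Returns the faulting address, or empty if the line is not a SIGSEGV
    // line in the assumed format.
    public static OptionalLong faultAddress(String line) {
        Matcher m = SEGV.matcher(line);
        if (!m.find()) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Long.parseUnsignedLong(m.group(1), 16));
    }

    public static void main(String[] args) {
        String line = "25048 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x10} ---";
        OptionalLong addr = faultAddress(line);
        // A null base pointer plus the field offset gives the fault address.
        System.out.println(addr.isPresent() ? "fault at offset " + addr.getAsLong() : "no match");
    }
}
```

For the line above this reports offset 16, which is the 0x10 in the first mov of the compiled getLen.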

(Shameless plug: If you are, like me, crazy enough to find this sort of stuff interesting, follow me on twitter. I’m going to try and do several more posts on JVM internals in the next few months.)