How to analyse a BEAM core dump?

2016-01-26 by Rafal Studnicki

How to analyse a BEAM core dump?

TL;DR It’s relatively easy to debug Erlang system by connecting to the Erlang VM with gdb. To see an example, click here.

Introduction

Has this ever happened to you? You have your Erlang system running for a long time now and all of a sudden the BEAM process disappears on one of the nodes. In the OS logs you can see that the process was killed because of a segmentation fault, but besides that you have nothing. Obviously, you hadn’t been prepared for such a scenario and a core dump of the crashed process wasn’t generated. This has definitely happened to me (too many times), so now I don’t even start without enabling core dumps in the operating system, by setting ulimit -c unlimited . Now, after waiting (too long) for the segmentation fault to occur again, you ended up with a precious core.xxx file. But what’s next?

A story of one core dump

In this section I’m going to describe an example analysis of a core dump file, that was generated while load-testing MongooseIM. In fact that was the first core dump analysis I had ever done, and it was suprisingly successful (or it was just beginner’s luck). This is why no prior knowledge of gdb is necessary here. Still, you should understand how calling a function works (it’s using a stack), what pointers are and roughly how Erlang terms are built internally.

Starting gdb

Let’s assume that the generated core dump file was named core.1474 . gdb is started simply by pointing it at the BEAM executable that generated the core dump file and the core dump file itself. It’s also useful to pass the directory with the source of the VM (the same version), the C level output will then be more comprehensible:

gdb erts-6.3/bin/beam.smp -core core.14747 -d /home/azureuser/otp_src_17.4/erts/emulator

gdb should tell us what caused the crash and where in the code this has happened:

Program terminated with signal 11, Segmentation fault. #0 do_minor (nobj=2, objv=0x7f209849e5c0, new_sz=233, p=0x7f20a36010d0) at beam/erl_gc.c:1095 1095 val = *ptr;

As we can see, the segmentation fault happened in a piece of code that belongs to the garbage collector. gdb told us exactly in which function this happened, with what argument it was called and what line is reponsible. We can see clearly that the function tried to dereference an invalid pointer ( ptr ).

Where does the pointer point?

Considering we used only one command, we already know a lot. Now we just need to answer the question ‘what was the pointer and why was it corrupted’?

Let’s start with the most basic gdb command: backtrace -which prints the current stacktrace. A more chatty version backtrace full also exists and prints values of local variables for each function.

Here’s the output:

(gdb) backtrace #0 do_minor (nobj=2, objv=0x7f209849e5c0, new_sz=233, p=0x7f20a36010d0) at beam/erl_gc.c:1095 #1 minor_collection (recl=<synthetic pointer>, nobj=2, objv=0x7f209849e5c0, need=7, p=0x7f20a36010d0) at beam/erl_gc.c:876 #2 erts_garbage_collect (p=0x7f20a36010d0, need=7, objv=0x7f209849e5c0, nobj=2) at beam/erl_gc.c:450 #3 0x00000000005669bb in process_main () at x86_64-unknown-linux-gnu/opt/smp/beam_hot.h:47 #4 0x000000000049c4b2 in sched_thread_func (vesdp=0x7f209b560a80) at beam/erl_process.c:7743 #5 0x00000000005d5fc5 in thr_wrapper (vtwd=0x7fffa342fff0) at pthread/ethread.c:106 #6 0x00007f20a6ee1df3 in start_thread (arg=0x7f20863b0700) at pthread_create.c:308 #7 0x00007f20a6a071ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Unfortunately, this is not very helpful. On a positive note, we actually confirmed that the garbage collection has been triggered from inside the Erlang processes and then things went bad.

BEAM gdb macros

What if we could print a similar thing, but not from the BEAM, C code standpoint, but for the Erlang process that caused the crash? It turns out that it is possible, thanks to the set of macros that are available in the Erlang VM source code. They allow to dump stack, heap, message queue, inspect Erlang terms in memory and many more.

Inside the gdb session you load them like this:

(gdb) source /home/azureuser/otp_src_17.4/erts/etc/unix/etp-commands.in ... --------------- System Information --------------- OTP release: 17 ERTS version: 6.3 Compile date: Thu Jan 14 11:21:36 2016 Arch: x86_64-unknown-linux-gnu Endianess: Little Word size: 64-bit Halfword: no HiPE support: no SMP support: yes Thread support: yes Kernel poll: Supported but not used Debug compiled: no Lock checking: no Lock counting: no Node name: mongooseim@localhost Number of schedulers: 4 Number of async-threads: 10

When gdb prints some basic information about the Erlang VM and the architecture, it is a very good sign and from now on you can use all the macros from the file to analyze the core dump.

Let’s first try to find out what address was dereferenced, and by doing that learn another useful feature of gdb - you can access all local and global variables just like that. Remember what variable did the VM try to dereference and failed? Let’s try to print its value:

(gdb) print ptr $3 = <optimized out>

What a pity, it seems that we cannot access this particular variable since it was optimized out by the compiler.

But let’s take a look at what’s near by line 1095 here, maybe it is possible to workaround this issue and infer value of the variable manually.

In the C file we can see something like this:

case TAG_PRIMARY_BOXED: { ptr = boxed_val (gval); val = *ptr;

So, how about trying this:

(gdb) print gval $4 = 139760485281282

Much better. But what’s gval ? Let’s present its value to a hexadecimal number, which is 0x7F1C86142E02 . If you take a look at the boxed_val macro, you will see that it simply subtracts 2 from its argument. In this case the output (and therefore our manually deduced value of the ptr ) is 0x7F1C86142E00 . Here, the two least significant bits of the pointer (value: 2) exist as a tag for a boxed Erlang term, it might have been a tuple, a bignum or something else (at this moment we don’t need to know this), but definitely not a list.

Erlang process’ stacktrace

Let’s now try to find out more about what this ptr value means. Why don’t we start by fetching some information about the guilty process. ( p is a pointer to the Process struct, and it’s a local variable in the do_minor function, remember we can use them):

(gdb) etp-process-info p Pid: <0.2845.0> State: trapping-exit | garbage-collecting | running | active | prq-prio-normal | usr-prio-normal | act-prio-normal CP: #Cp<io_lib_format:collect/2+0x90> I: #Cp<erlang:port_info/1+0x78> Heap size: 233 Old-heap size: 376 Mbuf size: 0 Msgq len: 1 (inner=1, outer=0) Parent: <0.2759.0> Pointer: (Process *) 0x7f20a36010d0

This looks almost like a call to the erlang:process_info/1 BIF. We can also see the currently executed instruction and the continuation pointer. Let’s extend this knowledge by printing the full callstack:

(gdb) etp-stacktrace p % Stacktrace (27): #Cp<io_lib_format:collect/2+0x90>. #Cp<io_lib_format:collect/2+0x150>. #Cp<io_lib_format:fwrite/2+0x118>. #Cp<io_lib:format/2+0x80>. #Cp<wombat_plugin_code_tracer:handle_info/2+0x318>. #Cp<wombat_plugin:handle_info/2+0xc78>. #Cp<gen_server:try_dispatch/4+0xa0>. #Cp<gen_server:handle_msg/5+0x6b8>. #Cp<proc_lib:init_p_do_apply/3+0x58>. #Cp<terminate process normally>.

From this stacktrace we can tell that the guilty process was a gen_server process, that have callbacks implemented in the wombat_plugin module. Since it’s a generic module for all Wombat plugins, this particular module was implemented in the wombat_plugin_code_tracer module.

Have I forgotten to mention that the MongooseIM in question was monitored by WombatOAM?

Let’s also print the full stack (whole stackframes, not just functions):

(gdb) etp-stackdump p % Stackdump (27): #Cp<io_lib_format:collect/2+0x90>. #Cp<io_lib_format:collect/2+0x150>. []. #Cp<io_lib_format:fwrite/2+0x118>. 32. #Cp<io_lib:format/2+0x80>. []. #Cp<wombat_plugin_code_tracer:handle_info/2+0x318>. #1:[[{timeout,115},{port_op,proc_sig}],[Cannot access memory at address 0x7f1c86142e00

Oops, it looks like a corrupted term is present on the stack and moreover it matches the address of the ptr variable!

To have a full picture let’s use a macro that doesn’t try to pretty-print the terms to see the full process’ stack:

(gdb) etpf-stackdump p % Stackdump (27): #Cp<io_lib_format:collect/2+0x90>. #Cp<io_lib_format:collect/2+0x150>. []. #Cp<io_lib_format:fwrite/2+0x118>. 32. #Cp<io_lib:format/2+0x80>. []. #Cp<wombat_plugin_code_tracer:handle_info/2+0x318>. <etpf-cons 0x7cfa8121>. <-- *this is corrupted* <etpf-cons 0x6cd4ef89>. #Catch<364>. #Cp<wombat_plugin:handle_info/2+0xc78>. <etpf-cons 0x7cfa8131>. []. <etpf-boxed 0x7d1770ca>. #Cp<gen_server:try_dispatch/4+0xa0>. <etpf-boxed 0x7d1770fa>. #Catch<1386>. #Cp<gen_server:handle_msg/5+0x6b8>. #Catch<174>. #Cp<proc_lib:init_p_do_apply/3+0x58>. wombat_plugin. <etpf-boxed 0x7d1770fa>. <0.2845.0>. <0.2759.0>. <etpf-boxed 0x7d17714a>. #Cp<terminate process normally>. #Catch<126>.

Opcodes, registers and stack

The Erlang BEAM file (it’s the compiler’s output from the .erl file, more on how to generate a human-readable form of it in a moment) consists of functions (exported, local and anonymous functions defined in the module) and their code in the Erlang assembly.

In simple terms, the Erlang assembler code consists of low-level operations (you can see what instructions are understood by the Erlang VM here).

Besides instructions and labels, the assembler interpreter also uses registers for the proper control of an Erlang process flow. They are used to call functions (arguments are stored in registers 0, 1, … and the function call result is always stored in the register 0). Code references to registers as {x,0} to {x,255} .

Since some terms need to survive function calls, stack is used to memorize them. Code references to the stack elements as {y,0} , {y,1} and so on, while the {y,0} element is the first element from the top accessible to the code on the stack.

The interpreter also needs to keep track of an Instruction Pointer (IP) that is the currently executed instruction in the module and a Continuation Pointer (CP) that points to the code that will be executed after returning from the currently executed function.

When an Erlang function realizes (actually, it was the compiler) that it will call another function (in a non tail-recursive way) it allocates some space on the stack (by executing the allocate instruction) in order to migrate terms that need to survive this call. It also has to save the current value of CP on top of the stack (that’s why not the whole stack is accessible by the assembly code), as the CP value will be replaced by an address of the current function (next instruction after an instruction from the call family).

Let me explain this on an example execution of a simple module:

max ( A , B , C ) -> max ( max ( A , B ), C ). max ( A , B ) when A > B -> A ; max ( _A , B ) -> B .

The (simplified) assembly code and the execution of a max(7, 5, 9) call is explained on the following slides. State of all registers and pointers is shown after the particular instruction is executed.

Erlang assembly from rstudnicki from

So the dumped stack (just the relevant part) in a more convenient form (stackframes have been separated) looks like this:

+--------------------------------------------------------------------------------------------+ |Stack element |Ref|Function that the frame belongs to | +--------------------------------------------------------------------------------------------+ top |#Cp<io_lib_format:collect/2+0x150>. |CP |#Cp<io_lib_format:collect/2+0x90>. | . |[]. |y0 | | / \ +--------------------------------------------------------------------------------------------+ | |#Cp<io_lib_format:fwrite/2+0x118>. |CP |#Cp<io_lib_format:collect/2+0x150>.| | |32. |y0 | | | +--------------------------------------------------------------------------------------------+ | |#Cp<io_lib:format/2+0x80>. |CP |#Cp<io_lib_format:fwrite/2+0x118>. | | |[]. |y0 | | | +--------------------------------------------------------------------------------------------+ | |#Cp<wombat_plugin_code_tracer:handle_info/2+0x318>. |CP |#Cp<io_lib:format/2+0x80>. | | |<etpf-cons 0x7cfa8121> <- *this is corrupted* |y0 | | | |<etpf-cons 0x6cd4ef89>. |y1 | | | |#Catch<364>. |y2 | | | +--------------------------------------------------------------------------------------------+ bottom

In order to find out what’s the corrupted term we need to dive into the assembly code of the io_lib:format/2 . You can generate a human-friendly Erlang assembly with the following command: erlc -S lib/stdlib/src/io_lib.erl in the OTP source directory.

And here’s how the assembly of the module looks like. The snippet ends at the call to the io_lib_format:fwrite function, that we know for sure we didn’t return from.

1 { function , format , 2 , 8 }. 2 { label , 7 }. 3 { line ,[{ location , " io_lib.erl " , 151 }]}. 4 { func_info ,{ atom , io_lib },{ atom , format }, 2 }. 5 { label , 8 }. 6 { allocate , 3 , 2 }. 7 { 'catch' ,{ y , 2 },{ f , 9 }}. 8 { move ,{ x , 1 },{ y , 0 }}. 9 { move ,{ x , 0 },{ y , 1 }}. 10 { line ,[{ location , " io_lib.erl " , 152 }]}. 11 { call_ext , 2 ,{ extfunc , io_lib_format , fwrite , 2 }}.

We can tell that the {y,0} stack slot is occupied by what originally was in the {x,1} cell (line 8 of the snippet, the second argument of the function is moved to the top of the stack). Since we are inside the io_lib:format/2 function, this in fact is the data for a format string.

The function was called by wombat_plugin_code_tracer:handle_info/2 , like this (this is taken from Wombat’s code):

io_lib : format ( " ~p subject info ~p " , [ Info , subject_info ( Subject )])];

Now we just don’t know yet what message originated the call, since in the Wombat code there are many clauses that match on different messages. But let’s recall what the corrupted term looked like:

[[{ timeout , 115 },{ port_op , proc_sig }],[ Cannot access memory at address 0 x7f1c86142e00

It turns out that such term can be generated only by a port long schedule notification (from documentation), thus the corrupted term is a result of a call to subject_info/1 . In case of a port it’s a result of a call to erlang:port_info/1 , again from the Wombat code:

subject_info ( Subject ) when is_port ( Subject ) -> erlang : port_info ( Subject );

Well done, we have found out exactly what call corrupts the memory inside the Erlang VM.

Solution

If something is wrong with the erlang:port_info/1 function, let’s move to its source code. But first let’s use Google to check if by pure luck, someone else has already noticed any issue with this function.

Luckily enough, a race condition in this function has recently been fixed. After migrating to Erlang/OTP 17.5 the segmentation fault didn’t happen again. You have to admit this was a heck of a beginner’s luck.

What if…?

If you are having trouble with reproducing the segmentation fault but still want to debug your system on the gdb level, you can generate a core dump from a process that is currently running using the gcore command. Alternatively, you can attach gdb to the currently running process and play with it. Keep in mind that this freezes BEAM for the duration of the whole gdb session, things such as timers will be affected after resuming the process.

On the other hand, if you want to learn more about the parts of the Erlang VM that are not covered by the built-in macros you can still create your own! You will be required to dig into the BEAM’s C codebase in order to learn what C structure fields you’re going to access and what variables you need to decode.

You can see an example of both (attaching to the running BEAM and fetching information using a custom macro) here. In some other project I needed to verify somehow that timers inside the Erlang VM were synchronizing. Because I couldn’t find any existing functions to help me with this I decided to write a gdb macro.

Summary

gdb and a set of macros that are shipped with Erlang VM can be really useful for both post-mortem analysis and debugging a running system. Although some knowledge of BEAM internals is required for this task, you can learn a lot just from experimenting with the currently existing macros and then implementing your own.

Acknowledgments

My (many) thanks for help during my first core dump analysis go to Lukas Larsson and Simon Zelazny.