In this post, we demonstrate how to use the GNU Build ID to uniquely identify a build. We explain what the GNU build ID is, how it is enabled, and how it is used in a firmware context.

Much has been written on how to craft a firmware version, from Embedded Artistry’s excellent blog post to Wolfram Rösler’s how-to.

Versions are a great way to identify a release: a set of interfaces and capabilities bundled together.

Versions do not, however, identify a specific binary all that well. For example, you could have multiple binaries for a given version in order to accommodate different variants of your hardware in the field.

For this, we need something else. This is where the GNU build ID comes in.

Why would we want to identify a specific binary? A few cases:

To match a set of debug symbols to a given binary when trying to debug a device

To verify that two binaries are in fact the same build

What is the GNU Build ID

Firmware engineers are not alone in wanting to uniquely identify a build for debugging purposes. In fact, Linux developers have long wanted to match a coredump to a specific build.

Roland McGrath of glibc fame came up with the GNU build ID 15 years ago, and contributed the implementation to various build tools.

The build ID is a 160-bit SHA1 string computed over the elf header bits and section contents in the file. It is bundled in the elf file as an entry in the notes section.

Each note section entry has the following layout:

+----------------+
|     namesz     |   32-bit, size of "name" field
+----------------+
|     descsz     |   32-bit, size of "desc" field
+----------------+
|      type      |   32-bit, vendor specific "type"
+----------------+
|      name      |   "namesz" bytes, null-terminated string
+----------------+
|      desc      |   "descsz" bytes, binary data
+----------------+

In the case of the GNU build ID:

name is "GNU\0", which gives us namesz = 4

desc is our 160-bit SHA1 value, which gives us descsz = 20

type is 3
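To make the layout concrete, here is a minimal sketch in C of how a tool might validate a note entry held in a byte buffer and locate the build ID inside it. The helper names (`align4`, `read_le32`, `find_build_id_desc`) are hypothetical and not part of any toolchain, and the sketch assumes a little-endian ELF file whose `name` and `desc` fields are each padded to a 4-byte boundary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NT_GNU_BUILD_ID 3u

/* Round n up to the next multiple of 4 (assumed padding of name and desc). */
static size_t align4(size_t n) { return (n + 3u) & ~(size_t)3u; }

/* Read a 32-bit little-endian word (assumption: little-endian ELF). */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: if buf starts with a GNU build ID note, return a
 * pointer to its desc bytes and store their count in *desc_len.
 * Returns NULL if the entry is truncated or is not a build ID note. */
static const uint8_t *find_build_id_desc(const uint8_t *buf, size_t len,
                                         size_t *desc_len) {
    if (len < 12) return NULL;              /* three 32-bit header words */

    uint32_t namesz = read_le32(buf);
    uint32_t descsz = read_le32(buf + 4);
    uint32_t type   = read_le32(buf + 8);
    if (type != NT_GNU_BUILD_ID || namesz != 4) return NULL;

    size_t name_off = 12;
    size_t desc_off = name_off + align4(namesz);
    if (desc_off > len || descsz > len - desc_off) return NULL;

    /* The string literal "GNU" is 4 bytes long including its terminator. */
    if (memcmp(buf + name_off, "GNU", 4) != 0) return NULL;

    *desc_len = descsz;
    return buf + desc_off;
}
```

For a default SHA1 build ID, `*desc_len` comes back as 20 and the returned pointer sits 16 bytes into the entry: 12 bytes of header followed by the 4-byte name.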

Adding the GNU build ID to your builds

Note: all of our examples are based on the minimal program from our Zero to main() series.

In GCC, you can enable build IDs with the -Wl,--build-id flag, which passes --build-id to the linker. You can then read the build ID back by dumping the notes section of the resulting elf file with readelf -n.

By default, this is not enabled:

francois-mba:minimal francois$ make clean all
...
arm-none-eabi-size build/minimal.elf
   text    data     bss     dec     hex filename
   1252       0    8192    9444    24e4 build/minimal.elf
...
francois-mba:minimal francois$ arm-none-eabi-readelf -n build/minimal.elf
[No output]

But a small change to the CFLAGS is all it takes:

francois-mba:minimal francois$ CFLAGS="-Wl,--build-id" make clean all build/minimal.elf
...
arm-none-eabi-size build/minimal.elf
   text    data     bss     dec     hex filename
   1288       0    8192    9480    2508 build/minimal.elf
francois-mba:minimal francois$ arm-none-eabi-readelf -n build/minimal.elf
Displaying notes found in: .note.gnu.build-id
  Owner                 Data size       Description
  GNU                  0x00000014       NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: bab6b09f86b3c3017499d8e386447a610c559bd5

As you can see, our binary now contains the build ID bab6b09f86b3c3017499d8e386447a610c559bd5.

Bundling the GNU build ID in firmware bin files

Getting the build ID in your executables on Linux is easy: they are ELF files! Firmware on the other hand typically deals with binaries which are assembled by copying relevant sections of the elf at the right offset in a file.

This is typically accomplished with objcopy:

$ arm-none-eabi-objcopy firmware.elf firmware.bin -O binary

This takes every elf section earmarked to be loaded and places them at the correct offset in the bin file. In the process, most debug sections are stripped out.

Dumping the elf sections of the resulting minimal.elf gives us:

$ arm-none-eabi-objdump -h build/minimal.elf
build/minimal.elf:     file format elf32-littlearm
Sections:
Idx Name               Size      VMA       LMA       File off  Algn
  0 .note.gnu.build-id 00000024  00000000  00000000  00010000  2**2
                       CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text              000004e4  00000024  00000024  00010024  2**2
                       CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .bss               00000000  20000000  20000000  00000000  2**0
                       ALLOC
  3 .data              00000000  20000000  20000000  00010508  2**0
                       CONTENTS, ALLOC, LOAD, DATA
  4 .stack             00002000  20000000  20000000  00020000  2**0
                       ALLOC
  5 .debug_info        00004076  00000000  00000000  00010508  2**0
                       CONTENTS, READONLY, DEBUGGING
  6 .debug_abbrev      00000ac3  00000000  00000000  0001457e  2**0
                       CONTENTS, READONLY, DEBUGGING
  7 .debug_aranges     000000f8  00000000  00000000  00015041  2**0
                       CONTENTS, READONLY, DEBUGGING
  8 .debug_ranges      000000b8  00000000  00000000  00015139  2**0
                       CONTENTS, READONLY, DEBUGGING
  9 .debug_line        00000cfa  00000000  00000000  000151f1  2**0
                       CONTENTS, READONLY, DEBUGGING
 10 .debug_str         0000111a  00000000  00000000  00015eeb  2**0
                       CONTENTS, READONLY, DEBUGGING
 11 .comment           00000075  00000000  00000000  00017005  2**0
                       CONTENTS, READONLY
 12 .ARM.attributes    0000002c  00000000  00000000  0001707a  2**0
                       CONTENTS, READONLY
 13 .debug_frame       00000298  00000000  00000000  000170a8  2**2
                       CONTENTS, READONLY, DEBUGGING

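As you can see, only some sections carry the LOAD attribute, and those are the ones objcopy copies into the bin file. The debug sections do not, and are discarded. Our linker script does not mention .note.gnu.build-id, so the linker treats it as an orphan section and places it wherever it sees fit (in this dump, at address 0x0 ahead of .text), which is not a location we control.

To place the section in our binary at a known address, we must give it an explicit rule in our linker script. Assuming your linker script declares the following memory layout: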

MEMORY
{
  rom (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00040000
  ram (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00008000
}

You can add the build ID to your flash memory with:

.gnu_build_id :
{
  PROVIDE(g_note_build_id = .);
  *(.note.gnu.build-id)
} > rom

This instructs the linker to append the contents of .note.gnu.build-id to the rom region of memory and create a symbol (g_note_build_id) to point to it.

Let’s check our elf sections now:

$ arm-none-eabi-objdump -h build/minimal.elf
build/minimal.elf:     file format elf32-littlearm
Sections:
Idx Name               Size      VMA       LMA       File off  Algn
  0 .text              000004e4  00000000  00000000  00010000  2**2
                       CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .gnu_build_id      00000024  000004e4  000004e4  000104e4  2**2
                       CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .bss               00000000  20000000  20000000  00000000  2**0
                       ALLOC
  3 .data              00000000  20000000  20000000  00010508  2**0
                       CONTENTS, ALLOC, LOAD, DATA
  4 .stack             00002000  20000000  20000000  00020000  2**0
                       ALLOC
  5 .debug_info        00004076  00000000  00000000  00010508  2**0
                       CONTENTS, READONLY, DEBUGGING
  6 .debug_abbrev      00000ac3  00000000  00000000  0001457e  2**0
                       CONTENTS, READONLY, DEBUGGING
  7 .debug_aranges     000000f8  00000000  00000000  00015041  2**0
                       CONTENTS, READONLY, DEBUGGING
  8 .debug_ranges      000000b8  00000000  00000000  00015139  2**0
                       CONTENTS, READONLY, DEBUGGING
  9 .debug_line        00000cfa  00000000  00000000  000151f1  2**0
                       CONTENTS, READONLY, DEBUGGING
 10 .debug_str         0000111a  00000000  00000000  00015eeb  2**0
                       CONTENTS, READONLY, DEBUGGING
 11 .comment           00000075  00000000  00000000  00017005  2**0
                       CONTENTS, READONLY
 12 .ARM.attributes    0000002c  00000000  00000000  0001707a  2**0
                       CONTENTS, READONLY
 13 .debug_frame       00000298  00000000  00000000  000170a8  2**2
                       CONTENTS, READONLY, DEBUGGING

As you can see, our build ID now has an address assigned to it and is marked with the LOAD attribute.

Note: Make sure to declare the .gnu_build_id section after the .text section, otherwise the build ID will be set to address 0x0 and the firmware will not boot.
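If you want to double-check that the build ID really made it into the bin file, one option is to scan the image on a host machine for the note header described earlier. The sketch below is a heuristic of our own and not a standard tool: `find_build_id_offset` is a hypothetical helper, and it assumes a little-endian image that starts at a 4-byte aligned address and a default 20-byte SHA1 build ID.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian note header for a 20-byte GNU build ID:
 * namesz = 4, descsz = 20, type = 3, name = "GNU\0". */
static const uint8_t k_build_id_header[16] = {
    4, 0, 0, 0,   20, 0, 0, 0,   3, 0, 0, 0,   'G', 'N', 'U', 0
};

/* Hypothetical helper: scan a firmware image held in memory and return the
 * offset of the first build ID byte, or -1 if no note header is found. */
long find_build_id_offset(const uint8_t *image, size_t len) {
    const size_t hdr = sizeof k_build_id_header;
    const size_t total = hdr + 20;          /* header plus SHA1 digest */

    if (len < total) return -1;

    /* Assumption: the note is 4-byte aligned within the image, so only
     * aligned offsets are checked. */
    for (size_t off = 0; off + total <= len; off += 4) {
        if (memcmp(image + off, k_build_id_header, hdr) == 0) {
            return (long)(off + hdr);
        }
    }
    return -1;
}
```

With the section layout shown above, we would expect the header to be found at offset 0x4e4 of the image and the build ID bytes to start at 0x4f4. Since this is pattern matching, a false positive is possible in principle if the same 16 bytes appear elsewhere in the image.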

Reading the build ID in firmware

Once the build ID is in our binary, we need to modify our firmware to read it and, at the very least, print it over serial at boot.

From the linker script, we know that we will find the build ID section at &g_note_build_id . From the spec, we know the structure of the section and can write down a typedef:

#include <stdint.h>

typedef struct {
    uint32_t namesz;
    uint32_t descsz;
    uint32_t type;
    uint8_t data[];
} ElfNoteSection_t;

extern const ElfNoteSection_t g_note_build_id;

We can now simply index into the data field and print the build ID data.

#include <stdio.h>

void print_build_id(void) {
    // The build ID (desc) follows the name, which occupies namesz bytes
    const uint8_t *build_id_data = &g_note_build_id.data[g_note_build_id.namesz];

    printf("Build ID: ");
    for (uint32_t i = 0; i < g_note_build_id.descsz; ++i) {
        printf("%02x", build_id_data[i]);
    }
    printf("\n");
}

Calling this code on boot, we get:

...
Build ID: 8d7aec8b900dce6c14afe557dc8889230518be3e
...

Update: A prior version of the above code was incorrect: g_note_build_id was declared as a pointer which would lead to random data being read in the best case, and a crash in the worst case. Thanks to Simon Doppler for reporting the problem.
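Printing at boot is only one option. If you would rather hand the build ID to other code, for example a version command or a crash report, a variant that formats it into a caller-supplied buffer may be handy. The following is a sketch only: `build_id_to_hex` is a hypothetical helper, and it takes the note as a parameter instead of reading the linker symbol directly so that it can be exercised on a host machine.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t namesz;
    uint32_t descsz;
    uint32_t type;
    uint8_t data[];
} ElfNoteSection_t;

/* Hypothetical helper: write the build ID as a lowercase hex string into
 * out. Returns the number of characters written (excluding the terminating
 * NUL), or 0 if out is too small to hold the whole string. */
size_t build_id_to_hex(const ElfNoteSection_t *note, char *out, size_t out_len) {
    static const char digits[] = "0123456789abcdef";

    /* desc follows name; name is assumed padded to a 4-byte boundary,
     * which is a no-op for "GNU\0" (namesz = 4). */
    const uint8_t *id = &note->data[(note->namesz + 3u) & ~3u];
    size_t needed = (size_t)note->descsz * 2u + 1u;

    if (out_len < needed) return 0;

    for (uint32_t i = 0; i < note->descsz; ++i) {
        out[2u * i]      = digits[id[i] >> 4];
        out[2u * i + 1u] = digits[id[i] & 0x0fu];
    }
    out[needed - 1u] = '\0';
    return needed - 1u;
}
```

On target, you would call it as `build_id_to_hex(&g_note_build_id, buf, sizeof buf)` with a buffer of at least 41 bytes for a SHA1 build ID.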

References