Striking gold in binutils

A new linker is not generally something that arouses much interest outside of the hardcore development community (or even inside it) unless it provides something especially eye-opening. A newly released linker, called gold, has just that kind of feature: it runs up to five times as fast as its competition. For developers who do a lot of compile-link-test cycles, that kind of speedup can significantly improve their efficiency.

Linking is an integral part of code development, but it can be invisible, as it is often invoked by the compiler. The sidebar accompanying this article is meant for non-developers or those in need of a refresher about linker operation. For those who want to know even more, the author of gold, Ian Lance Taylor, has a twenty-part series about linker internals on his weblog, starting with this entry.

For Linux systems, the GNU Compiler Collection (GCC) has been the workhorse by providing a complete toolchain to build programs in a number of different languages. It uses the ld linker from the binutils collection. With the announcement that gold has been added to binutils, there are now two choices for linking GCC-compiled programs.

A linker overview

For non-developers, a quick overview of the process that turns source code into executable programs may be helpful. Compilers are programs that turn C, or other high-level languages, into object code. Linkers then collect up object code and produce an executable. Usually the linker will not only operate on object code created from a project's source, but will also reference libraries of object code, the C runtime library libc for example. From those objects, the linker creates an executable program that a user can invoke from the command line. The linker allows program code in one file to refer to a code or data object in another file or library. It arranges that those references are usable at run time by substituting an address for the reference to an object. This "links" the two properly in the executable. Things get more complicated when considering shared libraries, where the library code is shared by multiple concurrent executables, but this gives a rough outline of the basics of linker operation.
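
The address substitution described in the overview can be illustrated with a toy model. The sketch below is not how ld or gold is implemented; the ObjectFile class, the link() function, the base address, and the object names and sizes are all made up for illustration. It only shows the two basic passes: decide where every symbol ends up, then replace each symbolic reference with that address.

```python
# Toy illustration only: real object files and linkers (ld, gold) are far
# more involved. Every name and number here is hypothetical.
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    name: str
    size: int                                     # bytes of code/data in this object
    defines: dict = field(default_factory=dict)   # symbol -> offset within this object
    refs: list = field(default_factory=list)      # (offset of the reference, symbol name)


def link(objects, base=0x1000):
    """Lay objects out back to back, then resolve every reference to an address."""
    # Pass 1: give each object a start address and record where each symbol lands.
    symbols = {}
    starts = {}
    addr = base
    for obj in objects:
        starts[obj.name] = addr
        for sym, offset in obj.defines.items():
            if sym in symbols:
                raise ValueError(f"duplicate symbol: {sym}")
            symbols[sym] = addr + offset
        addr += obj.size

    # Pass 2: substitute an address for each symbolic reference.
    patches = {}
    for obj in objects:
        for offset, sym in obj.refs:
            if sym not in symbols:
                raise ValueError(f"undefined reference to {sym}")
            patches[starts[obj.name] + offset] = symbols[sym]
    return symbols, patches


# main.o calls puts, which is defined in a (pretend) libc object.
main_o = ObjectFile("main.o", size=0x40, defines={"main": 0x0}, refs=[(0x10, "puts")])
libc_o = ObjectFile("libc.o", size=0x80, defines={"puts": 0x20})

symbols, patches = link([main_o, libc_o])
print({name: hex(a) for name, a in symbols.items()})
print({hex(where): hex(target) for where, target in patches.items()})
```

Leaving libc.o out of the list makes link() fail with an "undefined reference to puts" error, which is the toy equivalent of a familiar linker complaint.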

The intent is for gold to be a complete drop-in replacement for ld, though it is not quite there yet. It currently lacks support for some command-line options, and Linux kernels that are linked with it do not boot, but those things will come. It also currently supports only x86 and x86_64 targets; for many linker jobs, though, gold seems to be working well. The speed is very enticing to some developers, with Bryan O'Sullivan saying:

When I switched to using gold as the linker, I was at first a little surprised to find that it actually works at all. This isn't especially common for a complicated program that's just been committed to a source tree. Better yet, it's as fast as Ian claims: my app now links in 2.6 seconds, almost 5.4 times faster than with the old binutils linker!

Performance was definitely the goal that Taylor set for gold development. It supports ELF (Executable and Linking Format) objects and runs on UNIX-like operating systems only. Supporting only one object/executable format, a fresh start, and an explicit performance goal are some of the reasons that gold outperforms ld.
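
To give a feel for what a single, fixed format buys a tool, here is a small sketch that recognizes an ELF file and its target from the first 20 bytes of the header. It has nothing to do with gold's own code; the describe_elf() function and the sample header are hypothetical. The magic number, the class and byte-order bytes, and the e_machine values for x86 (3) and x86_64 (62) come from the ELF specification.

```python
import struct

# e_machine values from the ELF specification for the two targets gold supports.
EM_386, EM_X86_64 = 3, 62


def describe_elf(header: bytes) -> str:
    """Classify a file from the first 20 bytes of its ELF header (illustrative helper)."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return "not ELF"
    ei_class, ei_data = header[4], header[5]
    bits = {1: "32-bit", 2: "64-bit"}.get(ei_class, "unknown class")
    endian = {1: "<", 2: ">"}.get(ei_data)   # 1 = little-endian, 2 = big-endian
    if endian is None:
        return "ELF, unknown byte order"
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    machine = {EM_386: "x86", EM_X86_64: "x86_64"}.get(e_machine, "other")
    return f"ELF {bits} {machine}"


# A hand-built header: 64-bit, little-endian, relocatable object (e_type 1), x86_64.
sample = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9) + struct.pack("<HH", 1, EM_X86_64)
print(describe_elf(sample))
```

To try it on a real file, read the first 20 bytes of any object file or executable in binary mode and pass them to describe_elf().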

Tom Tromey likes the looks of the code:

I looked through the gold sources a bit. I wish everything in the GNU toolchain were written this way. It is very clean code, nicely commented, and easy to follow. It shows pretty clearly, I think, the ways in which C++ can be better than C when it is used well.

Because the implementation is geared for speed, Taylor used techniques that may confuse some. He has some concerns about the maintainability of his implementation:

While I think this is a reasonable approach, I do not yet know how maintainable it will be over time. State machine implementations can be difficult for people to understand, and the high-level locking is vulnerable to low-level errors. I know that one of my characteristic programming errors is a tendency toward code that is overly complex, which requires global information to understand in detail. I've tried to avoid it here, but I won't know whether I succeeded for some time.

Overall, it seems to be getting a nice reception from the community, with O'Sullivan commenting that he is "looking forward to the point where gold entirely supplants the existing binutils linker. I expect that won't take too long, once Mozilla and KDE developers find out about the performance boost." Taylor is already thinking about the next big step once gold gets to that point: concurrent linking, which means running the compiler and linker at the same time.

There are two other ongoing projects that are working with the greater GCC ecosystem in interesting ways: quagmire and ggx. Quagmire is an effort to replace the GNU configure and build system (consisting of autoconf, automake, and libtool) with something that depends solely on GNU make. Currently, that system uses various combinations of the shell, m4, and portable makefiles to make the building and installation of programs easy, using the famous "./configure; make" command line. The tools were written that way to try to ensure that users did not need to install additional packages to configure and build GNU tools. Quagmire, which has roots in a posting by Taylor, recognizes that GNU make is ubiquitous, so basing a system around that makes a great deal of sense.

The ggx project is Anthony Green's step-by-step procedure to create an entire toolchain that can build programs for a processor architecture that he is creating as a thought experiment. The basic idea is to design the instruction set based on the needs of the compiler, in this case GCC, rather than the needs of the hardware designers. He is using GCC's ability to be retargeted for new architectures, along with its simulation capabilities, to create a CPU that he can write programs for. As of this writing, he has a "hello world" program working, along with large chunks of the GCC test suite passing. Well worth a look.