Is pre-linking worth it?

The recent problem with prelink in Fedora Rawhide has led some to wonder what advantages pre-linking actually brings, and whether those advantages outweigh the pain it can cause. Pre-linking can reduce application startup time, and save some memory as well, but there are downsides, not least the possibility of an unbootable system, as some Rawhide users encountered. The advantages are small enough, or hard enough to quantify, that it is fair to ask whether pre-linking is justified as the default for Fedora.

Linux programs typically consist of a binary executable file that refers to multiple shared libraries. These libraries are loaded into memory once and shared by multiple executables. In order to make that happen, the dynamic linker (i.e. ld.so) needs to change the binary in memory so that any addresses of library objects point to the right place. For applications with many shared libraries (GUI programs, for example), that process can take some time.
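As a rough illustration of what that address fix-up involves, the following Python sketch models a toy loader. It is not how ld.so is implemented: the names (Library, relocate), the data layout, and every address are invented for this example, and real relocation handles many more cases.

```python
from dataclasses import dataclass


@dataclass
class Library:
    """A toy shared library: a name, a load address, and symbol offsets."""
    name: str
    base: int                 # address where the library was mapped
    symbols: dict[str, int]   # symbol name -> offset from base


def relocate(slots: dict[int, str], libraries: list[Library]) -> dict[int, int]:
    """Resolve each relocation slot to an absolute address.

    `slots` maps a slot position in the executable to the symbol it needs.
    The first library in the list that defines the symbol wins, loosely
    mirroring a search-order lookup.
    """
    patched = {}
    for slot, symbol in slots.items():
        for lib in libraries:
            if symbol in lib.symbols:
                patched[slot] = lib.base + lib.symbols[symbol]
                break
        else:
            # No library defined the symbol.
            raise LookupError(f"undefined symbol: {symbol}")
    return patched


if __name__ == "__main__":
    # Made-up libraries, bases, and offsets, purely for illustration.
    libc = Library("libc.so.6", 0x7F0000000000, {"printf": 0x1000, "malloc": 0x2000})
    libm = Library("libm.so.6", 0x7F0000400000, {"cos": 0x300})
    slots = {0x10: "printf", 0x18: "cos", 0x20: "malloc"}
    for slot, address in relocate(slots, [libc, libm]).items():
        print(f"slot {slot:#x} -> {address:#x}")
```

Even in this toy form, the work grows with the number of slots and the number of libraries that must be searched, which is why programs with many shared libraries pay the most.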

The idea behind pre-linking is fairly simple: reduce the amount of time the dynamic linker needs to spend doing these address relocations by doing them in advance and storing the results. The prelink program processes ELF binaries and shared libraries in much the same way that ld.so would, and then adds special ELF sections to the files describing the relocations. When ld.so loads a pre-linked binary or library, it checks these sections and, if the libraries are loaded at the expected locations and have not changed, it can do its job much more quickly.
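The check described above can be sketched as a simple predicate. This is a simplified model under stated assumptions, not the actual ld.so logic: PrelinkRecord, LoadedLibrary, and the "fingerprint" field are hypothetical stand-ins for whatever the special ELF sections really record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrelinkRecord:
    """What the (hypothetical) pre-link step recorded for one library."""
    expected_base: int
    fingerprint: str   # stand-in for whatever identifies the library build


@dataclass(frozen=True)
class LoadedLibrary:
    """What the loader observes when it maps the library at run time."""
    actual_base: int
    fingerprint: str


def can_skip_relocation(recorded, loaded):
    """Return True only if every library matches its pre-link record.

    Both arguments are dicts keyed by library name. Any mismatch means the
    loader has to fall back to normal relocation.
    """
    if recorded.keys() != loaded.keys():
        return False  # the set of dependencies changed since pre-linking
    return all(
        loaded[name].actual_base == rec.expected_base
        and loaded[name].fingerprint == rec.fingerprint
        for name, rec in recorded.items()
    )
```

The point of the model is that the fast path is all-or-nothing per check: a moved library or an upgraded one is enough to lose the benefit until prelink is run again.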

But there are a few problems with that approach. For one thing, it makes the location of shared libraries very predictable. One of the ideas behind address space layout randomization (ASLR) is to randomize these locations each time a program is run—or library loaded—so that malicious programs cannot easily and reproducibly predict addresses. On Fedora and Red Hat Enterprise Linux (RHEL) systems, prelink is run every two weeks with a parameter to request random addresses to alleviate this problem, but they do stay fixed over that time period.
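One way to see how predictable library addresses are on a given system is to look at where each library is mapped, which Linux exposes in /proc/<pid>/maps. The sketch below is a minimal parser for that format; the function name library_bases is an invention for this example, and the ".so" substring test is a crude heuristic rather than a reliable way to identify shared libraries.

```python
import os


def library_bases(maps_text: str) -> dict[str, int]:
    """Map each shared-library path to its lowest mapped start address.

    Expects text in the /proc/<pid>/maps format:
        start-end perms offset dev inode pathname
    """
    bases = {}
    for line in maps_text.splitlines():
        fields = line.split(maxsplit=5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5].strip()
        if ".so" not in path:
            continue  # heuristic: keep only shared-library mappings
        start = int(fields[0].split("-")[0], 16)
        bases[path] = min(start, bases.get(path, start))
    return bases


if __name__ == "__main__":
    # Only meaningful on Linux; prints nothing elsewhere.
    if os.path.exists("/proc/self/maps"):
        with open("/proc/self/maps") as maps:
            for path, base in sorted(library_bases(maps.read()).items()):
                print(f"{base:#x}  {path}")
```

Running such a script several times and comparing the output gives a quick impression of whether the base addresses change from run to run or stay fixed.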

In addition, whenever applications or libraries are upgraded, prelink must be run again. The linker is smart enough to recognize the situation and revert to its normal linking process when something has changed, but the advantage that prelink brings is lost until the pre-linking is redone. Also, the kernel randomly locates the VDSO (virtual dynamically-linked shared object) "library", which, on 32-bit systems, can overlap one of the libraries, requiring some address relocation anyway. Overall, pre-linking is a bit of a hack, and it is far from clear that its benefits are substantial enough to overcome that.
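The VDSO problem is, at bottom, an interval-overlap question: does the randomly placed VDSO land inside the address range that a pre-linked library expects to occupy? A minimal sketch, with invented function names and made-up 32-bit-style addresses:

```python
def overlaps(a_start: int, a_size: int, b_start: int, b_size: int) -> bool:
    """True if half-open ranges [a_start, a_start + a_size) and
    [b_start, b_start + b_size) share at least one address."""
    return a_start < b_start + b_size and b_start < a_start + a_size


def libraries_needing_relocation(vdso, prelinked):
    """Return the names of pre-linked libraries whose range the VDSO hits.

    `vdso` is a (start, size) pair; `prelinked` maps a library name to the
    (start, size) range that pre-linking assigned to it.
    """
    return sorted(
        name
        for name, (start, size) in prelinked.items()
        if overlaps(start, size, *vdso)
    )


if __name__ == "__main__":
    # Illustrative numbers only, not taken from a real system.
    prelinked = {
        "libc.so.6": (0x00110000, 0x160000),
        "libm.so.6": (0x00300000, 0x030000),
    }
    print(libraries_needing_relocation((0x00120000, 0x1000), prelinked))
```

Any library returned by such a check would have to be relocated at run time despite having been pre-linked.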

Fedora and RHEL enable pre-linking by default, while most other distributions make prelink available but seem unconvinced that the benefits are substantial enough to make it the default. Because it is a very system-dependent feature, hard performance numbers are difficult to find. It certainly helps in some cases, but is it really something that everyone needs?

Matthew Miller brought that question up on the fedora-devel mailing list:

I see [prelink] as adding unnecessary complexity and fragility, and it makes forensic verification difficult. Binaries can't be verified without being modified, which is far from ideal. And the error about dependencies having changed since prelinking is disturbingly frequent. On the other hand, smart people have worked on it. It's very likely that those smart people know things I don't. I can't find any good numbers anywhere demonstrating the concrete benefits provided by prelink. Is there data out there? [...] Even assuming a benefit, the price may not be worth it. SELinux gives a definite performance hit, but it's widely accepted as being part of the price to pay for added security. Enabling prelink seems to fall on the other side of the line. What's the justification?

Glibc maintainer Ulrich Drepper noted that pre-linking avoids most or all of the cost of relocations, while also pointing out that the relatively new symbol table hashing feature in the GNU toolchain reduces the gain from pre-linking. He also described an additional benefit: memory pages that do not require changes for relocations will not be copied (due to copy-on-write) and can thus be shared between multiple processes running the same executable. But his primary motivation may have more to do with his workflow: "Note, also small but frequently used apps benefit. I run gcc etc a lot and like every single saved cycle."

The effect of pre-linking can be measured by using the LD_DEBUG environment variable, as Drepper described. Jakub Jelinek, who is the author of prelink, posted some results for OpenOffice.org Writer showing an order of magnitude difference in the amount of time spent doing relocations between pre-linked and regular binaries. Those results are impressive, but, at least for long-running programs, startup time doesn't really dominate; desktop applications and often-used utilities are the likely beneficiaries. As Miller puts it:
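For readers who want to try such a measurement, here is a minimal sketch. It assumes the "statistics" setting of LD_DEBUG, which glibc's dynamic linker supports and which prints a "time needed for relocation" line to standard error; the exact wording of that line has varied between glibc versions, so the pattern below is a best guess rather than a specification, and the helper names are invented.

```python
import os
import re
import subprocess
import sys

# Matches lines such as
#   "time needed for relocation: 553981 clock cycles (28.0%)"
# and the shorter "... cycles" wording. The format is an assumption.
_RELOC_RE = re.compile(r"time needed for relocation:\s*(\d+)\s+(?:clock )?cycles")


def parse_relocation_cycles(debug_output: str) -> list[int]:
    """Return every relocation cycle count found in the debug output.

    A list is returned because each process started by the command
    prints its own statistics block.
    """
    return [int(match.group(1)) for match in _RELOC_RE.finditer(debug_output)]


def measure(command: list[str]) -> list[int]:
    """Run `command` with LD_DEBUG=statistics and parse its stderr."""
    env = dict(os.environ, LD_DEBUG="statistics")
    result = subprocess.run(
        command,
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
        check=False,
    )
    return parse_relocation_cycles(result.stderr)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example invocation: python reloc_stats.py cat /dev/null
    print(measure(sys.argv[1:]))
```

Comparing the numbers for the same program before and after running prelink on it would give a local version of the comparison Jelinek posted, though cycle counts vary from run to run and a single sample proves little.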

If I can get a 50% speed up to a program's startup times, that sounds great, but if I then leave that program running for days on end, I haven't actually won very much at all -- but I still pay the price continuously. (That price being: fragility, verifiability, and of course the prelinking activity itself.)

For 32-bit processors, though, which are those most likely to benefit from the memory savings, there is still the VDSO overlap problem. John Reiser did an experiment using cat and found that glibc needed to be dynamically relocated fairly frequently:

This means that glibc must be dynamically relocated about 10% of the time anyway, even though glibc has been pre-linked, and even though /bin/cat is near minimal in its use of shared libraries. When a GNOME app uses 50 or more pre-linked shared libs, as claimed in another thread on this subject, then runtime conflict and expense are even more likely.

There doesn't seem to be much interest in removing the prelink default for Fedora, but one has to wonder why, if the savings are as large and widespread as people seem to think, other distributions have been reluctant to adopt it. Part of the reason may be the possibility of a prelink bug rendering systems unbootable, or a reluctance to rely upon something that requires regularly modifying binaries and libraries to keep everything in sync. The security issues may also play into their thinking, though Jelinek argues that security-sensitive programs should be position-independent executables (PIE) that are not pre-linked, and thus have ASLR done for every execution.