Removing support for Emacs unexec from Glibc


The Emacs editor requires a lot of Lisp code and program state before it can start doing its job. That led Emacs developers to add the "unexec" feature to quickly load all of that at startup, but unexec has always been something of a hack. It employs a fairly ugly (and intrusive) mechanism to do its job. Some non-standard extensions to the GNU C library (Glibc) are required, so a plan to eventually eliminate those extensions was met with some dismay in the Emacs community.

As part of the Emacs build process, a simpler version of the editor, called "temacs", is built. That program consists of all of the C files in Emacs, which comprise the Emacs Lisp interpreter and not much else. It is then run to load the standard Lisp startup files and to dump a copy of the running program. That dump is then used as the binary that users invoke when they want to run Emacs.

The mechanism used is in the Emacs unexec() function that converts the running program into a new executable. To do that, it needs to handle memory that was allocated by malloc(), which also requires that the Glibc internal tracking and housekeeping data structures for memory allocation be preserved. The dumping (and restoring) mechanism uses malloc_get_state() and malloc_set_state() to do that, which are the extensions that Glibc developers would like to eliminate.
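
To see why the allocator's bookkeeping has to travel with the heap contents, consider a toy model. The sketch below is not Glibc or Emacs code; the bump allocator and every name in it (toy_heap, toy_alloc(), toy_dump(), and so on) are hypothetical and exist only to illustrate the problem. Restoring the data without the bookkeeping leaves the allocator believing the heap is empty, so the next allocation lands on top of live data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an allocator: a fixed arena plus its bookkeeping.
   Hypothetical names throughout; real allocators are far more complex. */
#define ARENA_SIZE 256

struct toy_heap {
    unsigned char arena[ARENA_SIZE]; /* the "heap" contents */
    size_t next_free;                /* allocator bookkeeping */
};

/* Bump-allocate n bytes; returns an offset into the arena,
   or (size_t)-1 if the arena is exhausted. */
static size_t toy_alloc(struct toy_heap *h, size_t n)
{
    assert(h != NULL);
    if (n > ARENA_SIZE - h->next_free)
        return (size_t)-1;
    size_t off = h->next_free;
    h->next_free += n;
    return off;
}

/* "Dump": copy the heap contents and the bookkeeping into an image. */
static void toy_dump(const struct toy_heap *h, struct toy_heap *image)
{
    memcpy(image, h, sizeof *h);
}

/* "Restore": reinstate both the contents and the bookkeeping. */
static void toy_restore(struct toy_heap *h, const struct toy_heap *image)
{
    memcpy(h, image, sizeof *h);
}

/* Broken restore, for contrast: the bytes come back, but the allocator
   forgets that any of them are in use. */
static void toy_restore_data_only(struct toy_heap *h,
                                  const struct toy_heap *image)
{
    memcpy(h->arena, image->arena, ARENA_SIZE);
    h->next_free = 0;
}
```

After toy_restore(), a new allocation is placed after the restored data; after toy_restore_data_only(), it is placed at offset zero and overlaps it. In this toy, the bookkeeping is a single integer; in a real allocator it is a set of internal structures whose layout the library would like to be free to change.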

In mid-January, Florian Weimer posted a heads-up message about the change to the emacs-devel mailing list. He said that it was likely coming this year and was being done to allow changes to the "heap layout between glibc releases, for standards conformance, performance and security improvements". The intent is that existing Emacs binaries will still continue to work, but that at some point the Emacs build would have to change, he continued. He also noted that supporting existing binaries "causes a significant ongoing maintenance overhead for glibc upstream, basically maintaining a separate malloc implementation indefinitely".

Emacs maintainer John Wiegley voiced his concerns that the alterations needed to Emacs would be "a rather significant change to low-level code that has been functioning for a very long time". He suggested extending the timeline for the change and discussing ways to provide the same or similar functionality. Weimer, though, believes that it is "time to tackle it at the root" by fixing Emacs, which is seen as the only user of the interfaces:

If you cannot make the changes yourself, we'll have to make sure that you have a replacement available when the need arises. Even if we have to do the development, it will pay off fairly soon on the toolchain side.

A change "forced" on Emacs by another GNU project was always likely to get the attention of Richard Stallman. He sent a message to the private, and largely unused, mailing list for the Glibc steering committee (glibc-sc); Mark Brown reposted it to the normal Glibc mailing list (libc-alpha). The steering committee does not really exist in the form of that list anymore, which led Carlos O'Donell to suggest that the mailing list be closed. In the reposted message, Stallman largely echoed Wiegley:

When you consider making a change that will break other GNU packages in a way that is hard to fix, you must not decide unilaterally. All the more so, when the package is as important as Emacs. We need package maintainers to cooperate. Please have a discussion with the Emacs developers and decide together which course of action is best for the GNU system, and what time scale for that action fits the release schedules best.

Stallman also mentioned that he had contacted the Glibc maintainers in the emacs-devel thread. That led Weimer to wonder why it was appropriate to move the discussion to a private list: "This doesn't really match how glibc development proceeds today." Stallman, though, thought that it would be better discussed in private: "This a sensitive issue; it is best to discuss it without an audience."

That particular cat was out of the bag at that point, however. The conversation in emacs-devel proceeded by looking at the dump/load functionality, whether it is still needed, and ways to implement it that don't require intimate knowledge of the internals of Glibc's memory-allocation techniques.

Ali Bahrami, who works on the Solaris linker, wondered if it even made sense to continue to support the unexec functionality. Computers have gotten a lot faster since that optimization was added, so it might make sense to simply leave it behind:

Before you fight to to save unexec, I'd encourage you to measure the impact, and see if it still matters. If it does, then it would be worthwhile to consider other means for getting those bytes into memory quickly that don't involve second guessing object layout, memory allocation, and process layout. Speaking as a linker guy, linking is only going to get more dynamic, and more complex, going forward. You might be glad, down the road, to be out of that game.

So Bahrami and others set out to measure the difference between starting Emacs and starting temacs (which must load all of the different startup files). It became clear that there is still a substantial difference: roughly half a second for the dumped Emacs versus more than five seconds for temacs, though that depends on various factors. Different tricks were tried, with some success, but the main problem remained.

David Caldwell suggested one possible approach: taking the compiled Lisp files (.elc files) that temacs loads and adding them into the binary.

What if one were to smash all the .elc files that temacs loads as part of the undump process and then put them into the temacs object, maybe by relinking them in as a giant array (portable), or by shoving them into their own section with objcopy (not really portable)?
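
The "giant array" variant of that idea might look something like the following minimal sketch. It is only an illustration, not anything proposed in the thread: the embedded_file structure, the find_embedded() lookup, the file names, and the placeholder bytes are all hypothetical, and a real build would generate the arrays from the actual .elc files.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One compiled Lisp file baked into the binary (hypothetical layout). */
struct embedded_file {
    const char *name;           /* name the loader would ask for */
    const unsigned char *data;  /* file contents */
    size_t size;                /* length of data in bytes */
};

/* Placeholder contents; a build step would generate these arrays. */
static const unsigned char first_bytes[] = "placeholder bytes for first.elc";
static const unsigned char second_bytes[] = "placeholder bytes for second.elc";

/* Table of everything linked in; sizes exclude the trailing NUL. */
static const struct embedded_file embedded_files[] = {
    { "first.elc", first_bytes, sizeof first_bytes - 1 },
    { "second.elc", second_bytes, sizeof second_bytes - 1 },
};

/* Look up an embedded file by name; returns NULL if it is not present. */
const struct embedded_file *find_embedded(const char *name)
{
    assert(name != NULL);
    size_t n = sizeof embedded_files / sizeof embedded_files[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(embedded_files[i].name, name) == 0)
            return &embedded_files[i];
    return NULL;
}
```

A loader built this way would call find_embedded() before going to the filesystem. Note that this only removes the cost of finding and reading the files; the bytecode would still have to be evaluated at startup, which is the part that unexec avoids.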

Stallman seemed interested in that idea, "but the real test is in implementing it."

According to Paul Eggert, making unexec more portable has been on the to-do list for a while, "and this will light more of a fire under it". Concerns raised earlier in the thread, that Emacs might not build against a new Glibc API (which has not even been written yet), are not a problem, he said. "Emacs should still build and run even if the glibc API is changed, as Emacs ./configure probes for the glibc malloc-related API and falls back on its own malloc implementation otherwise."

In another message, Eggert outlined how Emacs uses unexec and how it might be able to get along without it: "Emacs could live without the current unexec in a semi-portable way by doing what XEmacs does, which is to write out data and mmap it in later". He does not know the details of the XEmacs approach, though, and suggested other possibilities as well.
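
The general shape of "write out data and mmap it in later" can be sketched with plain POSIX calls. This is not the XEmacs implementation, whose details are not described here; the dump_image structure, the DUMP_MAGIC value, and the dump_write(), dump_map(), and dump_unmap() helpers are all hypothetical. The sketch also sidesteps the hard part by using data that contains no pointers; a real dump of Lisp objects would have to relocate pointers or guarantee a fixed mapping address.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, pointer-free image format. */
#define DUMP_MAGIC 0x44554d50u

struct dump_image {
    unsigned magic;  /* sanity check when mapping the file back in */
    int values[4];   /* stand-in for preloaded program state */
};

/* Write the image to path; returns 0 on success, -1 on failure. */
int dump_write(const char *path, const struct dump_image *img)
{
    assert(path != NULL && img != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, img, sizeof *img);
    int rc = close(fd);
    return (n == (ssize_t) sizeof *img && rc == 0) ? 0 : -1;
}

/* Map the image read-only; returns NULL on any failure. */
const struct dump_image *dump_map(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0
        || st.st_size != (off_t) sizeof(struct dump_image)) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, sizeof(struct dump_image), PROT_READ,
                   MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    const struct dump_image *img = p;
    if (img->magic != DUMP_MAGIC) {
        munmap(p, sizeof(struct dump_image));
        return NULL;
    }
    return img;
}

/* Release a mapping obtained from dump_map(). */
void dump_unmap(const struct dump_image *img)
{
    munmap((void *) img, sizeof *img);
}
```

The build would call dump_write() once after loading the startup files, and each later start would call dump_map() instead of repeating that work. Nothing in this approach depends on the allocator's internal layout, which is the property the Glibc developers are after.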

The discussion continued, looking at various ways for Emacs to accomplish its goals without requiring Glibc to maintain the same heap layout forever. If there was a need to consider the matter privately, it certainly wasn't apparent in the thread. As with many changes of this sort, developers on both sides of the change simply worked things out. Perhaps there will be glitches down the road, but there was plenty of notice and it seems like a direction that the Emacs developers wanted to go in anyway, so it is hard to see any major potholes in the road ahead.