Fedora reawakens the hibernation debate


Behavioral changes can make desktop users grumpy; that is doubly true for changes that arrive without notice and possibly risk data loss. Such a situation recently arose in the Fedora 29 development branch in the form of a new "suspend-then-hibernate" feature. This feature will almost certainly be turned off before Fedora 29 reaches an official release, but the discussion and finger-pointing it inspired reveal some significant differences of opinion about how this kind of change should be managed.

Unexpected hibernation

Fedora tester Kamil Paral recently noticed a change in laptop power-management behavior in the bleeding-edge Fedora repository. A laptop that had been suspended in the evening would turn out to be hibernated the following morning. "Suspended", in this context, means that the system as a whole has been powered down, but power to memory remains active so that its contents are preserved. Such a system can be quickly restored to an operational state by powering up the rest of the hardware; all applications that were running before will still be there. "Hibernated", instead, means that the contents of main memory have been written to persistent storage so that the entire system can be powered down. Restoring the system requires reading the hibernation image back into memory from storage.
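As a rough illustration of these two states, the kernel lists the sleep states it is willing to enter in /sys/power/state, where "mem" corresponds to suspend and "disk" to hibernation. The Python sketch below reads that file. The helper names are invented for this example, and finding "disk" in the list says only that the kernel offers hibernation, not that the machine will resume from it reliably.

```python
from pathlib import Path


def parse_sleep_states(text):
    """Split the contents of /sys/power/state into a set of state names."""
    return set(text.split())


def supported_sleep_states(path="/sys/power/state"):
    """Return the sleep states the running kernel advertises.

    An empty set is returned if the file cannot be read, for example
    on a non-Linux system or inside a restricted container.
    """
    try:
        return parse_sleep_states(Path(path).read_text())
    except OSError:
        return set()


if __name__ == "__main__":
    states = supported_sleep_states()
    print("suspend (mem) offered:   ", "mem" in states)
    print("hibernate (disk) offered:", "disk" in states)
```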

The new suspend-then-hibernate feature attempts to provide the best of both worlds by suspending the system initially, then automatically hibernating three hours after suspension. The motivation for this feature is clear: a suspended system continues to drain its battery (albeit slowly), while a hibernated system does not. A hibernated system left without power for weeks can be expected to resume properly; a suspended system will typically lose everything after a handful of days. This behavior has some clear advantages, but it also raises a number of concerns, especially as an unannounced change with no easy way for the user to change it.

One reason why many users may dislike this change was expressed by Adam Williamson. A suspended system is almost immediately ready for work upon resume; a hibernated system, instead, may require a significant amount of time to read the image back into memory, followed by a period of relatively sluggish behavior as the kernel's page cache is repopulated. For users who are not concerned about the extra power used by suspend — because the system is left plugged in or because they know they will be resuming it before the battery gets low — this extra resume latency can be bothersome while providing no real advantages.

That, on its own, would be a good argument for giving users control over whether a system uses suspend-then-hibernate or simply suspends. But there is a more compelling reason as well: this feature depends on hibernation working correctly, and hibernation is not one of the best-supported features in the Linux kernel.

Hibernation, initially called "software suspend" or "suspend to disk" in kernel circles, was once a hot topic in kernel development — back in 2004. A lot of effort went into making it work, and heated battles were fought over which of two competing implementations should be in the mainline. That discussion slowed considerably around ten years ago, though, when modern suspend-to-RAM functionality became reliable. For many of us, hibernation was only something we used because suspending was not available; once the latter worked, many people never looked back.

As a result, work on hibernation dropped off, to the point that almost nothing has happened in that area in the last decade and few people test it. Some kernel features can survive years of neglect and still work well; hibernation is not one of them. It is a low-level feature that can be defeated by quirks in almost any part of the system, and there are a lot of quirky systems out there. Hibernation is the sort of feature that has to be made to work separately on every new machine. So, while hibernation works on many well-behaved machines, it is rather less reliable on many others. The only way to know for sure is to try it and see if the system resumes reliably over time. Even if hibernation generally works on a particular system, the use of other features, such as UEFI secure boot or encrypted disks, is likely to break it.

What now?

The Linux community is full of users who are happy to experiment with this kind of feature and know how to avoid losing data if something explodes. Silently hibernating all systems after three hours of suspension, though, will widely expand the group performing such experiments, bringing in users who are rather less adventurous and who are not prepared for things to fail when they open their laptops in the morning. It is, in other words, the kind of feature that seems likely to swell the ranks of former desktop Linux users.

Most of the participants in the discussion were fairly quick to reach the conclusion that suspend-then-hibernate is not an appropriate feature to silently add to every user's system with no option to turn it off. Some, such as Matthias Clasen, disagreed, though, saying: "If it works, why should we make it configurable? I don't think anybody wants their battery drained". But from there opinions diverged somewhat.

The addition of this feature to Fedora came about in two steps. The first was the addition of a suspend-to-hibernate command (later renamed suspend-then-hibernate) to systemd. The GNOME developers noticed this feature, and added a patch to automatically use it, instead of ordinary suspend, when it is available. Since GNOME chose to start using this feature, and since GNOME provides the control interface that users see, it seems natural to think that GNOME's interface should provide control over whether suspend-then-hibernate is used. But, it appears, the GNOME developers disagree with that idea.

In particular, two GNOME developers, Bastien Nocera and Clasen, argued that if this particular systemd feature does not work reliably, it should be disabled in systemd rather than in GNOME. Neither seems to see any other reason why users might want to disable its use or have any control over it in general. Nocera seems to have a longer-term grudge against Fedora's handling of power management that isn't helping here. Matthew Miller, meanwhile, has argued for some sort of easy user control that "ideally does not involve the command line", which would suggest that something needs to be made available in the GNOME interface somewhere if the feature is not to be disabled altogether.

As of this writing, what will happen next is not entirely clear. The opposition from the GNOME developers makes it relatively unlikely that control over suspend-then-hibernate will be provided at that level. In the short term, that may not matter, as few people who have looked at the situation seem to think that this feature is ready to inflict on users. So the right answer for Fedora 29 may be to just turn it off entirely.

For the longer term, a few different approaches have been suggested. Lennart Poettering posted a list of changes that, he thinks, might make the feature safe to turn on. Chris Murphy, instead, has proposed that GNOME adopt an API allowing applications to save and restore their state; he pointed out that systems like Android do not use hibernation, and suggested that GNOME-based systems should take a similar approach. That, however, would require a lot of work and is clearly not a short-term solution. Williamson suggested looking at "hybrid sleep" (where the system writes the hibernation image immediately, then suspends indefinitely) instead. Hybrid sleep would retain the resume latency of suspend, while only using the hibernation image as a last resort should the battery run out. Which of these ideas, if any, will gain traction is unclear at this point.
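The difference between plain suspend, suspend-then-hibernate, and hybrid sleep can be put into a toy model. The Python sketch below is only an illustration under loud assumptions: hibernation always works, a hibernated machine draws no power, and the battery lasts a fixed number of hours in suspend. The function, its parameters, and the numbers in the usage example are all hypothetical; only the three-hour delay comes from the behavior described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    state_preserved: bool
    resume_from: str  # "ram", "disk", or "nothing"


def outcome(strategy, hours_asleep, battery_hours, hibernate_delay_hours=3.0):
    """Model what the user finds on opening the laptop.

    battery_hours is the (assumed) time the battery can keep memory
    powered while suspended.
    """
    battery_lasted = hours_asleep < battery_hours

    if strategy == "suspend":
        # Memory stays powered; everything is lost if the battery dies.
        if battery_lasted:
            return Outcome(True, "ram")
        return Outcome(False, "nothing")

    if strategy == "hybrid-sleep":
        # The image is written up front, so it is only a fallback.
        return Outcome(True, "ram" if battery_lasted else "disk")

    if strategy == "suspend-then-hibernate":
        if hours_asleep < hibernate_delay_hours and battery_lasted:
            return Outcome(True, "ram")
        if hibernate_delay_hours <= battery_hours:
            # The machine hibernated before the battery ran out.
            return Outcome(True, "disk")
        # The battery died before the hibernation delay expired.
        return Outcome(False, "nothing")

    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    # Hypothetical overnight case: 8 hours asleep, 72 hours of suspend power.
    for name in ("suspend", "suspend-then-hibernate", "hybrid-sleep"):
        print(name, outcome(name, hours_asleep=8, battery_hours=72))
```

In this model, the overnight case resumes from RAM under both suspend and hybrid sleep, while suspend-then-hibernate pays the slower resume from disk; the strategies only differ in safety once the battery would have run out.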

One of Fedora's goals is to be the first with useful new features, but this particular one appears to not be ready even for many early adopters. Indeed, unless priorities change and some serious effort goes into developing and testing hibernation, it may never truly be safe for the user community as a whole. But there is clearly an appetite for better desktop power management, so it is good that developers are starting to explore this space again.

