There’s a vast amount of material out there on boot times, with people showcasing boot times of as little as one second [1]. But the reality is often different for devices in the field: some boot in 10s or less, others take over 3 minutes. Here’s a handful of devices I measured:

– Raspberry Pi 2 Model B with Raspbian GNU/Linux 8: 11s to shell prompt

– Garmin Nüvi 42 Sat Nav: 14s (detects power off after 9s)

– Beaglebone Black with Angstrom Distribution: 17s to shell prompt

– PC booting Ubuntu 14.04 with KDE UI (no login): 37s

– Android 5.1 Moto X smart phone: 42s

– PC booting Fedora 19 with Gnome UI from a USB stick: 43s

– PC booting Mint 17.2 KDE from USB stick: 90s

– Pace Linux set top box with secure boot, middleware + UI: 180s

– Virgin Media TiVo box with secure boot, middleware + UI: 190s

There are a number of reasons why these boot times vary so drastically, and a number of things we can do to optimise boot time, but there is always a trade-off against functionality and against the development time and effort needed to make the reductions.

Steps in Booting

Elinux.org has a good definition of terms [2]. We can summarise the stages as follows.

1. Power up / hardware boot – often there is secure/trusted boot code in the core processor.

2. Software boot using a Bootloader* which searches for a runtime image (Kernel + rootfs) to decompress, load and run, or downloads a new image if required.

3. Sometimes the Bootloader is in two stages, with the second stage upgradeable.

4. Kernel initialisation, decompress/mount root file system.

5. User space initialisation. At the end of this a shell/login prompt is displayed.

6. Application platform initialisation. This could be a Java virtual machine, set top box middleware, or a number of third party programs designed to support the user interface.

7. User interface up and running.

*Note: older PCs use BIOS; in the past few years more are using UEFI secure boot.

If you’re running all the above, you will not get a 10s boot, at least not easily.

– Secure devices, e.g. set top boxes or mobile phones, rely on a secure boot loader which may take several seconds to check the image has been signed and authenticated.

– Decompression can take several seconds and depends on the size of the image and the decompression technology used.

– Devices and drivers will have dependencies, as will services set up in user space.

– Applications and their platforms can take a long time to load and run. This is analogous to a PC boot where you’re asking for a number of programs (e.g. Skype) to run at startup.

Measuring Boot Time

Once booted, typing systemd-analyze at the debug terminal should give you a rough breakdown of boot times as follows:

Startup finished in 1.029s (kernel) + 2.517s (initrd) + 2min 17.383s (userspace)

Note that the above example is from a Fedora PC which had loaded its Gnome user interface after 40s, yet things were still going on in the userspace boot for nearly 2 minutes afterwards! This is a recurring theme in Linux boots: the bootloader and kernel generally take far less time to start than userspace. If userspace services take a long time to start, a common trick is to make the device at least partially usable before they have all finished.
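To compare stages numerically, the summary line can be split into seconds per stage. The following is a minimal sketch, assuming the line has exactly the format shown above (values in seconds, optionally preceded by minutes); other units that systemd-analyze may print, such as milliseconds, are not handled, and the function name parse_summary is made up for this example.

```python
import re

# Matches one stage such as "2min 17.383s (userspace)" or "1.029s (kernel)".
STAGE = re.compile(r"(?:(\d+)min\s+)?([\d.]+)s\s+\((\w+)\)")

def parse_summary(line):
    """Return {stage_name: seconds} for a systemd-analyze summary line."""
    stages = {}
    for minutes, seconds, name in STAGE.findall(line):
        stages[name] = int(minutes or 0) * 60 + float(seconds)
    return stages

line = ("Startup finished in 1.029s (kernel) + 2.517s (initrd) "
        "+ 2min 17.383s (userspace)")
stages = parse_summary(line)
total = sum(stages.values())
for name, secs in stages.items():
    # Print each stage with its share of the total boot time.
    print(f"{name:10s} {secs:8.3f}s  {100 * secs / total:5.1f}%")
```

For the Fedora example, this makes it plain that userspace accounts for almost all of the measured time.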

systemd-analyze blame lists all the services running in order of time taken to initialize (longest first). This might give you a lead: a process that jumps out as taking too long to initialise, but it won’t give you the whole picture as we don’t know the order and dependencies of the services.

systemd-analyze critical-chain prints a tree of dependent units which form a critical path at boot time. But again it doesn’t give us a complete picture.

The most oft-used (and easiest) method is to insert time stamps into the debug print log. If your terminal program doesn’t support this, it can be enabled in the kernel using printk.time=1, but you will only get timings once the kernel is up and running. That said, you may not get any debug from the bootloader anyway, unless you run a debug version, which will likely run slower than the release version!
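Once a log carries timestamps, the largest gaps between consecutive lines point at the slow stages. Here is a minimal sketch, assuming each line starts with a bracketed [HH:MM:SS.mmm] stamp of the kind a timestamping terminal program adds; the helper name largest_gaps and the abbreviated sample lines are only illustrative, and a log that runs past midnight is not handled.

```python
import re
from datetime import datetime

# A line is expected to look like "[11:56:50.543] some text".
STAMP = re.compile(r"^\[(\d{2}:\d{2}:\d{2}\.\d{3})\]\s*(.*)$")

def largest_gaps(lines, top=3):
    """Return the `top` biggest (gap_seconds, text_before, text_after) tuples."""
    events = []
    for line in lines:
        m = STAMP.match(line.strip())
        if m:  # lines without a timestamp are ignored
            stamp = datetime.strptime(m.group(1), "%H:%M:%S.%f")
            events.append((stamp, m.group(2)))
    gaps = [
        ((b[0] - a[0]).total_seconds(), a[1], b[1])
        for a, b in zip(events, events[1:])
    ]
    return sorted(gaps, key=lambda g: g[0], reverse=True)[:top]

sample = [
    "[11:56:50.543] sysHwInit: enter",
    "[11:56:50.549] sysHwInit2: exit",
    "[11:57:13.259] Linux version 2.6.18-7.8",
    "[11:57:28.863] BCMDRV: Initialization complete",
    "[11:57:31.740] TuneToCurrentChannel(RD1_Radio_ARMENIA)",
]
for gap, before, after in largest_gaps(sample):
    print(f"{gap:7.3f}s  between '{before}' and '{after}'")
```

A large gap only says that nothing was printed in that interval; the work behind it still has to be identified from what was running at the time.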

At this stage we’re looking at identifying the components which take the longest to initialise for a quick win, and a timestamped log is a good place to start. Let’s take a real example.

Effect of Middleware and Applications on Set Top Box Boot Time

Set Top Boxes’ cold boot time can be as much as 3 minutes and a substantial portion of this is down to middleware and user interface initialisation. Below is a boot log from a Set Top Box being upgraded from VxWorks to Linux prior to integration of the middleware. Note the time from power-up to decoding a DVB channel was 44 seconds.

[11:56:50.543] sysHwInit: enter // bootloader start (step 2)

[11:56:50.544] sysHwInit: exit

[11:56:50.546] sysHwInit2: enter // stage 2 loader (step 3)

[11:56:50.547] ******** SR =0xf000ff01

[11:56:50.549] sysHwInit2: exit

: // find the signed image, check signature and decompress it, takes approx. 23 seconds

:

[11:57:13.259] ABCD<5>Linux version 2.6.18-7.8 // kernel start (step 4)

:

[11:57:28.863] BCMDRV: Initialization complete… // Broadcom SoC drivers/API (step 5)

:

[11:57:31.740] TuneToCurrentChannel(RD1_Radio_ARMENIA) // platform initialised (step 6)

:

[11:57:34.138] Stream: type=3, pid=93 // tuned to first channel at t0 + 44s (step 7)

Compare this to a newer Set Top Box with a more powerful processor, but also with middleware and applications, which took 2 mins 50s to boot:

[09:15:40.927] [MAIN] Enter real main //bootloader start (step 2)

:

[09:16:13.583] [BLOADER] Start highlevel code…..

[09:16:20.724] Linux version 2.6.37-2.4 // kernel start (step 4)

:

[09:16:21.240] RPC: Registered tcp NFSv4.1 backchannel transport module.

: // unpacking rootfs at this point, LZMA decompression, takes approx. 40 seconds

:

[09:17:01.661] squashfs: version 4.0 (2009/01/31) Phillip Lougher

:

[09:17:06.646] /lib/modules/2.6.37-2.4/nandflash.ko // NAND Flash driver

:

[09:17:06.927] /lib/modules/2.6.37-2.4/kern_ext_D183.ko // off-chip peripherals

:

[09:17:07.271] /lib/modules/2.6.37-2.4/nexus.ko // Broadcom SoC drivers/API

:

[09:17:11.395] Starting networking // configuring network (step 5)

:

[09:17:28.567] Starting resman // resource manager

[09:17:29.161] Starting firewalld // firewall daemon

[09:17:29.302] Checking gstreamer registry is up to date // launch gstreamer

[09:17:29.380] Starting nexcadaemon // conditional access daemon for encrypted TV

:

[09:17:47.255] starting app // Starting Javascript User Application (step 6)

:

[09:18:29.192] … “1338508884855:Logic.TVPlayer.prototype.tuneToHomeMux: Successfully Tuned To Home Mux”

// can now decode TV signals at t0 + 169s (step 7)

So why does it take another 2 minutes for the box with middleware to boot?

– The secure bootloader took longer to reach the kernel: roughly 40s against 23s by the log timestamps, so about 17s more. This is mainly due to the time taken to check and decompress the runtime image (kernel + rootfs), which is larger.

– Once the kernel is running, the decompression time of the rootfs adds another 40s.

– There are a lot of additional software components running in user space that need initialising: network and internet-related daemons, middleware components and plugins (e.g. gstreamer), graphics (e.g. DirectFB, Qt, Webkit), Conditional Access / Digital Rights Management, Java Virtual Machines, and things that run on top of them: the User Guide, IPTV VOD portals, 3rd party applications. The platform components take about 40s to set up, and the user interface application another 40s.

– The middleware relies on services on a live stream to populate its tables and the user guide. This data is broadcast cyclically as part of a carousel, so not all the information required is available immediately and the box must wait for the carousel cycle period to acquire all its data. This plays a substantial part in making the last 40s of the boot take so long.

How can we mitigate the effects of middleware dependencies on boot time?



The key components which take the most time need addressing first. In our example, these are:

40s bootloader time (bootloader start to kernel start in the log above),

40s rootfs decompression time,

40s platform components initialisation time

40s user interface application initialisation time.

We might not be able to eliminate all of these, but potentially there’s over 2 minutes to save by having each component warm-booting with initialisation data:

– Copy any data obtained from the live stream to FLASH and rely on the copied data at boot time, refreshing from live when available.

– Suspend to FLASH and resume from FLASH at boot time. This means storing a copy of the runtime image or the minimum initialised data it needs to run from.

– Set up an installation menu at first boot after factory reset to cover lengthier operations such as scanning, and set up the data needed for each boot as part of installation.

We can then have an installation setup that takes typically 5 minutes, and subsequent warm boots which can take a few seconds.
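The first of these ideas, relying on a stored copy and refreshing it later, can be sketched as follows. This is only an illustration under assumptions: the JSON cache file, the function names and the acquire_live stand-in are hypothetical, and a real box would store whatever tables its middleware needs in its own format.

```python
import json
import os
import tempfile

def save_boot_data(cache_path, data):
    """Write atomically so a power cut cannot leave a half-written cache."""
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(data, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, cache_path)

def load_boot_data(cache_path, acquire_live):
    """Return (data, source): cached data if present, else data from the slow path."""
    try:
        with open(cache_path) as f:
            return json.load(f), "cache"
    except (OSError, ValueError):
        # No cache yet (first boot) or the cache is unreadable: take the slow path.
        data = acquire_live()
        save_boot_data(cache_path, data)
        return data, "live"

def acquire_live():
    # Hypothetical stand-in for waiting on the broadcast carousel.
    return {"channels": ["home-mux"], "guide_version": 1}

cache = os.path.join(tempfile.mkdtemp(), "boot_cache.json")
print(load_boot_data(cache, acquire_live))  # first boot: slow path
print(load_boot_data(cache, acquire_live))  # later boots: served from the cache
```

Once the box is up, a background task can call save_boot_data again whenever fresher data arrives from the live stream, so the next boot starts from a recent copy.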

What if you want to make further savings?

Some key features of super-fast boots [3]:

1. They don’t rely on bootloaders,

2. They don’t use compression,

3. They strip out a lot of functionality.

Fast booting has a price: reduced functionality and security, and for products in the field these are important. But it is worth going through and disabling any unnecessary services and kernel modules. If you can get away with it, use a fixed IP address instead of DHCP, as it takes time to call out to a DHCP server and get an address. And where possible, avoid running udev at every boot: for example, run udev once to populate the device nodes in a ramfs, store the result as a tar file, and deploy that statically on subsequent boots with udev removed.

What if you want to optimise still further?

LTTng (Linux Trace Toolkit: next generation) contains a wealth of tracing tools, and a number of alternative tools exist:

ftrace, the function tracer,

ptrace, the process-trace system call on which tools such as strace are built,

strace, the system call tracer,

ltrace, the library call tracer,

perf, the performance analysis tool.

But unless you have a particular part of the boot process that stands out as especially long or especially in need of optimisation, we’re starting to look at diminishing returns. Further savings in boot time need to be weighed against the development time, besides which, those nice folks maintaining the kernel are always looking at new ways to optimise performance and especially boot time, since these are issues that affect everyone using Linux.

And there is one final thing we can do to improve the user experience of boot time: use the art of deception to place a splash screen on a display – more on that later.

Secure Boot Code

If you’re serious about connected devices in the field, you won’t compromise on a secure boot.

PCs, mobile phones, tablets, set top boxes, all of these use secure boot mechanisms and it’s likely that IoT devices and automotive systems will follow suit. Maintain the chain of trust at all times.

It is possible to have devices warm boot (resume from FLASH) in a few seconds. Use this to your advantage. But don’t compromise on a secure boot – nobody wants their devices hacked.

Compression

Binaries are compressed to save storage space in FLASH (and download time) but at the expense of decompression time when loading the binary. Decompression times and ratios for binaries vary considerably with the method used (see below).

Method        Data size   Comp. ratio   Decompress time
Uncompressed  130MB       100%          0s
Gzip          51MB        43%           10s
Bzip2         47MB        39%           20s
LZMA          38MB        32%           47s

Above are some times and ratios for a signed Set Top Box image. Compression ratios also vary hugely depending on the data being compressed: text and binaries compress well because they have lots of repetitive data. A lot of work has been done in this area by other people [4], [5]. Already-compressed files (MP3, JPG, MPEG etc.) or files containing random data don’t compress well, if at all, and sometimes you get negative compression when trying to compress a file already compressed by a different method (i.e. the output file is larger than the input!).
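It is easy to repeat this kind of measurement on your own image. Below is a minimal sketch using the gzip, bz2 and lzma modules from the Python standard library; the repetitive payload is a hypothetical stand-in for a real image, and timings taken on a development PC will not match those of the target device, so treat the output as relative rather than absolute.

```python
import bz2
import gzip
import lzma
import time

def benchmark(payload):
    """Return {method: (ratio, decompress_seconds)} for the given bytes."""
    results = {}
    for name, module in (("gzip", gzip), ("bzip2", bz2), ("lzma", lzma)):
        packed = module.compress(payload)
        start = time.perf_counter()
        unpacked = module.decompress(packed)
        elapsed = time.perf_counter() - start
        assert unpacked == payload  # sanity check: the round trip is lossless
        # Ratio is compressed size over original size, as in the table above.
        results[name] = (len(packed) / len(payload), elapsed)
    return results

# Hypothetical stand-in payload; substitute the bytes of a real image,
# e.g. open("rootfs.img", "rb").read(), for meaningful figures.
payload = b"kernel and rootfs contents tend to repeat themselves\n" * 20000
for name, (ratio, seconds) in benchmark(payload).items():
    print(f"{name:6s} ratio {ratio:6.1%}  decompress {seconds * 1000:8.2f} ms")
```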

If you’re using a multi-core system, there are parallel versions of GZIP, BZIP2 and LZMA.

If you decompress using 2 cores, you won’t decompress in half the time; you might take 2/3 of the time.

If you want the fastest possible boot, don’t compress your runtime image, but instead be prepared to store/download several tens (or even hundreds) of extra megabytes.

Debug code

Printing debug output takes time: typically 1ms per character. Don’t do it on a production system! It’s also a potential security risk: it gives out information on what’s in your box, and if you have a terminal session enabled, it’s a back door into your box!

If you must use debug, which you certainly will during development, consider passing the ‘quiet’ option on the kernel command line to reduce the number of prints and suppress printk during boot, and disable the early printk config option.
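The per-character cost adds up quickly. As a rough illustration, assuming the debug output goes to a serial console and the UART is the bottleneck, the time is simply characters times bits per character divided by baud rate; 1ms per character then corresponds to about 9600 baud with 8N1 framing. The 20,000-character log size below is a made-up figure.

```python
def print_cost_seconds(num_chars, baud, bits_per_char=10):
    """Time for a UART to shift out num_chars characters.

    bits_per_char=10 assumes 8N1 framing: 1 start + 8 data + 1 stop bit.
    """
    return num_chars * bits_per_char / baud

log_chars = 20_000  # hypothetical size of a boot log, in characters
for baud in (9600, 115200):
    per_char_ms = print_cost_seconds(1, baud) * 1000
    total_s = print_cost_seconds(log_chars, baud)
    print(f"{baud:6d} baud: {per_char_ms:.3f} ms/char, "
          f"{total_s:.1f}s for {log_chars} chars")
```

Under these assumptions a faster console helps, but not printing at all helps more.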

Splash Screens

So you’ve reached the point where it’s not expedient to spend any more time reducing your time to boot. The last roll of the dice is to use a splash screen.

Once you’ve powered up, as soon as your display is enabled in boot code, show a splash screen.

And whenever something happens, e.g. the kernel or a user space component resets the display, show another splash screen (preferably a different one) as soon as possible. It takes surprisingly little time to blat a number of pixels on a display, and it will stall the user for at least a few seconds, generating the impression that something is happening, when in fact you’re not ready to receive user inputs yet.

Comparison of Linux and Non-Linux Device Boot Times

What is an acceptable boot time for Linux/Android devices? And how do we strike an acceptable balance between boot time (and its effect on the user / desirability of the product), and expending development time and effort to bring the boot time down?

The answer is pretty simple in most cases: try to make it comparable or slightly better than its non-Linux equivalent, unless the time and effort spent doing so is prohibitive.

– iPhone 6 Plus: 35s

– Android 5.1 Moto X smart phone: 42s

– PC booting Windows 7: 56s to login, 140s to UI, 180s to internet connection

– PC booting Mint 17.2 KDE from USB stick: 90s

– Sky+ HD Box running VxWorks, middleware and EPG from 2008: 90s for green light display / live TV

– Sky+ HD Box running Linux and latest middleware and EPG: 180s for green light display / live TV

The boot times above were for the same PC and same Set Top Box platform, and the iPhone and Android phones had very similar Antutu benchmark scores.

Limitations on resources, secure boot, decompression, and availability of network services all have an effect here, which is why so much emphasis is placed on standby suspend/resume for embedded Linux / Android products: cold boots take too long!

Conclusion

It’s worth spending at least a little time optimising your boot time, especially on a production embedded system. Saving boot time is a trade-off with development time.

Timestamp debug is the easiest way of debugging boot time.

Go for the quick wins early and identify the items that take the longest to boot and concentrate on them first.

Remove any unnecessary services and kernel modules, and defer loading of any less-than-critical ones.

Avoid using DHCP and udev.

First Time Boot: Installation – set things up and store in disk or FLASH for a faster boot next time.

Fast Boot: Restore from Disk or FLASH

Use splash screens to distract the user during boot time.

References:

[1] Booting Linux in One Second (Embedded Linux Conf Europe 2015): https://www.phoronix.com/scan.php?page=news_item&px=Booting-Linux-1-Second

[2] Definition of Boot Terms: http://elinux.org/Boot-up_Time_Definition_Of_Terms

[3] Super-fast 300ms boot: http://www.makelinux.com/emb/fastboot/omap

[4] Compression Measurements: http://binfalse.de/2011/04/04/comparison-of-compression/

[5] Compression Comparison: http://bashitout.com/2009/08/30/Linux-Compression-Comparison-GZIP-vs-BZIP2-vs-LZMA-vs-ZIP-vs-Compress.html