Ever wondered what a .deb file actually is? How is it put together, and what's inside it, besides the data that is installed to your system when you install the package? Today we're going to break out our sysadmin's toolbox and find out. (While we could just turn to deb(5), that would ruin the fun.) You'll need a Debian-based system to play along. Ubuntu and other derivatives should work just fine.

Finding a file to look at

Whenever APT downloads a package to install, it saves it in a package cache, located in /var/cache/apt/archives/. We can poke around in this directory to find a package to look at.

spang@sencha:~> cd /var/cache/apt/archives
spang@sencha:/var/cache/apt/archives> ls
apache2-utils_2.2.16-2_amd64.deb
app-install-data_2010.08.21_all.deb
apt_0.8.0_amd64.deb
apt_0.8.5_amd64.deb
aptitude_0.6.3-3.1_amd64.deb
...

nano, the text editor, ought to be a simple package. Let's take a look at that one.

spang@sencha:/var/cache/apt/archives> cp nano_2.2.5-1_amd64.deb ~/tmp/blog
spang@sencha:/var/cache/apt/archives> cd ~/tmp/blog

Digging in

Let's see what we can figure out about this file. The file command is a nifty tool that tries to figure out what kind of data a file contains.

spang@sencha:~/tmp/blog> file --raw --keep-going nano_2.2.5-1_amd64.deb
nano_2.2.5-1_amd64.deb: Debian binary package (format 2.0)
- current ar archive
- archive file

Hmm, so file, which identifies filetypes by performing tests on them (rather than by looking at the file extension or something else cosmetic), must have a special test that identifies Debian packages. Since we passed the command the --keep-going option, though, it continued on to find other tests that match against the file. This is useful because these later matches are less specific, and in our case they tell us what a "Debian binary package" actually is under the hood: an "ar" archive!
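To make the idea of "tests" a bit more concrete, here is a minimal Python sketch of a content-based check. It is not file's actual rule set, just an illustration: it assumes that an ar archive starts with the eight bytes `!<arch>\n` and that the name of the first member sits in the 16 bytes right after them. The function name `describe` is made up for this example.

```python
AR_MAGIC = b"!<arch>\n"            # global header that opens an ar archive
DEB_FIRST_MEMBER = b"debian-binary"


def describe(data: bytes) -> str:
    """Return a rough, file(1)-style guess for the given bytes."""
    if not data.startswith(AR_MAGIC):
        return "data"
    # The first member header follows the magic; its first 16 bytes hold
    # the member name, padded with spaces (some tools append a slash).
    first_name = data[len(AR_MAGIC):len(AR_MAGIC) + 16].rstrip(b" /")
    if first_name == DEB_FIRST_MEMBER:
        return "Debian binary package"
    return "ar archive"


# A hand-built prefix stands in for a real .deb here.
print(describe(AR_MAGIC + b"debian-binary".ljust(16)))
print(describe(b"\x00\x01\x02"))
```

The most specific match is tried first and the generic "ar archive" answer is the fallback, which mirrors the order of the matches we saw with --keep-going.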

Aside: a little bit of history

Back in the day, in 1995 and before, Debian packages used to use their own ad-hoc archive format. These days, you can find that old format documented in deb-old(5). The new format was added to be "saner and more extensible" than the original. You can still find binaries in the old format on archive.debian.org. You'll see that file tells us that these debs are different; it doesn't know how to identify them in a more specific way than "a bunch of bits":

spang@sencha:~/tmp/blog> file --raw --keep-going adduser-1.94-1.deb
adduser-1.94-1.deb: data

Now we can crack open the deb using the ar utility to see what's inside.

Inside the box

ar takes an operation code and modifier flags and the archive to act upon as its arguments. The x operation tells it to extract files, and the v modifier tells it to be verbose.

spang@sencha:~/tmp/blog> ar vx nano_2.2.5-1_amd64.deb
x - debian-binary
x - control.tar.gz
x - data.tar.gz
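If you're curious what ar is doing when it lists or extracts members, the simple variant of the format is easy to walk by hand. The following Python sketch assumes the common layout: an 8-byte magic string, then for each member a 60-byte header (16-byte name, 12-byte mtime, 6-byte uid, 6-byte gid, 8-byte mode, 10-byte size, 2-byte terminator) followed by the data, padded to an even length. It ignores the long-filename extensions that some ar implementations use, and `iter_ar_members` and `make_member` are names invented for this example.

```python
import io

AR_MAGIC = b"!<arch>\n"


def iter_ar_members(stream):
    """Yield (name, data) for each member of a simple ar archive."""
    if stream.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive")
    while True:
        header = stream.read(60)
        if not header:
            return  # clean end of archive
        if len(header) < 60 or header[58:60] != b"`\n":
            raise ValueError("bad member header")
        name = header[0:16].decode("ascii").rstrip(" /")
        size = int(header[48:58].decode("ascii"))
        data = stream.read(size)
        if size % 2:
            stream.read(1)  # skip the padding byte after odd-sized members
        yield name, data


def make_member(name, data):
    """Build one member (header plus padded data); only used for the demo."""
    header = (
        name.ljust(16) + "0".ljust(12) + "0".ljust(6) + "0".ljust(6)
        + "100644".ljust(8) + str(len(data)).ljust(10) + "`\n"
    ).encode("ascii")
    return header + data + (b"\n" if len(data) % 2 else b"")


# A tiny hand-made archive stands in for a real .deb.
archive = (
    AR_MAGIC
    + make_member("debian-binary", b"2.0\n")
    + make_member("control.tar.gz", b"abc")
)
for name, data in iter_ar_members(io.BytesIO(archive)):
    print(name, len(data))
```

To try it on a real package, open the .deb in binary mode and pass the file object to `iter_ar_members` instead of the in-memory sample.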

So, we have three files.

debian-binary

spang@sencha:~/tmp/blog> cat debian-binary
2.0

This is just the version number of the binary package format being used, so tools know what they're dealing with and can modify their behaviour accordingly. One of file's tests uses the string in this file to add the package format to its output, as we saw earlier.
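How might a tool "modify its behaviour accordingly"? Here is a small Python sketch under one assumption: that a change in the major number means an incompatible format, while a bump in the minor number is still safe to read. The helper names `parse_format_version` and `is_readable` are hypothetical and not part of any real tool.

```python
def parse_format_version(text: str) -> tuple:
    """Parse the first line of debian-binary into (major, minor)."""
    lines = text.splitlines()
    if not lines:
        raise ValueError("empty debian-binary file")
    major, _, minor = lines[0].strip().partition(".")
    return int(major), int(minor)


def is_readable(text: str, supported_major: int = 2) -> bool:
    """Assumption: only a different major number makes the format unreadable."""
    major, _minor = parse_format_version(text)
    return major == supported_major


print(parse_format_version("2.0\n"))
print(is_readable("2.1\n"), is_readable("3.0\n"))
```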

control.tar.gz

spang@sencha:~/tmp/blog> tar xzvf control.tar.gz
./
./postinst
./control
./conffiles
./prerm
./postrm
./preinst
./md5sums

These control files are used by the tools that work with the package and install it to the system, mostly dpkg.

control

spang@sencha:~/tmp/blog> cat control
Package: nano
Version: 2.2.5-1
Architecture: amd64
Maintainer: Jordi Mallach
Installed-Size: 1824
Depends: libc6 (>= 2.3.4), libncursesw5 (>= 5.7+20100313), dpkg (>= 1.15.4) | install-info
Suggests: spell
Conflicts: pico
Breaks: alpine-pico (<= 2.00+dfsg-5)
Replaces: pico
Provides: editor
Section: editors
Priority: important
Homepage: http://www.nano-editor.org/
Description: small, friendly text editor inspired by Pico
 GNU nano is an easy-to-use text editor originally designed as a
 replacement for Pico, the ncurses-based editor from the non-free mailer
 package Pine (itself now available under the Apache License as Alpine).
 .
 However, nano also implements many features missing in pico, including:
  - feature toggles;
  - interactive search and replace (with regular expression support);
  - go to line (and column) command;
  - auto-indentation and color syntax-highlighting;
  - filename tab-completion and support for multiple buffers;
  - full internationalization support.

This file contains a lot of important metadata about the package. In this case, we have:

its name

its version number

binary-specific information: which architecture it was built for, and roughly how much disk space it takes up once installed (Installed-Size is an estimate in kibibytes, not bytes)

its relationship to other packages (on the Depends, Suggests, Conflicts, Breaks, and Replaces lines)

the person who is responsible for this package in Debian (the "maintainer")

how the package is categorized in Debian as a whole: nano is in the "editors" section. A complete list of archive sections can be found here.

a "priority" rating. "Important" means that the package "should be found on any Unix-like system". You'd be hard-pressed to find a Debian system without nano.

a homepage

a description which should provide enough information for an interested user to figure out whether or not she wants to install the package

One line that takes a bit more explanation is the "Provides:" line. This means that nano, when installed, will not only count as having the nano package installed, but also as the editor package, which doesn't really exist: it is only provided by other packages. This way other packages which need a text editor can depend on "editor" and not have to worry about the fact that there are many different sufficient choices available.
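The control file format is simple enough to pick apart with a few lines of code. Here is a minimal Python sketch, assuming the layout we saw above: "Field: value" lines, with indented lines continuing the previous field. The functions `parse_control` and `parse_depends` are names made up for this example, the sample text is a trimmed-down version of nano's control file, and real tools handle more corner cases than this does.

```python
def parse_control(text: str) -> dict:
    """Parse 'Field: value' lines; indented lines continue the previous field."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t":
            if current is None:
                raise ValueError("continuation line before any field")
            fields[current] += "\n" + line[1:]
        else:
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()
    return fields


def parse_depends(value: str) -> list:
    """Split a relationship field: commas mean 'and', pipes mean 'or'."""
    return [[alt.strip() for alt in group.split("|")] for group in value.split(",")]


sample = """\
Package: nano
Version: 2.2.5-1
Depends: libc6 (>= 2.3.4), dpkg (>= 1.15.4) | install-info
Provides: editor
Description: small, friendly text editor inspired by Pico
 GNU nano is an easy-to-use text editor.
 .
 However, nano also implements many features missing in pico.
"""

fields = parse_control(sample)
print(fields["Package"], fields["Version"])
print(parse_depends(fields["Depends"]))
```

The nested list returned by `parse_depends` makes the "dpkg or install-info" alternative explicit: every outer entry must be satisfied, and any one inner entry will do.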

You can get most of this same information for installed packages and packages from your configured package repositories using the command aptitude show <packagename>, or dpkg --status <packagename> if the package is installed.

postinst, prerm, postrm, preinst

These files are maintainer scripts. If you take a look at one, you'll see that it's just a shell script that is run at some point during the [un]installation process.

spang@sencha:~/tmp/blog> cat preinst
#!/bin/sh

set -e

if [ "$1" = "upgrade" ]; then
    if dpkg --compare-versions "$2" lt 1.2.4-2; then
        if [ ! -e /usr/man ]; then
            ln -s /usr/share/man /usr/man
            update-alternatives --remove editor /usr/bin/nano || RET=$?
            rm /usr/man
            if [ -n "$RET" ]; then
                exit $RET
            fi
        else
            update-alternatives --remove editor /usr/bin/nano
        fi
    fi
fi

More on the nitty-gritty of maintainer scripts can be found here.

conffiles

spang@sencha:~/tmp/blog> cat conffiles
/etc/nanorc

Any configuration files for the package, generally found in /etc, are listed here, so that dpkg knows when to not blindly overwrite any local configuration changes you've made when upgrading the package.
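The decision about whether to overwrite a configuration file can be pictured as a three-way comparison between the file on disk, the version the old package shipped, and the version the new package ships. The Python sketch below is a simplified model of that idea, not dpkg's actual code; `conffile_action` and its return strings are invented for illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex MD5 digest of some file contents."""
    return hashlib.md5(data).hexdigest()


def conffile_action(on_disk: bytes, old_shipped: bytes, new_shipped: bytes) -> str:
    """Decide what to do with a conffile on upgrade (simplified model)."""
    user_changed = digest(on_disk) != digest(old_shipped)
    package_changed = digest(new_shipped) != digest(old_shipped)
    if not user_changed:
        return "install new version"   # nothing local to lose
    if not package_changed or digest(on_disk) == digest(new_shipped):
        return "keep local version"    # the package has nothing new to offer
    return "ask the user"              # both sides changed, and differently


print(conffile_action(b"set nowrap\n", b"set nowrap\n", b"set softwrap\n"))
print(conffile_action(b"my tweaks\n", b"set nowrap\n", b"set nowrap\n"))
print(conffile_action(b"my tweaks\n", b"set nowrap\n", b"set softwrap\n"))
```

Only the last case, where both you and the package changed the file in different ways, needs a human to decide.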

md5sums

This file contains checksums of each of the data files in the package so dpkg can make sure they weren't corrupted or tampered with.
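Checking files against such a list takes only a few lines. The Python sketch below assumes each line of md5sums has the form "<hex digest>  <path relative to the filesystem root>" and reports the paths that are missing or don't match. `verify_md5sums` is a name made up for this example, and the demo works in a temporary directory rather than touching your real system.

```python
import hashlib
import os
import tempfile


def verify_md5sums(md5sums_text: str, root: str) -> list:
    """Return the listed paths that are missing or whose MD5 digest differs."""
    bad = []
    for line in md5sums_text.splitlines():
        if not line.strip():
            continue
        expected, path = line.split(None, 1)
        try:
            with open(os.path.join(root, path), "rb") as fh:
                actual = hashlib.md5(fh.read()).hexdigest()
        except OSError:
            actual = None  # a missing or unreadable file counts as a mismatch
        if actual != expected:
            bad.append(path)
    return bad


with tempfile.TemporaryDirectory() as root:
    rel = "usr/share/doc/demo/README"
    os.makedirs(os.path.dirname(os.path.join(root, rel)))
    with open(os.path.join(root, rel), "wb") as fh:
        fh.write(b"hello\n")
    listing = hashlib.md5(b"hello\n").hexdigest() + "  " + rel + "\n"
    print(verify_md5sums(listing, root))  # file is intact
    with open(os.path.join(root, rel), "wb") as fh:
        fh.write(b"tampered\n")
    print(verify_md5sums(listing, root))  # file no longer matches
```

Pointing `root` at the directory where you unpacked data.tar.gz and passing in the contents of the extracted md5sums file would check the real package contents.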

data.tar.gz

Here are the actual data files that will be added to your system's / when the package is installed.

spang@sencha:~/tmp/blog> tar xzvf data.tar.gz
./
./bin/
./bin/nano
./usr/
./usr/bin/
./usr/share/
./usr/share/doc/
./usr/share/doc/nano/
./usr/share/doc/nano/examples/
./usr/share/doc/nano/examples/nanorc.sample.gz
./usr/share/doc/nano/THANKS
./usr/share/doc/nano/changelog.gz
./usr/share/doc/nano/BUGS.gz
./usr/share/doc/nano/TODO.gz
./usr/share/doc/nano/NEWS.gz
./usr/share/doc/nano/changelog.Debian.gz
[...]
./etc/
./etc/nanorc
./bin/rnano
./usr/bin/nano

Finishing up

That's it! That's all there is inside a Debian package. Of course, no one building a package for Debian-based systems would do the reverse of what we just did, using raw tools like ar, tar, and gzip. Debian packages use a make-based build system, and learning how to build them using all the tools that have been developed for this purpose is a topic for another time. If you're interested, the new maintainer's guide is a decent place to start.

And next time, if you need to take a look inside a .deb again, use the dpkg-deb utility:

spang@sencha:~/tmp/blog> dpkg-deb --extract nano_2.2.5-1_amd64.deb datafiles
spang@sencha:~/tmp/blog> dpkg-deb --control nano_2.2.5-1_amd64.deb controlfiles
spang@sencha:~/tmp/blog> dpkg-deb --info nano_2.2.5-1_amd64.deb
 new debian package, version 2.0.
 size 566450 bytes: control archive= 3569 bytes.
      12 bytes,     1 lines      conffiles
    1010 bytes,    26 lines      control
    5313 bytes,    80 lines      md5sums
     582 bytes,    19 lines   *  postinst             #!/bin/sh
     160 bytes,     5 lines   *  postrm               #!/bin/sh
     379 bytes,    20 lines   *  preinst              #!/bin/sh
     153 bytes,    10 lines   *  prerm                #!/bin/sh
 Package: nano
 Version: 2.2.5-1
 Architecture: amd64
 Maintainer: Jordi Mallach
 Installed-Size: 1824
 Depends: libc6 (>= 2.3.4), libncursesw5 (>= 5.7+20100313), dpkg (>= 1.15.4) | install-info
 Suggests: spell
 Conflicts: pico
 Breaks: alpine-pico (<= 2.00+dfsg-5)
 Replaces: pico
 Provides: editor
 Section: editors
 Priority: important
 Homepage: http://www.nano-editor.org/
 Description: small, friendly text editor inspired by Pico
  GNU nano is an easy-to-use text editor originally designed as a
  replacement for Pico, the ncurses-based editor from the non-free mailer
  package Pine (itself now available under the Apache License as Alpine).
  .
  However, nano also implements many features missing in pico, including:
   - feature toggles;
   - interactive search and replace (with regular expression support);
   - go to line (and column) command;
   - auto-indentation and color syntax-highlighting;
   - filename tab-completion and support for multiple buffers;
   - full internationalization support.

If the package format ever changes again, dpkg-deb will too, and you won't even need to notice.

~spang
