Package management (by John Frey)



A few weeks ago on DistroWatch there was some debate about package managers. I think you will agree that we can never know too much about package management. I have done some research and would like to share some of what I have learned.





What is a package?



The first thing we need to know is, what is a package? There are two ways to install software. The first way is to get source code and compile it on your system.



# ./configure
# make
# make install



The second way is to get and install a package. A package contains software that has already been compiled from source code and bundled into a single installable archive. It may include icons, libraries, configuration files, binaries, man pages, desktop shortcuts, header files, fonts, etc. In addition, it may contain metadata, such as version information, the names and contact information of the package maintainer and the software authors, licensing, changelogs, READMEs and the web location of the project and its source code. Each package format defines a file structure for storing this data, and the package itself is compressed. When the package is installed, the package manager uncompresses the data and copies the files from the package into the file system of the operating system, creating symbolic links where needed, adding start-up links to the menu and the desktop, and sometimes offering configuration options to the user.
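To make the idea of "files plus metadata in one compressed archive" concrete, here is a minimal Python sketch. It is not any real package format: the METADATA.json member and the function names are assumptions made up for illustration, and real formats such as .deb or .rpm store their metadata quite differently.

```python
import io
import json
import tarfile


def build_package(files, metadata):
    """Return the bytes of a gzip-compressed tar archive holding the given
    files plus a metadata record (METADATA.json is a made-up name)."""
    entries = dict(files)
    entries["METADATA.json"] = json.dumps(metadata).encode()
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in sorted(entries.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def inspect_package(blob):
    """Return (member names, metadata dict) without unpacking to disk."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        names = tar.getnames()
        meta = json.loads(tar.extractfile("METADATA.json").read())
    return names, meta


if __name__ == "__main__":
    blob = build_package(
        {"usr/bin/hello": b"#!/bin/sh\necho hello\n"},
        {"name": "hello", "version": "1.0", "maintainer": "someone@example.org"},
    )
    names, meta = inspect_package(blob)
    print(names)
    print(meta["name"], meta["version"])
```

A real installer would go one step further and copy the payload files into the file system, which is the part that needs root privileges.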



Packages are distribution- and version-specific because the location of dependencies may vary between distributions and between versions of a distribution. Sometimes it is possible to download and install software the Windows way by just clicking on the package, provided it is compatible with the operating system. For instance, I downloaded and installed the Flash plugin RPM from Adobe on my Mandriva Linux system. More on that later.



There are many package formats, with .tgz, .deb and .rpm being the most common. Others like .pup, .pisi, .tazpkg and .mo are less common. Source code is usually distributed as tar.gz or tar.bz2 files, but some distributions distribute their binary packages this way as well. Most of us probably use .deb or .rpm packages.







An example of a modern graphical package management tool: PiSi by Pardus Linux




Package managers



OK, so we know what a package is. What, then, is a package manager? In a nutshell, a package manager installs, removes and updates packages. That is the simple definition, but a modern package manager can do so much more. It can automatically connect to a repository, automatically download software, check for and resolve dependencies, list packages, list dependencies, search the package list, sort the list, and add and remove repositories. It can specify a repository for a specific package and block upgrades to specific packages, verify checksums and digital signatures to ensure the integrity of the packages, allow automatic updates, and remove dependencies when uninstalling.
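Of the functions listed above, checksum verification is the easiest to illustrate. The sketch below assumes the repository publishes a SHA-256 digest for each package; the function name is hypothetical, and which hash a given package manager actually uses depends on the system. Digital signatures are a separate step that is not shown here.

```python
import hashlib


def verify_checksum(data: bytes, expected_sha256: str) -> bool:
    """Return True if the downloaded bytes match the published digest."""
    actual = hashlib.sha256(data).hexdigest()
    return actual == expected_sha256.strip().lower()


if __name__ == "__main__":
    package_bytes = b"pretend this is a downloaded package"
    # In practice the digest comes from the repository index, not from
    # the download itself; it is computed here only to keep the demo small.
    published = hashlib.sha256(package_bytes).hexdigest()

    print(verify_checksum(package_bytes, published))         # True
    print(verify_checksum(package_bytes + b"x", published))  # False
```

A checksum only shows that the file arrived intact. Showing that it came from the right source is the job of the digital signature.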



Not all package managers do all of those things, nor do they all perform equally well on all of those functions. This has given rise to different package managers and, contrary to popular perception, they do not all do the same job except at a very superficial level. As you read about these package managers this will become clearer.





Repositories and the package management system



Repositories are collections of packages, typically on a remote server, but they can also reside on the local hard disk, a CD-ROM, DVD or other storage media. The important thing to know about repositories is that they store data about packages in a particular format depending on the package manager. As an example, Mandriva's urpmi cannot read Fedora's yum repository despite both containing .rpm packages, while Debian's APT cannot read Mandriva's or Fedora's repositories. The package manager, package format and repository together make up the package management system.
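To show what "data about packages in a particular format" can look like, here is a small sketch that reads an index written as blank-line-separated "Field: value" stanzas, the general layout Debian-style package indexes use. The sample entries, versions and file names are invented for illustration, and the parser ignores details such as multi-line fields.

```python
SAMPLE_INDEX = """\
Package: hello
Version: 1.0-1
Depends: libc6
Filename: pool/main/h/hello/hello_1.0-1_i386.deb

Package: libc6
Version: 2.0-1
Filename: pool/main/g/glibc/libc6_2.0-1_i386.deb
"""


def parse_index(text):
    """Parse 'Field: value' stanzas into a dict keyed by package name."""
    packages = {}
    for stanza in text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        packages[fields["Package"]] = fields
    return packages


if __name__ == "__main__":
    index = parse_index(SAMPLE_INDEX)
    for name, fields in index.items():
        print(name, fields["Version"], fields.get("Depends", "(none)"))
```

A tool written for a different index layout would find nothing it recognizes in this text, which is why two package managers can share a package format and still be unable to read each other's repositories.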





pkgtool



Slackware and its derivatives use this system. The package format is a tar.gz file given the .tgz extension; that is, a tape archive (tar) that has been compressed with gzip (gz). This is not a package manager in the sense we normally think of it today. It is really just a package format and some command-line tools to create, view, install, remove and upgrade packages. The packaging system allows install scripts to be embedded. Those scripts are essentially the only difference between a Slackware package and an ordinary tarball of pre-compiled files. There is no dependency checking, no automatic connection to a repository, no automatic system updating and no checksum verification. Using pkgtool, one can view the list of installed software for removal, install packages or run install scripts. Packages are downloaded manually from a repository. This is the system that all package managers set out to improve on in the days when Slackware Linux was a dominant Linux distribution. Slackware provides a package browser on the Internet, and RSS feeds are available too; both look to be recent additions. SWareT, slapt-get and slackpkg are add-on tools developed to aid package management in Slackware and/or its derivatives, and NetBSD's pkgsrc can be used for the same purpose. Most of these tools provide dependency resolution and may provide some more advanced functions.
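The "tarball plus install script" idea can be sketched in a few lines of Python. The sketch assumes the script lives at install/doinst.sh inside the archive, which is the conventional location in Slackware packages; the helper names and the demo package contents are made up, and nothing is written to disk or executed.

```python
import io
import tarfile

INSTALL_SCRIPT = "install/doinst.sh"  # conventional Slackware location


def make_demo_package(files):
    """Build a small .tgz-style archive in memory from {name: text}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, text in files.items():
            data = text.encode()
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def split_package(blob):
    """Return (payload file names, install script text or None)."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]
        script = None
        if INSTALL_SCRIPT in names:
            script = tar.extractfile(INSTALL_SCRIPT).read().decode()
    payload = [n for n in names if not n.startswith("install/")]
    return payload, script


if __name__ == "__main__":
    blob = make_demo_package({
        "usr/bin/hello": "#!/bin/sh\necho hello\n",
        "install/doinst.sh": "# post-install commands would go here\n",
    })
    payload, script = split_package(blob)
    print(payload)
    print(script is not None)
```

Everything outside install/ is simply unpacked into the file system. The script, if present, is what the install tool runs afterwards.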







Gslapt is a GTK+ front-end to slapt-get, an APT-like package management system for Slackware Linux.




Advanced Packaging Tool (APT)



APT is used primarily in Debian and its derivatives. APT is a library of routines in libapt that acts as a front-end for dpkg, which is a low-level package manager with utilities to install, uninstall and update .deb packages. APT provides dpkg with more advanced functions, not the least of which is dependency resolution. The APT of today has evolved a long way from its origins but has retained its relationship to dpkg. All Debian derivatives use APT by default. Development of new capabilities has kept pace with other package managers of more recent vintage. There is little question that it is one of the best, most feature-rich package managers available. APT has been ported to OpenSolaris and Mac OS X, and can be used with RPM-based distributions via apt4rpm or apt-rpm.
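Dependency resolution is easier to picture with a toy example. The sketch below is not APT's algorithm, which also has to weigh versions, conflicts and alternative providers. It only shows the core idea of walking a dependency map and producing an install order in which every package comes after the packages it depends on. The package names and the function name are hypothetical.

```python
def install_order(package, depends, installed=()):
    """Return the packages to install, dependencies first.

    depends maps each package name to a list of the names it requires.
    Packages already in installed are skipped.
    """
    order = []
    done = set(installed)
    in_progress = set()

    def visit(name):
        if name in done:
            return
        if name in in_progress:
            raise ValueError(f"dependency cycle involving {name}")
        if name not in depends:
            raise KeyError(f"{name} not found in repository")
        in_progress.add(name)
        for dep in depends[name]:
            visit(dep)
        in_progress.discard(name)
        done.add(name)
        order.append(name)

    visit(package)
    return order


if __name__ == "__main__":
    repo = {
        "app": ["libgui", "libc"],
        "libgui": ["libc"],
        "libc": [],
    }
    print(install_order("app", repo))                      # ['libc', 'libgui', 'app']
    print(install_order("app", repo, installed=["libc"]))  # ['libgui', 'app']
```

In this picture the high-level tool works out the list and the low-level tool (dpkg, in Debian's case) is then asked to install each package in turn.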







Synaptic - a popular graphical package management tool for (not only) Debian-based distributions.




RPM Package Manager (RPM)



RPM is both a package format and a package manager. It is easily as popular as APT. While RPM has had some of the higher-level functions built in from the start, like dependency checking (but not dependency resolution), it seems that adding all the features of a modern package management system to the RPM standard is not easily done. This has given rise to new package management tools like YUM, urpmi, YaST, up2date and apt-rpm that offer dependency resolution and more advanced features while leaving the lower-level routines to RPM. These utilities are sometimes called meta package managers because they manage RPM, which is already a package manager. RPM has been ported to IBM's AIX operating system and is the default package format for the Linux Standard Base (LSB).
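The difference between dependency checking and dependency resolution is worth a small illustration. The sketch below shows checking only: the installer notices what is missing and refuses to proceed, but makes no attempt to go and fetch anything. The function and package names are hypothetical, and this is a simplified picture rather than RPM's actual logic.

```python
def missing_requirements(requires, installed):
    """Return the required names that are not installed yet."""
    return [name for name in requires if name not in installed]


def install_checked(name, requires, installed):
    """Install name only if everything it requires is already present."""
    missing = missing_requirements(requires, installed)
    if missing:
        raise RuntimeError(f"cannot install {name}: missing {', '.join(missing)}")
    installed.add(name)


if __name__ == "__main__":
    installed = {"libc"}
    try:
        install_checked("app", ["libc", "libgui"], installed)
    except RuntimeError as error:
        print(error)  # cannot install app: missing libgui

    # A tool layered on top would fetch and install libgui at this point;
    # here it is added by hand.
    installed.add("libgui")
    install_checked("app", ["libc", "libgui"], installed)
    print("app" in installed)  # True
```

The tools built on top of RPM fill in that middle step automatically, which is the part users tend to notice most.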





urpmi



Mandriva is the only distribution using urpmi, just as openSUSE is the only distribution using its system. The package format is .rpm. The urpmi utility is one of the first, perhaps the first, package managers for RPM packages. It actually consists of a number of different utilities to perform various functions: urpme uninstalls software, urpmq queries the database for a matching file name, urpmi installs packages, and so on. One of the interesting functions of urpmi is that it will add the metadata for RPMs installed from a local directory. If you remember way back, I mentioned downloading Adobe's Flash plugin. All I had to do was click on the RPM package, which brought up a dialog box asking me if I wanted to install, save or cancel. I chose install and urpmi added the RPM to my list of installed software. This means that I can use urpmi to uninstall or upgrade the plugin, provided I remember to keep the original RPM file.







Mandriva's Rpmdrake is a graphical front-end for the distribution's package management utility called urpmi.




Yellowdog Updater, Modified (YUM)



Derived from the Yellowdog Updater (YUP), YUM is a package manager for Red Hat/Fedora-based systems using the RPM package format. It became the default package manager as of Red Hat Enterprise Linux 5 and is used by most Red Hat/Fedora-based systems. Modularity is a major feature of YUM: extra functions are added through plugins and with the yum-utils package. Critics say the tool is not integrated enough and that the performance and maturity of modules can vary. Nevertheless, its wide adoption is evidence that it is a good package management system. Red Hat has long offered a subscription service called Red Hat Network (RHN) to provide updates and patches. The subscription service is important to its business plan and, as such, the company has not spent as much time on development of a non-subscription package manager. Third parties developed YUM before it was picked up by Red Hat. RPM is the traditional Red Hat package manager; it was superseded by up2date, which has now been replaced by YUM.







Yum Extender (YumEx) is a powerful graphical package management tool for Fedora-based distributions.




ZYpp



SUSE Linux and openSUSE use a veritable dog's breakfast of utilities for package management. Input is given through either rug (a command-line front-end) or zen-updater (a GUI front-end) to the ZENworks Management Daemon (ZMD). ZMD listens for commands and passes them off to the libzypp ZMD helpers, which communicate with the software database, parse metadata, and pass data and commands to libzypp. Libzypp does dependency resolution, installation, removal, and upgrades, using the RPM package management utility. One can also use YaST or zypper (command line) to talk directly to libzypp. This is the opposite extreme from the pkgtool used by Slackware: three front-ends, two package management systems, two repositories. The zen-updater system also adds a daemon and the helper layer (that no other system has) before it reaches the package manager for dependency resolution and installation routines. I used this system when openSUSE first introduced it in version 10.0. It was very slow, as many will remember, but recent reports say the speed has improved markedly.







YaST2 is an openSUSE system administration utility that includes an advanced graphical package manager.




Source-based distributions and the BSDs



For these systems, a repository contains install scripts instead of pre-compiled binaries and compiling is done on the local machine. Portage in Gentoo uses scripts called ebuilds that link to source code and contain instructions for the compiler and install routines. This system does function like a package manager in many ways, including installing, removing, updating, tracking installed software, resolving dependencies, etc. This is generally the way BSD Ports work as well, with installation scripts instead of packages. The appeal of this type of management should be obvious. Install scripts are smaller than packages, so the repository is lighter. Original source means possibly cleaner code and less third-party "optimization", though install scripts may contain patches. Local compilation means no extra code to support hardware not on the system, and optimizations can be made for the available hardware. Other advantages exist, but you can read the Gentoo or BSD documentation for those.
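As a rough illustration of what such an install script boils down to, here is a sketch of a build recipe expressed as plain data, together with a function that turns it into the familiar configure, make and make install steps. This is not the ebuild or Ports format; the field names, the example URL and the configure flag are all made up for the example, and the commands are only printed, not run.

```python
def build_commands(recipe):
    """Turn a recipe into the command lines a builder would run, in order."""
    configure = ["./configure", "--prefix=" + recipe.get("prefix", "/usr/local")]
    configure += recipe.get("configure_flags", [])
    return [configure, ["make"], ["make", "install"]]


# A hypothetical recipe: note that it points at the source code
# rather than containing any compiled files itself.
RECIPE = {
    "name": "hello",
    "version": "1.0",
    "source": "https://example.org/hello-1.0.tar.gz",
    "depends": ["libc"],
    "configure_flags": ["--disable-nls"],
}


if __name__ == "__main__":
    print("fetch", RECIPE["source"])
    for command in build_commands(RECIPE):
        print(" ".join(command))
```

The recipe is only a few lines long, which is why a repository of such scripts is so much lighter than a repository of binary packages.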







DesktopBSD's package manager (dbsd-pkgmgr) is an innovative graphical tool that allows installing both binary and source packages.




Summary



Those are some of the package managers out there. I hope you can see the differences between them after reading this. The early package managers were collections of simple install, remove and update routines. While APT was eventually developed to be very feature-rich, third-party package managers for Slackware Linux are mostly simple programs that add dependency resolution and one or two other features. RPM was an evolution of the package manager closer to a complete package management system. It added features like dependency checking, tracking, automatic installation and checksum verification. RPM has many more functions than dpkg but fewer functions than APT. For whatever reasons, instead of adding functions to RPM, developers created "meta package managers" like YUM, urpmi, Smart and YaST. Finally, we have script-based package management that uses original source code and compiles it at install time. Script-based repositories are smaller than package repositories, and scripts possibly require fewer resources to maintain through minor variations of software updates. However, they require close monitoring of the software sources to ensure that hyperlinks in the scripts remain valid.



There are many package managers not described here. They range from extremely feature-rich, like Smart, to slim, like (Arch Linux's) Pacman. They are usually developed when no other package manager fits the bill for a distribution's base and philosophy. Smart and YUM, among others, are being developed to read several types of repositories. We keep seeing new package managers and meta package managers. Even Puppy has its own, undoubtedly optimized for size. Then there are the many, many graphical front-ends. They don't add new functions but improve usability. Personally, I prefer a graphical front-end for browsing packages in much the same way I prefer a graphical user interface for file browsing.



