digiKam nearing its 2.0 release

DigiKam has always stood out among the Linux photography tools because it incorporates the features of what are often two separate tools: the photo manager and the raw image editor. When the KIPI plug-in API is added into that list, the feature set can grow quite large. DigiKam announced the first release candidate for version 2.0.0 recently and, as one might expect, a host of new features dominates the new builds.

DigiKam is a KDE program but, apart from dependencies on core KDE and Qt libraries, it operates in stand-alone mode, so it is perfectly usable in GNOME or other desktop environments. The photo collection management features, which take center stage, illustrate the point: digiKam manages large image collections and a wealth of metadata about each entry, but it maintains this information in its own application-specific database (user-configurable for either MySQL or SQLite, with a built-in migration tool in case one decides to switch), rather than relying on Tracker, Zeitgeist, or another external indexer. This approach also allows digiKam to keep an eye on multiple discrete directory locations, rather than requiring you to move your library to a central location (or copying it there for you), as many of its competitors do.

The database-backed collections framework gives digiKam powerful search capabilities. It is aware of IPTC, EXIF, XMP, and Makernote metadata tags, plus user-defined tags, labels, and ratings, geolocation information, filesystem data (file size, modification time, etc.), and more. The search tools enable you to drill down into large collections with compound queries. DigiKam even allows you to create multiple "metadata templates" that are pre-filled with frequently used information. Since 21st Century Laziness is often the primary reason users do not make use of the metadata formats and ontologies available to them, this helps keep the collection organized.
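To make the idea of a compound query concrete, here is a minimal sketch in Python using an in-memory SQLite database. The tables, column names, sample rows, and the search() helper are all hypothetical, invented for illustration; digiKam's real database schema is different and is not described in this article.

```python
import sqlite3

# Hypothetical, simplified schema; digiKam's actual database layout differs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT,
                         rating INTEGER, file_size INTEGER);
    CREATE TABLE image_tags (image_id INTEGER, tag TEXT);
""")
conn.executemany("INSERT INTO images VALUES (?, ?, ?, ?)", [
    (1, "beach.jpg", 5, 4_200_000),
    (2, "harbor.nef", 3, 18_500_000),
    (3, "sunset.jpg", 4, 3_900_000),
])
conn.executemany("INSERT INTO image_tags VALUES (?, ?)", [
    (1, "vacation"), (2, "vacation"), (3, "vacation"), (3, "favorites"),
])


def search(conn, tag, min_rating, max_size):
    """Return names of images that satisfy all three criteria at once."""
    rows = conn.execute(
        """
        SELECT i.name
        FROM images AS i
        JOIN image_tags AS t ON t.image_id = i.id
        WHERE t.tag = ? AND i.rating >= ? AND i.file_size <= ?
        ORDER BY i.name
        """,
        (tag, min_rating, max_size),
    ).fetchall()
    return [name for (name,) in rows]


# Combine a user tag, a star rating, and a filesystem property in one query.
print(search(conn, "vacation", 4, 5_000_000))
```

The point of the sketch is only that user tags, ratings, and filesystem data can be combined in a single query once they all live in one database.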

Much of digiKam's functionality is implemented through KIPI plug-ins. The KIPI API is shared with other KDE-based image programs, such as Gwenview and KPhotoAlbum. The official digiKam packages use the plug-ins to implement many of the export and display functions, plus auxiliary functions such as DNG conversion. Often, new functionality is first implemented as a plug-in, such as support for editing a new type of metadata.

The 2.0.0 release candidate code can be downloaded from the project's SourceForge page. There, a source code bundle and a 32-bit Windows installer are available. Linux users wishing to test binary packages will need to find a distribution-specific build provided by a downstream maintainer — digiKam maintains a list of known packages, but does not currently release its own. FreeBSD and Mac OS X builds are also available from third parties.

Image organization

On the image management side of the application, there are a half-dozen or so new features in this release, several of which are the result of Google Summer of Code projects integrated into the main code base during a sprint this spring. The first is XMP sidecar support. XMP sidecars are separate metadata files that accompany images whose formats cannot store metadata internally. The sidecar files typically retain the base of the original filename, but use the .xmp extension.
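The naming convention described above can be sketched in a few lines of Python. The sidecar_path() helper and the example file name are hypothetical; this follows the "same base name, .xmp extension" rule as stated here, not necessarily the exact scheme digiKam applies.

```python
from pathlib import Path


def sidecar_path(image_path):
    """Return the sidecar path: same directory and base name, .xmp extension."""
    return Path(image_path).with_suffix(".xmp")


# Hypothetical raw file; prints the matching sidecar path.
print(sidecar_path("photos/IMG_0042.CR2"))
```

Some tools instead append .xmp to the complete file name (extension included), so a program reading sidecars written by another application may need to check for both forms.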

As mentioned earlier, digiKam supports its own local metadata, such as user-assigned tags and ratings. The 2.0.0 series adds a pair of new label types: "color labels" and "pick labels." The pick labels appear as red, yellow, and green flags, and their meaning is described in tooltips as "rejected," "pending," and "accepted," respectively. Color labels are visible as a colored highlight around the thumbnail in the image browser. The colors available include the six basic primary and secondary colors, plus black, white, and gray, and there is no pre-defined semantic meaning assigned to any of them.

Considering that digiKam already gives users a wealth of other ways to sort and mark up collections (tags, star ratings, albums), it might seem odd to add more. But I think it is helpful to have multiple, orthogonal ways to mark up a collection, simply to sift through it on multiple factors — particularly when the sorting process may involve transitory issues not suitable for the assignment of a persistent tag. Consider trying to find the "best" image to accompany a particular blog post. Star ratings might reflect overall picture quality, which would leave the color labels open to use in some other part of the decision (such as illustrating different parts of the story). Thus sorted, the picks might come in handy for another user (e.g., an editor) to select among the alternatives. Attempting to do the same thing with star ratings alone or with tags would get confusing.

In addition to the new sorting dimensions, 2.0.0-RC introduces keyboard shortcuts for assigning common tags, and it allows the user to select and "group" images in the thumbnail browser. Groups of images seem to operate much like a multi-item selection, in the sense that the user can apply changes to the entire group simultaneously, but they do not disappear with a stray mouse click. The tag-assigning keyboard shortcuts are entirely user-configurable, provided that one does not choose a key combination also captured by the window manager or another system component.

The so-called "reverse geocoding" feature is also new. This allows the user to look up human-readable place names to associate with latitude and longitude coordinates typically assigned automatically by GPS tagging software. The upshot is simply metadata that is easier to browse and easier to search.
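As a rough illustration of what reverse geocoding involves, the Python sketch below maps a coordinate pair to the nearest entry in a tiny, hypothetical table of places using great-circle (haversine) distance. The PLACES table and both function names are assumptions made for the example; how digiKam actually performs the lookup is not described here, and a real implementation would consult a far larger place database or an online service.

```python
import math

# Hypothetical gazetteer: place name -> (latitude, longitude) in degrees.
PLACES = {
    "Paris": (48.8566, 2.3522),
    "London": (51.5074, -0.1278),
    "Berlin": (52.5200, 13.4050),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def reverse_geocode(lat, lon, places=PLACES):
    """Return the name of the known place closest to (lat, lon)."""
    return min(places, key=lambda name: haversine_km(lat, lon, *places[name]))


# Coordinates as a GPS tagger might record them, turned into a place name.
print(reverse_geocode(48.8584, 2.2945))
```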

Technical and editing changes

Sorting is not the only area of improvement in this release, however. Several new technical features make their debut as well, starting with face recognition (yes, the facial recognition data can be searched on, but it constitutes a substantially new feature in its own right). Users can add "face tags" in two ways: either by drawing rectangles on faces in individual images, or by allowing digiKam to scan the entire image collection and automatically mark what it determines to be faces.

At the moment, the documentation of the feature is scant, but the workflow seems to involve marking as many faces as you can stand to manually, adding a name for each. The names are converted to "People tags" in the general tag database. Upon a blind scan-and-identify run, digiKam will compare the unknown faces to the already tagged-and-labeled specimens. Obviously, the higher the percentage of your suspects you tag, the more easily digiKam will recognize them in the future.

In the image editing arena, this release of digiKam uses an updated version of the LibRaw library (0.13.5), which adds a few noteworthy features of its own. LibRaw began as an attempt to massage Dave Coffin's dcraw utility into an API-stable shared library usable by other applications. This release, however, also imports several new advanced raw decoding options originally found in the RawTherapee application. Owners of Sigma DSLRs will also be happy to learn that the more recent version of LibRaw includes support for their cameras' Foveon sensors. The Foveon uses a three-layer light sensor that captures RGB data at the same grid location, as opposed to the matrix of single-color detectors found in most other cameras. As a result, entirely different decoding mechanisms are required. Canon is also reportedly working on a three-layer sensor, so LibRaw and digiKam support for the decoding algorithms is important news.

DigiKam has also added support for file versioning in the editor component. As with all raw photo editors, the editing process is non-destructive to the original image, but most applications do not easily allow the user to save multiple versions of the "edit list" file. DigiKam's editor component allows you to view the version history as a flat list (similar to the history pane in GIMP), or as a tree that preserves individual branches created when you roll back and make different edits. DigiKam's editing capabilities lie somewhere in between the color- and exposure-adjustment-only functions found in a typical raw converter and those of a full-blown raster editor. There are a few filter effects and simple touch-up tools (such as a red-eye corrector), but it also allows you to open any image in an external editor application from the right-click context menu.
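The difference between the flat and tree views of a version history can be sketched with a small Python data structure. The Version class, the helper functions, and the edit names below are hypothetical and say nothing about how digiKam stores its history; the sketch only shows how one branching history can be presented both ways.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    """One saved state of an image; children are edits made from it."""
    label: str
    children: list = field(default_factory=list)

    def edit(self, label):
        """Record a new version derived from this one and return it."""
        child = Version(label)
        self.children.append(child)
        return child


def flat_list(root):
    """Flatten the history depth-first, ignoring the branch structure."""
    out = [root.label]
    for child in root.children:
        out.extend(flat_list(child))
    return out


def tree_lines(root, depth=0):
    """Render the history as indented lines, one indent level per edit step."""
    lines = ["  " * depth + root.label]
    for child in root.children:
        lines.extend(tree_lines(child, depth + 1))
    return lines


original = Version("original")
cropped = original.edit("crop")
cropped.edit("sharpen")
# Roll back to the cropped version and try a different edit: a new branch.
cropped.edit("black and white")

print(flat_list(original))
print("\n".join(tree_lines(original)))
```

In the tree rendering, "sharpen" and "black and white" appear as siblings under "crop", which is the information the flat list discards.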

Finally, digiKam has always supported easy export of images to devices and other applications, and this release adds two: the Czech web service RajCe, and MediaWiki. The MediaWiki exporter is compatible with Wikimedia properties (including Wikipedia) that require authenticated user accounts to upload content.

Focus

By and large, the new additions to digiKam are welcome. Most, such as pick and color labels, keyboard shortcuts, or reverse geocoding, are designed to make searching and managing your images a simpler and more intuitive task. A few of the new features, however, I still find difficult to use.

The face recognition process, for example, is awkward. Drawing rectangles over people's faces is simple enough, but the pop-up window that appears once you do so is unhelpful: it pre-fills the top line with "Unknown," which you might expect to leave the newly-marked face in a blank state, but instead creates a People Tag named "Unknown." The other two buttons on the pop-up window are "Confirm" and "Remove" — but Confirm appears only to remove the unknown face from the set of faces to be scanned by the recognition software. Add to that the fact that by default all of the face tags are invisible to the eye, and you have a confusing user experience. Perhaps the documentation will improve on the situation.

Speaking of awkwardness, I have always disliked vertical side-tabs in GUIs, in any application. At best they are difficult-to-read labels, and at worst they make it unclear which portions of the UI belong to the "tab" and which do not. That problem goes double for interfaces that feature a set of vertical tabs on the left hand side and a separate set on the right.

For horizontally-written languages, vertical tabs give you sideways text labels (running in two different directions depending on whether they are stacked on the left- or right-hand edge). Applications that use them invariably also use normal horizontal menus and toolbars across the top of the window, introducing ambiguity as to which edge of the pane controls its contents. DigiKam inflicts this on you, and it provides no text labels for the un-selected tabs, forcing you to hover the cursor over them to discern the meaning of the cruelly-tiny icons. I suppose the only good thing about this UI design is that it is a clear sign that the application is filled to the brim with features. Still, I wouldn't shed any tears if it went away.

Apart from the interface woes, though, I found the digiKam 2.0.0 release candidate remarkably stable and fast. As always, it scores points for managing sizable collections of images and for providing a myriad of ways to arrange and edit content as the situation dictates. The final release of 2.0.0 is slated for "late July," so it should be a short wait for what looks to be a great update.