Photo Metadata and Search on MacOS

May 25, 2017

This article has been discussed on Hacker News.

Apple’s Photos.app classifies pictures, identifying subjects such as “boat” and “bicycle,” as well as settings like “cafe” and “mountains.” It uses this capability to offer vastly better search than before.

Unfortunately, these improvements are neither visible to Spotlight nor available in the Finder. This post documents a method of reconciling Photos.app metadata with filesystem metadata, so that it is indexed by Spotlight.

The Finder, Extended Attributes, and Spotlight

Quite a bit of outdated information about tagging and searching for documents in MacOS remains in circulation. The Finder seems to recognize a few extended attributes for associating tags and comments with files.

To keep things simple, I will present only the method that has worked for me.

The com.apple.metadata:_kMDItemUserTags xattr is used to associate a plist of tags with a file. Spotlight reads this xattr and indexes its contents. To assign the tags “mountain” and “alpine” to a file, for instance, you would create a plist:

<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
  <string>mountain</string>
  <string>alpine</string>
</array>
</plist>
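Hand-written plists are easy to get subtly wrong (stray whitespace inside a string element becomes part of the tag). As a minimal sketch, Python's standard plistlib module can generate an equivalent XML plist from a list of tags. The helper name tags_to_plist is hypothetical and not part of any tool mentioned in this post.

```python
import plistlib


def tags_to_plist(tags):
    """Serialize a list of tag strings as an XML property list (hypothetical helper)."""
    # Strip surrounding whitespace and drop empty entries so that
    # " mountain " and "mountain" do not end up as different tags.
    cleaned = [t.strip() for t in tags if t.strip()]
    return plistlib.dumps(cleaned, fmt=plistlib.FMT_XML).decode("utf-8")


plist = tags_to_plist(["mountain", "alpine"])
print(plist)
```

The output also carries an XML declaration line, which the hand-written version above omits.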

To associate this $plist with a $file, invoke the xattr command, as shown below.

[...]$ xattr -w "com.apple.metadata:_kMDItemUserTags" \
    "$plist" "$file"
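When tagging many files from a script, the same xattr invocation can be driven from Python. The sketch below passes the arguments as a list, so paths containing spaces need no shell quoting. The function names and the dry_run parameter are assumptions made for illustration; only the xattr -w command and the attribute name come from the text above.

```python
import subprocess
import sys

TAG_ATTR = "com.apple.metadata:_kMDItemUserTags"


def xattr_command(plist, path):
    """Build the argument list for `xattr -w <attribute> <plist> <path>`."""
    return ["xattr", "-w", TAG_ATTR, plist, path]


def write_tags(plist, path, dry_run=True):
    """Return the command; run it only on macOS and when dry_run is False."""
    cmd = xattr_command(plist, path)
    if dry_run or sys.platform != "darwin":
        return cmd
    # check=True raises CalledProcessError if xattr exits non-zero.
    subprocess.run(cmd, check=True)
    return cmd


print(write_tags("<plist/>", "/tmp/some photo.jpg"))
```

Defaulting to a dry run makes it possible to inspect the commands before any file is modified.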

At this point, you should be able to find the file via the 🔎 icon or the mdfind command, as shown below.

[...]$ mdfind tag:mountain
/path/to/matching/file
/path/to/another/matching/file

Photos.app Metadata: A SQLite Adventure

Photos.app does its accounting with a SQLite database, which we’ll call $photodb and set thus:

[...]$ cd ~
[...]$ cd Pictures
[...]$ cd Photos\ Library.photoslibrary
[...]$ cd database
[...]$ photodb="$PWD/photos.db"
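For scripts, the same location can be built without changing directories. This is a small sketch that assumes the default library name used in this post; default_photodb is a hypothetical helper, and a library stored elsewhere would need a different path.

```python
from pathlib import Path


def default_photodb(home=None):
    """Return the photos.db path under the default Photos library (assumed layout)."""
    home = Path(home) if home is not None else Path.home()
    return home / "Pictures" / "Photos Library.photoslibrary" / "database" / "photos.db"


photodb = default_photodb()
print(photodb, "(exists)" if photodb.exists() else "(not found)")
```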

Unfortunately, $photodb is probably open and locked.

[...]$ lsof "$photodb"
COMMAND     PID    USER   FD   TYPE DEVICE [...snip...]
photolibr 15432 patrick   4u    REG    1,1 [...snip...]

The trouble with just killing photolibraryd is that it will re-spawn, repeatedly. Undoubtedly, launchd can be told to disable photolibraryd, but the approved mechanism wasn’t immediately obvious to me.

Stop the photolibraryd service with launchctl:

[...]$ launchctl stop photolibraryd

Initially, I opted for an egregious hack. In order to lock its database, photolibraryd needs to be able to write to it, so I simply removed write permissions.

[...]$ chmod -w "$photodb"
[...]$ lsof "$photodb" | awk '{ print $2}' | egrep -v PID | xargs kill

Afterwards (but not yet), don’t forget to restore write permissions to your photos database! You may want to restart your computer to ensure that photolibraryd continues to work.
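The lsof | awk | egrep pipeline extracts the second column (the PID) and discards the header row. A rough Python equivalent of that extraction step is sketched here; pids_from_lsof is a hypothetical name, and the sample text is the lsof output shown earlier with the snipped columns left out.

```python
def pids_from_lsof(output):
    """Return the unique PIDs found in `lsof` output, skipping the header row."""
    pids = []
    for line in output.splitlines():
        fields = line.split()
        # The PID is the second column; in the header that column holds
        # the literal string "PID", which fails the isdigit() check.
        if len(fields) >= 2 and fields[1].isdigit():
            pid = int(fields[1])
            if pid not in pids:
                pids.append(pid)
    return pids


sample = """COMMAND     PID    USER   FD   TYPE DEVICE
photolibr 15432 patrick   4u    REG    1,1
"""
print(pids_from_lsof(sample))  # prints [15432]
```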

Now it’s possible to open $photodb and poke around.

[...]$ sqlite3 "$photodb"
SQLite version 3.14.0 2016-07-26 15:17:14
Enter ".help" for usage hints.
sqlite>

Review the schema with the .schema command. Quit by sending Ctrl-d, or with the .quit command. Tags (and a lot else) are kept in the RKVersion_stringNote table.

sqlite> .headers on
sqlite> SELECT * FROM RKVersion_stringNote;

We need to find the appropriate value of keyPath, since this will vary between systems. The following snippet should suffice to find it:

[...]$ KEYPATH=$(sqlite3 "$photodb" .schema |
    grep 'RKVersion_stringNote_skIndexUpdateTrigger' |
    grep -Eo '[0-9]{1,4}')
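The same lookup can be sketched with Python's sqlite3 module by reading the trigger definition from sqlite_master instead of grepping the .schema output. This is an approximation under stated assumptions: it takes the first run of one to four digits anywhere in the trigger's SQL, whereas the grep pipeline only looks at the line containing the trigger name, so the two may disagree if the definition holds other numbers. find_keypath is a hypothetical helper.

```python
import re
import sqlite3

TRIGGER = "RKVersion_stringNote_skIndexUpdateTrigger"


def find_keypath(conn, trigger=TRIGGER):
    """Return the first 1-4 digit number in the named trigger's SQL, or None."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'trigger' AND instr(name, ?) > 0",
        (trigger,),
    ).fetchone()
    if row is None or row[0] is None:
        return None
    numbers = re.findall(r"[0-9]{1,4}", row[0])
    return int(numbers[0]) if numbers else None
```

With a connection to a copy of the database, find_keypath(conn) would play the role of $KEYPATH. Returning None when the trigger is missing makes it clear that the schema differs from the one described here.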

Okay, but did it work?

[...]$ echo $KEYPATH
719

Now we can find the strings containing our tags, and associate them with filesystem paths. Be sure to substitute the value of $KEYPATH determined above for the literal 719 below.

sqlite> SELECT RKMaster.imagePath, RKVersion_stringNote.value
   ...> FROM RKVersion_stringNote
   ...> INNER JOIN RKMaster
   ...>   ON RKVersion_stringNote.attachedToId = RKMaster.modelId
   ...> WHERE RKVersion_stringNote.keyPath = 719 /* !!! */;
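To avoid editing the literal by hand, the query can be run from Python with the keyPath passed as a bound parameter. This is a sketch, not the code from photos2spotlight.py: the table and column names come from the query above, while tagged_paths is a hypothetical helper.

```python
import sqlite3

QUERY = """
SELECT RKMaster.imagePath, RKVersion_stringNote.value
FROM RKVersion_stringNote
INNER JOIN RKMaster
    ON RKVersion_stringNote.attachedToId = RKMaster.modelId
WHERE RKVersion_stringNote.keyPath = ?
"""


def tagged_paths(conn, keypath):
    """Return (imagePath, value) rows for the given system-specific keyPath."""
    # The "?" placeholder lets sqlite3 bind the value, so the number
    # never has to be pasted into the SQL text.
    return conn.execute(QUERY, (keypath,)).fetchall()
```

Given a connection to a copy of photos.db, tagged_paths(conn, 719) would return the same rows as the interactive query.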

Records will resemble the following:

2015/04/25/20150425-035012/DSC00435.JPG|DSC00435.JPG \
    00435.JPG JPG October 2012 Outdoor Outside Outdoors \
    Outsides Land Lands Mountain Mounts Peak Sierra \
    Sierras Peaks Mountains Mount

Tags aren’t quoted, but always come last. The trouble is that it’s not obvious which are 1-grams, 2-grams, or n-grams. Various collisions are possible, both among tags and between tags and other substrings.
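One way to illustrate the n-gram problem is a greedy longest-match split against a list of known tag phrases. This is only a sketch of one possible heuristic, not necessarily what photos2spotlight.py does; it assumes such a vocabulary is available, which the database as described here does not provide directly. The function name, the max_words limit, and the sample vocabulary are all hypothetical.

```python
def split_tags(words, vocabulary, max_words=3):
    """Greedily group words into the longest phrases found in vocabulary."""
    tags = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_words, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in vocabulary:
                tags.append(phrase)
                i += n
                break
        else:
            # No known phrase starts here: skip the word rather than guess.
            i += 1
    return tags


vocab = {"Interior Room", "Interior Rooms", "Inside", "Indoors"}
print(split_tags("Inside Interior Room Indoors".split(), vocab))
# prints ['Inside', 'Interior Room', 'Indoors']
```

Without a vocabulary, "Interior Room" is indistinguishable from the two separate tags "Interior" and "Room", which is exactly the collision described above.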

Reconciling Photos.app metadata with filesystem metadata

Okay, let’s put the pieces together. We need four things:

- a copy of photos.db in a writable location
- the system-specific value of $KEYPATH
- the path to our photos library, $PHOTOLIB
- photos2spotlight.py (listing here)
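For the first item in the list above, a copy can be made with the standard library. The sketch below copies the database file into a fresh temporary directory and makes the copy writable by its owner, which matters if write permission was removed from the original. copy_db is a hypothetical helper, and it copies only the single file named.

```python
import shutil
import tempfile
from pathlib import Path


def copy_db(src, dest_dir=None):
    """Copy a database file into dest_dir (or a new temp dir) and return the new path."""
    if dest_dir is None:
        dest_dir = tempfile.mkdtemp(prefix="photosdb-")
    dest = Path(dest_dir) / Path(src).name
    shutil.copy2(src, dest)
    # copy2 preserves permission bits, so a read-only source would yield
    # a read-only copy; make it readable and writable by the owner.
    dest.chmod(0o600)
    return dest
```

The returned path is what would be passed to the --db option.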

Start by getting an idea of what your library contains:

[...]$ ./photos2spotlight.py --stats \
    --db /path/to/copy/of/photos.db \
    --lib "$HOME/Pictures/Photos Library.photoslibrary/" \
    --keypath $KEYPATH

[...snip...]
16 Duds
16 Clothing
16 Accoutrements
16 Accoutrement
16 Clothings
16 Apparels
18 Insides
18 Interior Rooms
18 Inside
18 Interior Room
18 Indoors
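A listing like this amounts to a per-tag count of photos. As a sketch of how such counts could be computed once each photo's tags have been separated (not a description of how photos2spotlight.py actually does it), collections.Counter is enough. Both function names are hypothetical.

```python
from collections import Counter


def tag_counts(tag_lists):
    """Count how many photos carry each tag; tag_lists holds one list per photo."""
    counts = Counter()
    for tags in tag_lists:
        # set() ensures a tag repeated within one photo is counted once.
        counts.update(set(tags))
    return counts


def format_stats(counts):
    """Return 'count tag' lines, least common first, ties ordered by tag name."""
    ordered = sorted(counts.items(), key=lambda item: (item[1], item[0]))
    return [f"{n} {tag}" for tag, n in ordered]


photos = [["Inside", "Indoors"], ["Inside", "Inside"]]
print("\n".join(format_stats(tag_counts(photos))))
```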

By default, photos2spotlight.py will make a dry run. Use the --write flag to modify filesystem metadata.

[...]$ ./photos2spotlight.py --stats \
    --db /path/to/copy/of/photos.db \
    --lib "$HOME/Pictures/Photos Library.photoslibrary/" \
    --keypath $KEYPATH --write

You should now be able to find photos using Spotlight, the Finder, or the related mdfind command. For example:

[...]$ mdfind 'tag:Inside'
.../Masters/2015/04/22/20150422-011045/IMG_0083.JPG
[...snip...]

Conclusion

That’s it. Macs are easy, right?

Continue to Part 2.