Belhumeur and his colleague, Computer Science Professor David Jacobs of the University of Maryland, realized that many of the techniques they had developed for face recognition, in work spanning more than a decade, could also be applied to automatic species identification. State-of-the-art face recognition algorithms rely on methods that find correspondences between comparable parts of different faces, so that, for example, a nose is compared to a nose, and an eye to an eye. Birdsnap works the same way, detecting the parts of a bird so that it can examine the visual similarity of its comparable parts (each species is labeled through the location of 17 parts). It automatically discovers visually similar species and makes visual suggestions for how they can be distinguished.

“Our goal is to use computer vision and artificial intelligence to create a digital field guide that will help people learn to recognize birds,” says Computer Science Professor Peter Belhumeur, who launched Leafsnap, a similar electronic field guide for trees, with colleagues two years ago. “We’ve been able to take an incredible collection of data, thousands of photos of birds, and use technology to organize the data in a useful and fun way.”

“Categorization is one of the fundamental problems of computer vision,” says Thomas Berg, a Columbia Engineering computer science PhD candidate who works closely with Belhumeur. “Recently, there’s been a lot of progress in fine-grained visual categorization, the recognition of—and distinguishing between—categories that look very similar. What's really exciting about Birdsnap is that not only does it do well at identifying species, but it can also identify which parts of the bird the algorithm uses to identify each species. Birdsnap then automatically annotates images of the bird to show these distinctive parts—birders call them 'field marks'—so the user can learn what to look for.”

The team designed what they call “part-based one-vs-one features,” or POOFs, each of which classifies birds of just two species, based on a small part of the body of the bird. The system builds hundreds of POOFs for each pair of species, each based on a different part of the bird, and chooses the parts used by the most accurate POOFs as field marks. Birdsnap also uses POOFs for identification of uploaded images.
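The part-selection idea can be illustrated with a toy sketch. It is not the team's implementation: the nearest-centroid classifier, the two-number feature vectors, the part names, and the helper names (`part_accuracy`, `rank_parts`) are all assumptions made for illustration. For one pair of species, it trains one tiny classifier per part, scores each on held-out examples, and ranks the parts by accuracy, so the top-ranked part plays the role of a candidate field mark.

```python
from statistics import fmean


def centroid(vectors):
    """Mean feature vector of a list of equal-length vectors."""
    return [fmean(column) for column in zip(*vectors)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def part_accuracy(train_a, train_b, test_a, test_b):
    """Held-out accuracy of a nearest-centroid classifier for one part.

    This classifier is a stand-in (an assumption), not the POOF classifier.
    """
    ca, cb = centroid(train_a), centroid(train_b)
    correct = sum(sq_dist(x, ca) < sq_dist(x, cb) for x in test_a)
    correct += sum(sq_dist(x, cb) < sq_dist(x, ca) for x in test_b)
    return correct / (len(test_a) + len(test_b))


def rank_parts(species_a, species_b, n_train):
    """Rank parts by how well they separate two species, best first.

    species_a, species_b: dict mapping part name -> list of feature vectors.
    The first n_train vectors per part are used for training, the rest for testing.
    """
    scores = []
    for part in species_a:
        a, b = species_a[part], species_b[part]
        acc = part_accuracy(a[:n_train], b[:n_train], a[n_train:], b[n_train:])
        scores.append((part, acc))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy features: the beak differs between the species, the wing does not.
species_a = {
    "beak": [[0.9, 0.1], [1.0, 0.2], [1.1, 0.1], [0.95, 0.15]],
    "wing": [[0.5, 0.5], [0.6, 0.4], [0.4, 0.6], [0.7, 0.3]],
}
species_b = {
    "beak": [[0.1, 0.9], [0.2, 1.0], [0.1, 1.1], [0.15, 0.95]],
    "wing": [[0.5, 0.5], [0.4, 0.6], [0.6, 0.4], [0.3, 0.7]],
}

ranking = rank_parts(species_a, species_b, n_train=2)
for part, accuracy in ranking:
    print(f"{part}: {accuracy:.2f}")
```

In this toy data the beak separates the two species perfectly while the wing does no better than chance, so the beak would be the part highlighted for the user.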

The team also took advantage of the fact that modern cameras, especially those on phones, embed the date and location in their images, and they used that information to improve classification accuracy. Not only did they come up with a fully automatic method to teach users how to identify visually similar species, but they also designed a system that can pinpoint which birds are arriving, departing, or migrating. “You can ID birds in the U.S. wherever you are at any time of year,” Berg notes.
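One plausible way to fold date and location into a classifier is sketched below. The article does not say how Birdsnap combines the two sources, so the multiplication rule, the function name `rerank_with_prior`, the species labels, and every number here are assumptions. The sketch multiplies each species' image-based score by an estimate of how likely that species is at the photo's place and time of year, then renormalizes.

```python
def rerank_with_prior(image_scores, prior, floor=1e-6):
    """Combine image-based scores with a location-and-date prior.

    image_scores: dict species -> score from the visual classifier.
    prior: dict species -> estimated likelihood of the species at the
           photo's location and date (hypothetical values here).
    floor: small minimum prior so an unlisted species is not ruled out entirely.
    Returns a dict of normalized combined scores that sum to 1.
    """
    combined = {
        species: score * max(prior.get(species, 0.0), floor)
        for species, score in image_scores.items()
    }
    total = sum(combined.values())
    return {species: value / total for species, value in combined.items()}


# Hypothetical example: the image alone slightly favors species_a,
# but species_a is rare at this place and time of year.
image_scores = {"species_a": 0.55, "species_b": 0.45}
seasonal_prior = {"species_a": 0.05, "species_b": 0.60}

posterior = rerank_with_prior(image_scores, seasonal_prior)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

Here the visual evidence is nearly a tie, and the prior tips the decision toward the species that is actually expected in that region and season.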

The Leafsnap app, which required considerable time and resources to collect and photograph thousands of leaves, took almost 10 years to develop and now has more than a million users. Belhumeur got Birdsnap going in about six months, thanks to the proliferation of online data sources and advances in computer vision and mobile computing. Photos were downloaded from the Internet, with species labels confirmed by workers on Amazon Mechanical Turk, who also labeled the parts. Descriptions were sourced from Wikipedia. The maps were based on data from eBird, a joint venture of Cornell University’s Lab of Ornithology and the National Audubon Society, and BirdLife, an international network of conservation groups.

Belhumeur next hopes to work with Columbia Engineering colleagues on adding the ability to recognize bird songs, bringing audio and visual recognition together. He also wants to create “smart” binoculars that use this artificial intelligence technology to identify and tag species within the field of view.

“Biological domains—whether trees, dogs, or birds—where taxonomy dictates a clear set of subcategories, are wonderfully well-suited to the problem of fine-grained visual categorization,” Belhumeur observes. “With all the advances in computer vision and information collection, it’s an exciting time to be immersed in visual recognition and big data.”

This research was funded by the National Science Foundation, the Gordon and Betty Moore Foundation, and the Office of Naval Research.