Google today announced the release of a new and improved landmark recognition dataset. Google-Landmarks-v2 includes over 5 million images, doubling the number in the landmark recognition dataset the tech giant released last year. The dataset now covers more than 200,000 different landmarks, a sevenfold increase over the first version.

The new dataset supports two active areas of research: instance-level recognition and image retrieval. Instance-level recognition identifies specific instances of objects, distinguishing, for example, Toronto's Union Station from other urban train stations. Image retrieval, meanwhile, matches a particular object in an input image to all other instances of that object in a catalogue of reference images.
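Image retrieval is commonly implemented by embedding every image as a feature vector and ranking reference images by similarity to the query. The sketch below illustrates that idea with cosine similarity over hypothetical precomputed descriptors; it is a minimal illustration, not the pipeline Google used.

```python
import numpy as np

def retrieve(query_vec, index_vecs, top_k=3):
    """Rank reference images by cosine similarity to a query descriptor.

    query_vec:  1-D descriptor for the query image (hypothetical embedding).
    index_vecs: 2-D array, one descriptor per reference image.
    Returns the indices of the top_k most similar reference images.
    """
    q = query_vec / np.linalg.norm(query_vec)
    idx = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    sims = idx @ q                      # cosine similarity to each reference
    return np.argsort(-sims)[:top_k]    # indices sorted by descending similarity
```

In a real system the descriptors would come from a trained model and the ranking would use an approximate nearest-neighbor index rather than a brute-force scan.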

Geographic distribution of landmarks in last year’s dataset

Landmark locations heatmap in Google-Landmarks-v2 showing increased dataset scale and improved geographic coverage.

Landmark recognition has characteristics that set it apart from general object recognition. For example, even in a large annotated dataset, there may be little training data for less-famous landmarks. And because a landmark's location is fixed, its appearance does not change dramatically across images, apart from differences in viewpoint, occlusion, weather, and illumination. This small intra-class variation is shared by other instance-level recognition problems, such as artwork recognition.

Annotating instance labels at this scale was a challenge for Google researchers, who leveraged the worldwide community of amateur and vacation photographers drawn to picturesque landmarks. Google crowdsourced landmark labeling, so photographers familiar with particular landmarks could label them accordingly. Researchers also sourced images from Wikimedia Commons and public institutions, which can be shared freely and stored indefinitely.

Google is also open-sourcing code for Detect-to-Retrieve, a method that leverages bounding boxes from an object detection model to give extra weight to image regions containing a class of interest, which Google says significantly improves accuracy. The model was trained on a subset of 86k images from the previous version of the Google Landmarks dataset.
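The core idea of region weighting can be sketched as follows: local descriptors whose keypoints fall inside a detected bounding box are up-weighted before aggregation. This is a rough illustration of that weighting scheme, with hypothetical inputs; the actual Detect-to-Retrieve method differs in its details.

```python
import numpy as np

def weighted_aggregate(descriptors, keypoints, boxes, box_weight=2.0):
    """Aggregate local descriptors into one image descriptor,
    up-weighting descriptors inside detected landmark boxes.

    descriptors: (N, D) array of local descriptors (hypothetical).
    keypoints:   list of (x, y) locations, one per descriptor.
    boxes:       list of (x0, y0, x1, y1) detector boxes.
    """
    weights = np.ones(len(descriptors))
    for i, (x, y) in enumerate(keypoints):
        # Descriptors inside any box of the class of interest get extra weight.
        if any(x0 <= x <= x1 and y0 <= y <= y1 for (x0, y0, x1, y1) in boxes):
            weights[i] = box_weight
    agg = (weights[:, None] * descriptors).sum(axis=0)
    return agg / np.linalg.norm(agg)  # L2-normalize the aggregated descriptor
```

The design choice is simply that background clutter (descriptors outside the boxes) contributes less to the final image representation, which is how bounding boxes can improve retrieval accuracy.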

Google will present a related paper at CVPR 2019 this June in Long Beach, California. Researchers and machine learning enthusiasts can get involved by participating in the Kaggle challenges: Landmark Recognition 2019 and Landmark Retrieval 2019.