Whether you’re building an object detection algorithm or a semantic segmentation model, it’s vital to have a good dataset. Thanks to continued progress in the field of computer vision, there are several open source aerial image datasets on the Internet. However, it’s not always easy to find the one that could kickstart your project.

At Lionbridge, we know how frustrating it is when you can’t find the training data you need. That’s why we’ve compiled this collection of datasets. From urban satellite image datasets to FPV drone videos, the data below will help you get your aerial image research off to a good start.

If you like what you see, be sure to check out our other dataset collections for machine learning. They include everything from image datasets to named entity recognition datasets.

Satellite Imagery Datasets

DOTA: A Large-scale Dataset for Object Detection in Aerial Images: The 2,800+ images in this collection are annotated using 15 object categories. This dataset is frequently cited in research papers and is updated to reflect changing real-world conditions.

Cars Overhead With Context (COWC): Containing data from 6 different locations, COWC has 32,000+ examples of cars annotated from overhead.

Microsoft Canadian Building Footprints: These satellite images contain over 12 million building footprints covering all Canadian provinces and territories.

NWPU VHR-10 Dataset: This is a dataset of 800 satellite images containing 10 classes of objects for geospatial object detection.

SpaceNet Rio De Janeiro Points of Interest Dataset: SpaceNet’s dataset contains over 120,000 individual points that represent 460 of Rio de Janeiro’s features. It’s intended for use in automating feature extraction.

DSTL Satellite Imagery Feature Detection: Originally designed to automate feature classification in overhead imagery, DSTL’s dataset consists of 1 km x 1 km satellite images. The images are labeled with 10 different classes, from roads to small vehicles.

Aerial Orthoimagery Datasets

Inria Aerial Image Labeling Dataset: The Inria dataset has a coverage of 810 square kilometers. It was designed for pixelwise labeling use cases and includes a diverse range of terrain, from densely populated cities to small towns.

Stanford Drone Dataset: This dataset from Stanford contains eight videos of various labeled agents moving through a variety of environments. These agents include cyclists, pedestrians, and cars amongst others.

Aerial Imagery Object Identification Dataset: This dataset contains 25 high-resolution orthoimages covering urban locations in the United States. It contains over 40,000 annotations of building footprints as well as a variety of landscape topology data.

Vertical Aerial Photography: The UK government has been collecting ortho-rectified aerial imagery since 2006. This dataset is regularly updated and sorted by year of survey.
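
For pixelwise labeling datasets such as the Inria collection above, model output is commonly scored by comparing a predicted mask with the labeled mask, pixel by pixel. The snippet below is a minimal sketch of one common metric, intersection over union (IoU), and is not taken from any of the datasets listed here. It assumes binary masks stored as plain lists of rows, with 1 for a building pixel and 0 for background; the `mask_iou` function and the tiny example tiles are hypothetical and only for illustration.

```python
def mask_iou(predicted, reference):
    """Intersection over union of two equally sized binary masks.

    Each mask is a list of rows; 1 marks a building pixel, 0 marks background.
    (Assumed layout for illustration; real datasets ship image files.)
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    union = 0
    for pred_row, ref_row in zip(predicted, reference):
        if len(pred_row) != len(ref_row):
            raise ValueError("mask rows must have the same length")
        for p, r in zip(pred_row, ref_row):
            # Count pixels labeled in both masks, and pixels labeled in either.
            intersection += 1 if (p and r) else 0
            union += 1 if (p or r) else 0
    # Two empty masks agree perfectly, so return 1.0 instead of dividing by zero.
    return intersection / union if union else 1.0


# Hypothetical 3x4 tiles: labeled ground truth and a model's prediction.
reference = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
predicted = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

print(mask_iou(predicted, reference))
```

In practice the masks would be loaded from the dataset's label images and the score averaged over many tiles, but the per-pixel comparison stays the same.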

First Person View (FPV) Datasets

DJI Mavic Pro Footage in Switzerland: Consisting of several drone videos, this dataset is intended for use in developing object detection and motion tracking algorithms.

Open Cities AI Challenge: This high-resolution drone imagery dataset includes over 790,000 segmentations of building footprints from 10 cities across Africa.

The Zurich Urban Micro Aerial Vehicle Dataset: This dataset includes video covering around 2 km of urban streets, captured at low altitude. It’s designed for a range of topographical mapping use cases.

MMSPG Mini-drone Video Dataset: Built to improve drone-based surveillance, this research dataset contains 38 HD videos. It depicts a range of different types of behavior and contains manual annotations of several different regions of interest.

Okutama-Action: The 43 aerial sequences in the Okutama-Action dataset contain a wide range of challenges for those looking to develop human action detection algorithms.

Still can’t find what you need? At Lionbridge AI, we share your obsession with building the perfect machine learning dataset. Our array of data creation, annotation, and cleaning services is built to suit your specialist requirements. Whether you need hundreds or millions of data points, our team of experts can ensure that your model has a solid ground truth.

Contact us now to discover how we can improve your data.