The University of California, Berkeley has released a vast dataset used by engineers to develop self-driving car technologies.

The dataset, which is available for download, is part of the university's DeepDrive project.

The dataset contains over 100,000 video sequences recorded across a range of driving scenarios, including different weather conditions, environments, and times of day.

The video sequences, recorded in HD, also come with GPS locations, IMU data, and timestamps, and span 1,100 hours of footage.

UC Berkeley's BDD100K database can be used by engineers and developers of self-driving car technologies to train autonomous systems.

Datasets of this kind are needed to teach systems how to cope with different environments and driving conditions, including how to tell a road surface apart from pedestrian areas and how to detect objects such as other vehicles and potential hazards.

Classifying objects by hand can take countless hours, so to speed up object mapping, the database already includes 2D bounding box annotations on over 100,000 images. The annotated objects of note include traffic signs, people, bicycles, other vehicles, trains, and traffic lights.

In addition, 100,000 images carry annotations intended to help vehicles make complicated driving decisions, such as at busy intersections, on cluttered road systems, or where multiple lane markings are present.

Each video clip is approximately 40 seconds long and recorded at 30 frames per second, and the objects in them are annotated using a variety of methods, according to a paper describing the dataset project.
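As a rough sanity check on those figures, the short Python sketch below multiplies them out. It assumes exactly 100,000 clips of exactly 40 seconds each at 30 frames per second; the article gives these numbers only as approximations, and the constant and function names are illustrative rather than part of the dataset or its tooling.

```python
# Back-of-the-envelope check of the dataset figures quoted in the article.
# Assumption: exactly 100,000 clips, each exactly 40 s long at 30 fps.
# The article states these values as approximate, so treat results as estimates.

CLIPS = 100_000       # "over 100,000 video sequences"
CLIP_SECONDS = 40     # "approximately 40 seconds long"
FPS = 30              # "30 frames per second"


def frames_per_clip(seconds: int = CLIP_SECONDS, fps: int = FPS) -> int:
    """Number of frames in a single clip."""
    return seconds * fps


def total_hours(clips: int = CLIPS, seconds: int = CLIP_SECONDS) -> float:
    """Total footage length in hours across all clips."""
    return clips * seconds / 3600


def total_frames(clips: int = CLIPS) -> int:
    """Total number of frames across all clips."""
    return clips * frames_per_clip()


print(f"Frames per clip: {frames_per_clip():,}")
print(f"Total footage:   {total_hours():,.0f} hours")
print(f"Total frames:    {total_frames():,}")
```

Under these assumptions the total comes to roughly 1,111 hours, which is consistent with the 1,100 hours cited for the dataset, and each clip holds about 1,200 frames.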

"To achieve rich annotation at scale, we found that existing tooling was insufficient, and therefore develop novel schemes to annotate driving data more efficiently and flexibly than previous methods," the researchers behind the project said. "Current tools are difficult to deploy at scale and are rarely extensible to new tasks or data-structures."

This is not the only autonomous vehicle dataset to be released to the public. In March, Baidu released ApolloScape, a dataset based on Baidu's autonomous driving platform Apollo.

The open-source database is not as large as UC Berkeley's, but the more data that self-driving technology developers can get their hands on, the smarter our vehicles will become.
