At Cruise, we are developing self-driving cars through extensive use of powerful simulation frameworks. Though critical in helping us achieve our goals, they pose numerous technical challenges and require careful optimization and tooling for proper analysis. The Simulation Platform team at Cruise is focused on building infrastructure to run simulations efficiently and creating tools to help engineers quickly understand the effects of their code changes on the car’s performance. We’re going to take a closer look at one of the more interesting infrastructural challenges we’ve faced: storing and analyzing massive amounts of simulation data.

Simulation Testing is Hard

We face some unusual challenges in storing simulation data. We produce gigabytes of simulation data per second that need to be available within minutes to allow engineers to perform quick analyses after running tests. We’ve also been dramatically increasing the number of simulations we run, so we need an analytics database that can effortlessly scale.

To understand why there is so much data and why it’s so complex, we should look at what makes simulation testing harder than traditional software regression testing.

First, simulations do not necessarily output binary results. For example, in a given version of the code, an x% decrease in one metric may be acceptable if accompanied by a y% increase in another. Because of the number of metrics required, each simulation generates upwards of a gigabyte of result data.

Simulation result data is necessarily complex because of the complexity of debugging a self-driving car. There are many interdependencies in the Autonomous Vehicle (AV) stack — an improvement in one part of the stack may have unpredictable effects on downstream systems. For instance, a lidar engineer working on a segmentation model may see improvement in isolation, but there may be unexpected impact on the tracking or prediction systems.

We also want to look at simulation metrics in relation to previous metrics over time. For example, it is difficult to assess whether it’s acceptable for the system to generate some absolute number of spurious vehicle detections on a particular scenario; instead, we would prefer to know whether that number increased or decreased relative to some previous commit.
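To make this concrete, a commit-relative comparison can be phrased as a single query against the results warehouse. The sketch below, in Python, assumes a BigQuery table named simulation.metrics with commit_sha, scenario, and spurious_detections columns; these names are illustrative rather than our actual schema.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Illustrative schema: one row per simulation, tagged with the commit
# it ran against, the scenario name, and the metric of interest.
QUERY = """
SELECT
  scenario,
  SUM(IF(commit_sha = @feature, spurious_detections, 0))
    - SUM(IF(commit_sha = @base, spurious_detections, 0)) AS delta
FROM `simulation.metrics`
WHERE commit_sha IN (@feature, @base)
GROUP BY scenario
"""

def detection_deltas(feature: str, base: str) -> dict:
    """Per-scenario change in spurious detections, feature vs. base commit."""
    job = client.query(
        QUERY,
        job_config=bigquery.QueryJobConfig(
            query_parameters=[
                bigquery.ScalarQueryParameter("feature", "STRING", feature),
                bigquery.ScalarQueryParameter("base", "STRING", base),
            ]
        ),
    )
    # A negative delta means the feature commit produced fewer spurious
    # detections than the base commit on that scenario.
    return {row.scenario: row.delta for row in job.result()}
```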

Finally, engineers may want to experiment with new types of simulation metrics over time to assist with analysis, which places additional constraints on the architecture of the underlying data layer. As an example, an engineer may decide to begin tracking every time two objects are incorrectly classified as one merged object, such as when two parked cars are close to each other, and the tracking system classifies the two objects as a single car. Ideally, this sort of experimentation can happen seamlessly without engineers having to worry about schema migrations for the simulation metrics data warehouse.
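Here is a minimal sketch of how that can work, assuming BigQuery as the metrics warehouse; the table name is hypothetical, and treating every new metric as a nullable float is a simplification (a real service would read the types from the Avro schema).

```python
from google.cloud import bigquery

client = bigquery.Client()
TABLE_ID = "simulation.metrics"  # hypothetical table name

def widen_schema(records: list) -> None:
    """Add any metric fields present in the records but missing from the table."""
    table = client.get_table(TABLE_ID)
    existing = {field.name for field in table.schema}
    incoming = {key for record in records for key in record}

    new_fields = [
        # Simplification: model every new metric as a nullable float.
        bigquery.SchemaField(name, "FLOAT", mode="NULLABLE")
        for name in sorted(incoming - existing)
    ]
    if new_fields:
        table.schema = list(table.schema) + new_fields
        client.update_table(table, ["schema"])
```

Because BigQuery permits appending nullable columns in place, existing rows simply read NULL for the new metric and no migration of historical data is required.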

Simulation Results Pipeline

The pipeline for simulation results works as follows:

In a typical run, our simulations get scheduled and run through our simulation service. A graph compute engine transforms the raw simulation output into Avro tables (our chosen data serialization format), which are stored in Google Cloud Storage (GCS). Once this upload to GCS is complete, our simulation service notifies a simple ingestion service via Pub/Sub that there are new simulation results to pull. The ingestion service retrieves the latest data, examines the tables for any newly added metrics (new columns), and makes the necessary changes to the BigQuery schema. It then uses the streaming insert API to add the simulation results to BigQuery. At this point, the data is available to various consumers, such as Jupyter notebooks and Business Intelligence (BI) tools like Looker and Tableau.
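To make the ingestion step concrete, here is a minimal sketch of what such a service might look like in Python. It assumes the Pub/Sub message carries a JSON pointer to the Avro output in GCS; the project, subscription, and table names, as well as the message format, are all hypothetical.

```python
import io
import json

import fastavro
from google.cloud import bigquery, pubsub_v1, storage

# Hypothetical identifiers; the real service's names are not part of this post.
PROJECT_ID = "cruise-sim"
SUBSCRIPTION = "sim-results-ready"
TABLE_ID = "simulation.metrics"

bq_client = bigquery.Client()
gcs_client = storage.Client()

def handle_message(message: pubsub_v1.subscriber.message.Message) -> None:
    # Assumed payload: the GCS location of the newly uploaded Avro table.
    payload = json.loads(message.data)
    blob = gcs_client.bucket(payload["bucket"]).blob(payload["path"])

    # Decode the Avro rows produced by the graph compute engine.
    records = list(fastavro.reader(io.BytesIO(blob.download_as_bytes())))

    # Stream the rows into BigQuery. insert_rows_json returns a list of
    # per-row errors, which is empty on success.
    errors = bq_client.insert_rows_json(TABLE_ID, records)
    if errors:
        raise RuntimeError(f"streaming insert failed: {errors}")
    message.ack()

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION)
streaming_pull = subscriber.subscribe(subscription_path, callback=handle_message)
streaming_pull.result()  # block and process notifications as they arrive
```

Streaming inserts make rows queryable within seconds of arrival, which is what keeps the end-to-end latency from simulation run to analysis in the minutes range.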

Flexible Analysis Tooling

One of the more interesting ways we analyze simulation data is through our custom analysis platform, which is often better suited to the particular needs of AV development than BI tools like Looker or Tableau. This visualization tool allows engineers to quickly generate powerful and AV-specific analysis dashboards on top of simulation results.

To understand how we use this system, let’s dive into an example. Here is a scenario our autonomous vehicles encounter frequently during testing in the complex urban environment of San Francisco: the unprotected left turn.

We are measuring the “Selected Gap” length: the time between when the AV enters the intersection and when an oncoming car enters it. Intuitively, we want to maximize this metric to increase safety.
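One way to compute this is sketched below; it is an illustrative reconstruction based on the description above, not our production implementation, and assumes we have intersection entry timestamps for the AV and for each oncoming car.

```python
def selected_gap_seconds(av_entry_t: float, oncoming_entry_ts: list) -> float:
    """Time between the AV entering the intersection and the next
    oncoming car entering it; larger gaps are safer."""
    upcoming = [t for t in oncoming_entry_ts if t > av_entry_t]
    if not upcoming:
        # The AV turned behind the last oncoming car, so the gap is unbounded.
        return float("inf")
    return min(upcoming) - av_entry_t
```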

This is an example of one of the visualization dashboards we created on top of our simulation analysis database. Here, we are comparing the performance of several simulations on a feature branch versus the base branch.

The x-axis is the headway (the time between oncoming cars) and the y-axis is the car speed. Each cell represents an individual simulation run with slightly different values for these two parameters, and each run generates about a gigabyte of simulation result data.
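A grid like this can be assembled straight from the results warehouse. The sketch below uses pandas and assumes a result frame with branch, headway, speed, and selected_gap columns (illustrative names): it pivots each branch’s runs into a speed-by-headway grid and takes the cell-wise difference against base.

```python
import pandas as pd

def gap_grid(results: pd.DataFrame, branch: str) -> pd.DataFrame:
    """Pivot one branch's runs into a speed x headway grid of selected gaps."""
    runs = results[results["branch"] == branch]
    return runs.pivot_table(index="speed", columns="headway", values="selected_gap")

# Cell-wise difference between the feature and base branches; positive
# cells are parameter combinations where the selected gap improved.
def gap_delta(results: pd.DataFrame) -> pd.DataFrame:
    return gap_grid(results, "feature") - gap_grid(results, "base")
```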

This tool gives us a high-level view of our performance on selected gap length on the feature branch versus the base branch, but it also gives us easy drill-down access when we need additional detail. Each cell in the dashboard is clickable and opens a separate detailed view for that simulation.

Simulation Infrastructure for an Autonomous Future

This analysis platform and its underlying data infrastructure have fundamentally reshaped the AV development workflow at Cruise. Now, AV engineering projects always include metric targets and visualizations built with these tools.

The infrastructure itself has scaled admirably; we are able to ingest gigabytes of simulation result data per second using BigQuery. Because this data is streamed in real time, analysis is possible within minutes, which greatly speeds up the AV development cycle. What’s more, we’ve scaled simulation result ingestion by 10x since the launch of this system.

Though they’ve already been instrumental in improving the AV development workflow, we expect our infrastructure and analysis tools to continue playing a crucial role in our mission to put self-driving cars on the road safely.

If you’re interested in building infrastructure and tools to simulate millions of miles and accelerate the development of autonomous vehicles, join us. You’ll find us in the Infrastructure department under Engineering Productivity.