Self-driving cars won’t learn to drive well if they only copy human behaviour, according to Waymo.

Sure, people behind the wheel are more susceptible to things like road rage or the urge to run red lights, but they’re pretty good drivers overall. When researchers from Waymo, the self-driving arm of Google’s parent company Alphabet, trained an AI system on data taken from 60 days of real, continual driving, they found it couldn’t cope with more complex scenarios it hadn’t seen before.

“We propose exposing the learner to synthesized data in the form of perturbations to the expert’s driving, which creates interesting situations such as collisions and/or going off the road,” the researchers wrote in a paper published on arXiv.

In other words, people generally don’t crash or veer off track often enough for machines to learn from such mistakes. Neural networks are notoriously data hungry; they typically need millions of demonstrations to learn a specific task.

The training data, equivalent to 30 million examples, didn’t cover a wide range of scenarios at a high enough frequency to teach computers how to drive properly. The model would often get stuck behind other cars parked on the side of the road or, worse, crash into them.

“The model learns to respond properly to traffic controls such as stop signs and traffic lights. However, deviations such as introducing perturbations to the trajectory or putting it in near-collision situations cause it to behave poorly, because even when trained with large amounts of data, it may have never seen these exact situations during training,” explained Mayank Bansal and Abhijit Ogale, both Waymo researchers.

There are benefits to copying human behaviour in machine learning, a technique known as imitation learning. Human drivers are able to reason and adapt to new environments, whereas robot operators are more likely to be jittery, stopping and starting frequently when faced with unfamiliar situations and potential dangers.

So, the researchers decided to tweak the training data a little bit in order “to simulate the bad rather than just imitate the good.” The start and end of a particular route, taken from the original training data, are the same but mishaps like sudden changes of the model’s position on the road are added in between.
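The basic idea can be sketched in a few lines of Python. This is a hypothetical helper, not Waymo’s actual perturbation scheme: it keeps the start and end of an expert trajectory fixed, shifts it sideways in the middle, and blends the offset smoothly back to zero at both ends.

```python
import numpy as np

def perturb_trajectory(traj, max_offset=1.0, seed=0):
    """Synthesize a perturbed trajectory: keep the start and end points
    fixed, shift the middle laterally, and blend smoothly back.
    (Illustrative only; Waymo's exact perturbation scheme differs.)"""
    traj = np.asarray(traj, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(traj)
    # Random 2-D offset applied most strongly at the trajectory's middle.
    offset = rng.uniform(-max_offset, max_offset, size=2)
    # Sine bump: weight 0 at both endpoints, 1 at the midpoint.
    w = np.sin(np.pi * np.arange(n) / (n - 1))
    return traj + w[:, None] * offset
```

Training on such synthesized trajectories gives the model examples of drifting off course and recovering, without having to wait for human drivers to make those mistakes on real roads.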

They used these new scenarios to teach a recurrent neural network (RNN), dubbed ChauffeurNet, how to drive and then ran the model on a real car.

ChauffeurNet and the struggles of deep learning

ChauffeurNet is split into two main components: FeatureNet and AgentRNN. FeatureNet is a convolutional neural network that processes the input data and extracts important information such as the roadmap, traffic lights, speed limits, the model’s route, and the current and past positions of other cars. All of this is fed to AgentRNN, which outputs a prediction of the path the car should take next, so that ChauffeurNet can react to its surroundings.
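The two-stage flow can be illustrated with a toy numpy sketch. The classes, layer sizes, and random weights below are stand-ins for the real networks: the point is only the structure, a feature extractor feeding a recurrent head that unrolls one predicted waypoint per step.

```python
import numpy as np

rng = np.random.default_rng(0)

class FeatureNet:
    """Toy stand-in for ChauffeurNet's convolutional feature extractor.
    (The real FeatureNet consumes rendered top-down views of the roadmap,
    traffic lights, route, and agent positions, not a flat vector.)"""
    def __init__(self, in_dim, feat_dim):
        self.W = rng.normal(0, 0.1, (in_dim, feat_dim))
    def __call__(self, x):
        return np.tanh(x @ self.W)          # scene feature vector

class AgentRNN:
    """Toy recurrent head: each step feeds the hidden state back in
    (the recurrent part) and emits one (x, y) waypoint."""
    def __init__(self, feat_dim, hid_dim):
        self.Wf = rng.normal(0, 0.1, (feat_dim, hid_dim))
        self.Wh = rng.normal(0, 0.1, (hid_dim, hid_dim))
        self.Wo = rng.normal(0, 0.1, (hid_dim, 2))
    def rollout(self, feat, steps):
        h = np.zeros(self.Wh.shape[0])
        waypoints = []
        for _ in range(steps):
            h = np.tanh(feat @ self.Wf + h @ self.Wh)
            waypoints.append(h @ self.Wo)
        return np.stack(waypoints)

feature_net = FeatureNet(in_dim=16, feat_dim=8)
agent_rnn = AgentRNN(feat_dim=8, hid_dim=8)
scene = rng.normal(size=16)                 # stand-in for rendered inputs
trajectory = agent_rnn.rollout(feature_net(scene), steps=10)
```

Unrolling the prediction one waypoint at a time is what makes the head recurrent: each predicted point depends on the hidden state carried over from the previous one.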

The outputs from ChauffeurNet, which describe the path that the model should take, are forwarded to a low-level controller. This converts the data to explicit control commands that tell the model which direction to steer, how much to accelerate, and when to brake, so that it can execute the operations in simulation or on real roads.
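A generic controller of this kind can be sketched as follows. This is not Waymo’s code, just a common textbook-style scheme: steering proportional to the heading error toward the next waypoint, and acceleration proportional to the speed error.

```python
import math

def controller(pose, waypoint, speed, target_speed=5.0):
    """Hypothetical low-level controller: turn a desired waypoint into
    (steering, acceleration) commands. pose = (x, y, heading in radians).
    Gains and clipping limits are arbitrary illustration values."""
    x, y, heading = pose
    wx, wy = waypoint
    # Steering: proportional to heading error, wrapped to [-pi, pi].
    desired = math.atan2(wy - y, wx - x)
    err = (desired - heading + math.pi) % (2 * math.pi) - math.pi
    steer = max(-1.0, min(1.0, 2.0 * err))
    # Longitudinal: simple proportional speed control.
    accel = 0.5 * (target_speed - speed)
    return steer, accel
```

Separating planning (ChauffeurNet’s waypoints) from control (this layer) means the learned model never has to reason about steering angles or brake pressure directly.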

To become smarter by imitation learning, the bot car copies the same path taken by the vehicle in the training data. The manual perturbations added by the researchers create deviations that are penalized as “imitation losses,” forcing the model into situations that venture far from ones seen in the training examples. By minimizing these imitation losses, ChauffeurNet learns to recover from potentially dangerous situations: it begins to avoid collisions and stick to lane markings.
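At its simplest, such an imitation loss is just a distance between the model’s predicted trajectory and the expert’s. The mean squared waypoint error below is a simplified stand-in for ChauffeurNet’s fuller set of losses:

```python
import numpy as np

def imitation_loss(pred_traj, expert_traj):
    """Mean squared waypoint error between the predicted and expert
    trajectories (a simplified stand-in for ChauffeurNet's losses)."""
    pred = np.asarray(pred_traj, dtype=float)
    expert = np.asarray(expert_traj, dtype=float)
    return float(np.mean(np.sum((pred - expert) ** 2, axis=-1)))
```

On a perturbed example, the “expert” trajectory is the recovery path back to the original route, so minimizing this loss directly rewards recovering rather than compounding the error.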

Waymo researchers ran ChauffeurNet on a Chrysler Pacifica minivan on its private test track to mimic some of the scenarios faced in simulation. You can see examples where the minivan sticks to its lane and behaves appropriately at stop signs here (scroll down to Real World Driving with model M4).

Using neural networks and deep learning for self-driving cars is a start, but it’s not yet enough to build a sufficiently smart vehicle, the researchers explained. So, ChauffeurNet won’t be rolled out anytime soon.

“Fully autonomous driving systems need to be able to handle the long tail of situations that occur in the real world. While deep learning has enjoyed considerable success in many applications, handling situations with scarce training data remains an open problem,” the researchers wrote.


“Furthermore, deep learning identifies correlations in the training data, but it arguably cannot build causal models by purely observing correlations, and without having the ability to actively test counterfactuals in simulation.”

“Knowing why an expert driver behaved the way they did and what they were reacting to is critical to building a causal model of driving," they said. "For this reason, simply having a large number of expert demonstrations to imitate is not enough. Understanding the why makes it easier to know how to improve such a system, which is particularly important for safety-critical applications.”

Waymo’s current system, known as the “Waymo planner,” uses some machine learning but is mostly rule-based, though the researchers believe that a “completely machine-learned system” will be possible one day.

“Components from such a system can be used within the Waymo planner, or can be used to create more realistic ‘smart agents’ during simulated testing of the planner,” they concluded. ®