Photo by Steve Jurvetson.

Why Tesla, Not Waymo, is the Leader in Self-Driving Car Development

Responding to popular arguments about lidar, disengagements, AI talent, and so on

Neural networks are needed to solve the problems that a self-driving car needs to solve: computer vision, predicting the behaviour of road users, and the planning and execution of driving tasks. More training data makes neural networks perform better. Among self-driving car companies, only Tesla has the capability to train neural networks at the scale of billions of miles. No other company comes close. So, it stands to reason that Tesla will make more progress on self-driving cars than any other company.

When I make this argument, I hear a lot of counter-arguments about Waymo in response. The most common are:

“Self-driving cars need lidar.”

“Waymo is years ahead of Tesla.”

“Google and DeepMind are the world leaders in machine learning, so Waymo is the leader in self-driving cars.”

“Waymo has the lowest rate of disengagements by safety drivers.”

“Waymo is already operating a self-driving taxi service.”

Take a closer look at these arguments, and I think you’ll find compelling reasons to doubt them.

“Self-driving cars need lidar.”

Short response: There are lots of important things that are invisible to lidar and only visible to cameras. With or without lidar, camera-based computer vision has to be great. If it’s great, lidar may not be necessary. The best way to make it great is to ditch lidar and go for scale of training data instead.

Long response: Lidar works by measuring depth, and lots of important features of the driving environment don’t have any depth. Some are paint-based: lane lines, stop lines, turn arrows, and crosswalks. Others are light-based signals: brake lights, turn signals, hazard lights, and traffic lights. Finally, there are signs. To drive safely and well, a self-driving car needs to recognize all these features with high accuracy. Only cameras can see them.

If camera-based computer vision isn’t accurate enough at recognizing depthless features, lidar can’t help. If it is accurate enough, then this calls into question the necessity of lidar. If cameras alone are good enough at recognizing things without depth, what would prevent cameras from also being good enough at recognizing things with depth?

Lidar’s resolution is also too low to pick up small features, like pedestrians’ facial expressions or subtler forms of body language. Cameras can capture these visual cues, and neural networks can use them to anticipate pedestrians’ movements. Small obstacles like plastic bags (harmless) and cinder blocks (dangerous) may also be indistinguishable to lidar.

Using lidar doesn’t preclude using cameras, but at current lidar prices it does preclude using cameras at scale. Eventually, we’ll have affordable lidar, but eventually isn’t today, and today if you want to put a million camera-equipped cars on the road, you can’t also equip those cars with lidar. (At least not the sort of high-grade lidar used by Waymo.) Having a million cars on the road helps solve the camera-based vision problems by scaling up training data. Once you solve those problems, you don’t need lidar. If you haven’t solved them, lidar won’t help. So, why not maximize your chances of solving camera-based vision?

“Waymo is years ahead of Tesla.”

Short response: The time gap is not as long as it might first appear because Waymo (seemingly) only started using deep neural networks in 2015.

Long response: Deep neural networks are a foundational technology for self-driving cars. Without them, self-driving cars would most likely be impossible. (Even with them, they might prove to be impossible. We’ll see.) Waymo started in 2009, a few years before deep neural networks were popularized in the AI research community. In an alternate history where deep neural networks didn’t take off, it’s possible that Waymo would have never graduated from Google’s “moonshot factory”.

Waymo first tried applying deep neural networks to pedestrian detection in 2015. That first deep neural network-based system performed 100 times better than its predecessor. Getting computer vision to surpass human vision is an open challenge even with deep learning; with older techniques that performed 100 times worse, the problem doesn’t look tractable at all.

Tesla started applying deep neural networks to computer vision problems no later than 2016. In 2017, Tesla deployed the second generation of Autopilot, which uses Tesla’s own computer vision neural networks. Based on this information, Waymo’s head start on applying deep neural networks to autonomous driving appears to be just one year at most, and possibly zero years. Waymo doesn’t have the seven-year head start that its founding date might seem to imply.

Applying deep neural networks to actual driving actions, as opposed to perception, is even more recent work. DeepMind popularized deep reinforcement learning in 2013 by demonstrating its efficacy on Atari games. Just as Waymo’s early computer vision systems probably never would have worked adequately, I personally suspect that much of the early work on hand-programming driving actions will eventually have to be thrown out in favour of a neural network-based approach. Researchers and engineers who work on self-driving cars seem to be increasingly warming up to ideas like deep reinforcement learning and deep imitation learning.

“Google and DeepMind are the world leaders in machine learning, so Waymo is the leader in self-driving cars.”

Short response: For Waymo to develop self-driving cars at its small scale, machine learning researchers might need to make new fundamental breakthroughs. By contrast, Tesla’s plan is simply to scale up existing techniques that have proven successful in other applications.

Long response: An Olympic gold medalist is better than the bronze medalist, or the person who finished in fifth place. But these are only incremental differences. Force Usain Bolt to wear ankle weights, and the other runners will sail past him. Being better is only enough to win if the competition is fair.

The competition between Waymo and Tesla isn’t fair. Waymo is trying to solve machine learning problems with scarce data, and Tesla is trying to solve those same problems with abundant data. Solving problems with abundant data is the modus operandi of contemporary machine learning. Solving problems with scarce data is an open research problem. If Google or DeepMind were able to make the requisite breakthroughs to get as much neural network performance out of Waymo’s scarce data as Tesla’s abundant data, the implications would reach beyond self-driving cars to robotics in general, and other applications of machine learning. It would be a big deal, and we would see it.

Since Tesla’s fleet is driving approximately 400 times as many miles per month as Waymo’s, Tesla’s fleet might encounter 1,200 times per year a situation that Waymo’s fleet encounters only three times. Researchers are working on approaches like one-shot imitation learning that may one day make it possible for a self-driving car to learn from a single example of a situation, or just a handful. For now, that’s still an open problem. On the other hand, we already have proofs of concept like DeepMind’s AlphaStar showing that an artificial agent can master complex behaviours via imitation learning and reinforcement learning if it trains on a massive number of human demonstrations and a massive amount of trial and error. Training on large-scale data works today. Training on small-scale data might work tomorrow.
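The arithmetic behind that comparison is just the fleet mileage ratio applied to a hypothetical rare event; a quick sketch, using the article’s own figures:

```python
# Rare-situation encounter rates implied by the fleet mileage ratio.
# Figures are the ones quoted in the article, not measured data.
waymo_encounters_per_year = 3
mileage_ratio = 400  # Tesla's fleet drives ~400x the miles per month

tesla_encounters_per_year = waymo_encounters_per_year * mileage_ratio
print(tesla_encounters_per_year)  # 1200
```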

It remains an open question whether any scale of data collection, even using every car on Earth, will be sufficient to solve autonomous driving with existing techniques. If Waymo’s scale isn’t enough to solve it, Tesla’s may not be either. So, why draw a distinction between Waymo and Tesla at all?

We don’t have strong evidence, but we do have a hint. We can compare the scale of data used to solve StarCraft and observe that Tesla’s fleet is in the same ballpark. AlphaStar trained on tens of thousands of years of continuous play; Tesla’s fleet will drive continuously for tens of thousands of years every year. DeepMind trained 300 versions of AlphaStar for 200 years each, totalling 60,000 years. Tesla has 500,000 vehicles driving one hour per day, which comes to 20,000 years per year. Waymo’s fleet, by contrast, is driving less than 500 years per year, and has driven less than 500 years in its entire history.
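The ballpark comparison above checks out arithmetically; here is the calculation spelled out, using the figures quoted in the paragraph (the one-hour-per-day driving estimate is the article’s assumption, not a measured number):

```python
# AlphaStar's training scale vs. fleet driving scale, per the article's figures.
alphastar_agents = 300
years_per_agent = 200
alphastar_total_years = alphastar_agents * years_per_agent
print(alphastar_total_years)  # 60000

tesla_vehicles = 500_000
hours_per_vehicle_per_day = 1  # assumed average driving time

# Fleet-hours accumulated per calendar year, converted to
# years of continuous (24/7) driving.
fleet_hours_per_year = tesla_vehicles * hours_per_vehicle_per_day * 365
tesla_driving_years_per_year = fleet_hours_per_year / (24 * 365)
print(round(tesla_driving_years_per_year))  # 20833
```

So both AlphaStar’s training and a year of Tesla fleet driving land in the tens of thousands of continuous years, while Waymo’s fleet stays in the hundreds.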

“Waymo has the lowest rate of disengagements by safety drivers.”

Short response: The rate of disengagements reported to the California DMV is not the total disengagement rate. We also have no Tesla data with which to do a comparison.

Long response: Reid Beer, a Waymo beta tester, says that a human safety driver takes over “once in every five rides or so”. If rides are about 8 miles long on average, then that’s a disengagement about every 40 miles. This is a far cry from the rate of one disengagement per 11,000 miles reported to the California DMV. What gives?

The California DMV doesn’t ask for the total disengagement rate. It asks only for safety-related disengagements: instances where a collision might have occurred if the safety driver hadn’t intervened. Apparently, safety-related disengagements are a tiny percentage of total disengagements.
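The size of the gap between the observed rate and the reported rate follows directly from the figures above; a quick check:

```python
# Observed vs. reported disengagement rates, using the figures quoted above.
rides_per_disengagement = 5   # beta tester's estimate
miles_per_ride = 8            # assumed average ride length

observed_miles_per_disengagement = rides_per_disengagement * miles_per_ride
print(observed_miles_per_disengagement)  # 40

reported_miles_per_disengagement = 11_000  # California DMV filing
gap = reported_miles_per_disengagement / observed_miles_per_disengagement
print(round(gap))  # 275
```

In other words, if these figures hold, fewer than one in roughly 275 disengagements would qualify as safety-related under Waymo’s reporting criteria.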

Employees have to assess whether a disengagement was safety-related, and that’s partly a subjective judgment call. For example, Cruise decided that running a red light wasn’t a safety-related reason to intervene.

Tesla doesn’t do what the California DMV would consider fully autonomous testing on public roads in California, so it doesn’t report any numbers to the California DMV. It’s therefore not possible to compare the safety-related disengagement rates of Waymo and Tesla (even if we could guarantee that employees assess what is safety-related in the same way).

“Waymo is already operating a self-driving taxi service.”

Short response: Taking beta testers along for test driving isn’t in itself a sign of technical capability, and it doesn’t contribute to technical progress.

Long response: Any company that tests prototype self-driving cars on public roads could put non-employees in the back seat. But why? This adds liability, inconvenience, and expense, and it in no way helps a self-driving car learn to see better, predict road users’ behaviour better, or make better driving decisions. Uber took some Pittsburgh passengers for rides in its prototype cars for a limited time. Lots of companies give occasional test rides to non-employees in the back seat. Waymo just gives test rides on the largest scale; that doesn’t necessarily imply Waymo’s technology is the best.