Summary: This short post talks about Mobileye’s new paper (regarding a formal approach to Autonomous Vehicle (AV) safety). It claims that the paper has several issues, but is nevertheless an important start.

Mobileye came out with a paper titled “On a Formal Model of Safe and Scalable Self-driving Cars” (Bloomberg coverage, summary paper, full pdf). Their main contribution is a thoughtful, detailed, formal description of AV safety – do take a look.

I disagree with some of what they say (see below), but I think the paper is important. I especially like their audacity – trying to formally define safety once and for all. Even if it can’t quite be done, I think this is the start of something good.

I was originally going to write a longer post, summarizing the paper and discussing it. However, that got delayed (I do have excellent excuses), and in the meantime the ever-alert Junko Yoshida of EETimes wrote an initial summary / critique of the paper, and then interviewed Phil Koopman (whom I mentioned before) and Missy Cummings. These experts said most of what I wanted to say (and some things I did not think about), so I decided to just quickly summarize that below, and add what I thought was missing.

I agree with the following views: Koopman and Cummings said, among other things (paraphrased, and perhaps mangled, by me):

They start by expressing similar views to what I said above: The paper oversimplifies somewhat, but is an important start.

They emphasize that SW bugs (mentioned but not further covered in the paper) could be a big issue (I’d add that HW logic bugs should also be considered).

They mention that there are many papers about provable robot safety, which need to be reconciled / considered together with the Mobileye paper.

They discuss all the complexity created by the need for mutual understanding between AVs and human drivers (see my post about that).

They discuss all kinds of possible surprises and unintended consequences. Koopman mentioned that “It’s hard to believe that lidar and radar failure independence will work out as well as the discussion assumes”.

Koopman also mentioned the possibility of ML systems learning to game the system: “…what if there’s is a loophole due to measurement uncertainty or pessimistic assumptions that need to be made in practice? It’s likely that machine learning systems will find any such loophole and exploit it. And probably we won’t think of them all in advance”. (To me, this is just one more case where it’s the spec bugs that kill you).

Just one more thing: The one point I’d like to emphasize (beyond all of the above) is that I think the Mobileye authors downplay too much the role of verification-via-simulation. For instance, they say:

… Assume we desire an accident probability of 10⁻⁹ per hour, and a certain AV system provides only 10⁻⁸ probability. Even if we obtain 10⁸ hours of driving, there is constant probability that our validation process will not be able to tell us that the system is dangerous.

Thus, they say (in Corollary 1 of the paper, which indeed follows from their Lemma 1) that “Any deterministic validation procedure” with a limited number of samples will not be enough to absolutely ensure safety.
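To get a feel for the numbers in that quote, here is a back-of-the-envelope sketch (mine, not the paper's proof). It assumes each driving hour is an independent trial with a fixed accident probability, which is a simplification, and the function name is just something I made up for illustration:

```python
import math

def prob_no_accident(p_per_hour: float, hours: float) -> float:
    """Probability of seeing zero accidents over `hours` driving hours,
    assuming (simplistically) that hours are independent trials."""
    # log1p keeps precision when p_per_hour is tiny.
    return math.exp(hours * math.log1p(-p_per_hour))

target_rate = 1e-9   # the desired accident probability per hour
actual_rate = 1e-8   # what the hypothetical AV system really provides
hours = 1e8          # amount of validation driving

# Chance that the too-dangerous system shows a spotless record anyway.
p_miss_actual = prob_no_accident(actual_rate, hours)    # about 0.37

# Chance that a system meeting the target shows a spotless record.
p_clean_target = prob_no_accident(target_rate, hours)   # about 0.90

print(f"dangerous system looks clean: {p_miss_actual:.3f}")
print(f"compliant system looks clean: {p_clean_target:.3f}")
```

So under these assumptions, roughly one time in three the 10x-too-dangerous system would sail through 10⁸ hours with no accident at all, which is the "constant probability" the quote refers to.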

Yes, they are right in saying that (and also in saying that “validating that the simulator faithfully represents reality is as hard as validating the policy itself”). Nevertheless, I think there is no way to do the job without edge-case-based, end-to-end, CDV-style verification-via-simulation. This kind of verification is probably the main way to find SW bugs, spec bugs and so on (e.g. see this post about Coverage Driven Verification).

Consider also that without this kind of verification, chip design would still be stuck at the gate (sorry for the bad pun). It does not come with any formal guarantee that systems will be safe, but (in the complex, amorphous world as discussed above) it is perhaps the main tool for making them safe.

The paper probably assumes that simulations will be run using the expected distribution, rather than bug-finding distributions. This is probably not enough (see also the discussion of expected vs. unexpected bugs here).
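As a toy illustration of the difference between the two kinds of distribution, consider the sketch below. Everything in it is hypothetical: the "cut-in gap" parameter, the 2 m bug threshold and both sampling ranges are numbers I invented, not anything from the paper or from a real verification tool.

```python
import random

def is_edge_case(gap_m: float) -> bool:
    # Hypothetical bug trigger: another car cuts in with a gap under 2 m.
    return gap_m < 2.0

def expected_gap(rng: random.Random) -> float:
    # Stand-in for the "expected" distribution: gaps spread over 0..200 m.
    return rng.uniform(0.0, 200.0)

def bug_finding_gap(rng: random.Random) -> float:
    # Stand-in for a bug-finding distribution: deliberately biased
    # toward small gaps, where trouble is more likely.
    return rng.uniform(0.0, 10.0)

def hit_rate(sample_gap, runs: int, seed: int = 0) -> float:
    """Fraction of simulated runs that reach the edge case."""
    rng = random.Random(seed)
    hits = sum(is_edge_case(sample_gap(rng)) for _ in range(runs))
    return hits / runs

runs = 10_000
nominal = hit_rate(expected_gap, runs)     # theoretical rate: 2/200 = 1%
biased = hit_rate(bug_finding_gap, runs)   # theoretical rate: 2/10 = 20%

print(f"expected distribution hits the edge case in {nominal:.2%} of runs")
print(f"bug-finding distribution hits it in {biased:.2%} of runs")
```

The biased runs say nothing about how often the edge case happens on real roads; their job is to reach the corner (and whatever bug hides there) with far fewer simulations.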

To put it even more bluntly:

I assume that (unfortunately) there will be no way to fully, formally verify AV safety within the next 15 years (more on that here).

Yet not deploying AVs in that time frame would probably be wrong – morally (we can prevent so many accidents) and economically.

So we should verify AVs as best we can (using a mix of the best tools), explain all this to regulators and the public (not easy), and agree on the right threshold and rate for AV introduction.

I still like the Mobileye approach: While this post concentrated on what’s missing in the Mobileye paper, I (and apparently the experts interviewed above) tend to agree with much of it, and to consider it an important start. Note also that the authors of the paper know much more than me about AV safety, vision, ML, and in fact just about any relevant topic (with the possible exception of verification).

Finally, I simply liked the paper (see also my review of Mobileye’s previous, related paper). I think some of the notations / concepts they introduced will also be helpful for creating edge-case-based scenarios for verification-via-simulation.

Notes

I’d like to thank Amiram Yehudai, Sankalpo Ghose, Gil Amid, Sandeep Desai and Yael Feldman for commenting on earlier drafts of this post.

[30-Oct-2017 – hopefully fixed EETimes links]