[Update from Feb 1 2011: I’ve written a new post that adds some technical details.]

Here’s a little thought experiment:

You’re spending the day visiting the Utah computer science department. We’re chatting and I describe a software verification project where my group has proved that some particular version of OpenSSH is memory safe: it will never execute an array bounds violation, null pointer dereference, or similar. Since you are perhaps slightly gullible, you believe me. Now I ask you to look up and for the first time, you notice that a large piano is hanging from my office ceiling, directly above your head. The rope from which the piano is hanging is securely tied off to an I-beam, so there’s no reason to worry about it falling on you accidentally. However, the rope passes through a small box that is connected to my desktop computer using a USB cable. I explain to you that this box contains a very sharp, remotely-activated knife that can instantly sever the rope when commanded to by the computer (don’t worry about this software being wrong, I verified it too). Now, since you believe my verification result, you should be perfectly willing to keep sitting in your chair while I attack OpenSSH using some sort of highly advanced fuzz tool that tries to crash it by generating evil inputs. Of course, I have rigged OpenSSH’s SIGSEGV handler so that if it ever fires, it will command the USB knife to sever the rope — and you will be squished. Do you keep sitting in the chair?

My opinion is that a person would have to be a bit foolish to sit under the piano. But forget about my work: how many verification researchers believe in their own work to this level? Probably not too many. The first problem is one of semantics: the word “verified” is used extremely loosely by members of the verification community (I recently reviewed a paper that kept saying things like “we rigorously prove” and “we formally verify” — despite the fact that the paper contained nothing but proof sketches). The second problem is deeper: it is extremely difficult to argue convincingly that a verification result is correct. By “correct” I mean not only “mathematically sound” but also “the result is faithful to the intent of the verification effort.”

Although the picture I’ve painted about dropping a piano is silly, the situation is realistic. For example, every time you use the transportation system in a developed country, you are gambling that the benefit of using the system outweighs the risk that some embedded computer will malfunction, harming you. Of course, we willingly make this bet every day, but not because the stoplights and stability control systems are verified (they aren’t) but because we assume that if these systems worked well yesterday, they’ll work well today. The adage (supposedly from NASA) is “test what you fly, fly what you test” and it captures a fundamental fact about how we create reliable engineered systems: systems are tested in realistic environments and failures are rigorously studied and the results are fed back into the design of the next iteration of the system. This is not incompatible with formal verification, but it’s also not a point of view that comes out very often in the verification literature.