Discovery Probability Quantifies Residual Risk¶

Alright. You have gotten a hold of a couple of powerful machines and used them to fuzz a software system for several months without finding any vulnerabilities. Is the system vulnerable?

Well, who knows? We cannot say for sure; there is always some residual risk. Testing is not verification. Maybe the next test input that is generated reveals a vulnerability.

Let's say residual risk is the probability that the next test input reveals a vulnerability that has not been found, yet. Böhme [Unresolved citation: stads.] has shown that the Good-Turing estimate of the discovery probability is also an estimate of the maxmimum residual risk.

Proof sketch (Residual Risk). Here is a proof sketch that shows that an estimator of discovery probability for an arbitrary definition of species gives an upper bound on the probability to discover a vulnerability when none has been found: Suppose, for each "old" species A (here, execution trace), we derive two "new" species: Some inputs belonging to A expose a vulnerability while others belonging to A do not. We know that only species that do not expose a vulnerability have been discovered. Hence, all species exposing a vulnerability and some species that do not expose a vulnerability remain undiscovered. Hence, the probability to discover a new species gives an upper bound on the probability to discover (a species that exposes) a vulnerability. QED.

An estimate of the discovery probability is useful in many other ways.