What is Bayes’s Theorem?

(Feel free to skip this section if you already understand Bayes’s Theorem.)

Bayes’s Theorem is fundamentally based on the concept of the “validity of beliefs”. Reverend Thomas Bayes was a Presbyterian minister and a mathematician who thought a great deal about developing a proof of the existence of God. He came up with the theorem in the 18th century (it was later refined by Pierre-Simon Laplace) to establish or correct the validity of ‘existing’ or ‘previous’ beliefs in the face of the best available ‘new’ evidence. Think of it as an equation for correcting prior beliefs based on new evidence.

One popular example used to explain Bayes’s Theorem is detecting whether or not a patient has a certain disease.

The key concepts in the theorem are as follows:

Event: An event is a fact. The patient truly having the disease is an event. The patient truly NOT having the disease is also an event.

Test: A test is a mechanism to detect whether a patient has the disease (or, alternatively, a test devised to prove that a patient does not have the disease; note that these are not the same test).

Subject: A subject is a patient who may or may not have the disease. A test is devised for the subject either to detect the presence of the disease or to prove that the disease is absent.

Test Reliability: A test devised to detect the disease may not be 100% reliable. It may fail to detect the disease in a subject who truly has it; such results are called false negatives. It may also indicate the disease in a subject who truly does not have it; such results are called false positives.

Test Probability: This is the probability that the test detects the event (disease) in a given subject (patient). It does not account for the Test Reliability.

Event Probability (Posterior Probability): This is the “corrected” test probability of detecting the event in a given subject, taking into account the reliability of the devised test.

Belief (Prior Probability): A belief, also called a prior probability (or prior for short), is the subjective assumption that the disease exists in a patient (based on symptoms or other subjective observations) before the test is conducted. This is the most important concept in Bayes’s Theorem. You need to start with the priors (or beliefs) before you can make corrections to them.

The following is the equation that brings the stated concepts together.
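Written in the standard form for two mutually exclusive and collectively exhaustive events (a reconstruction that uses the symbols A1, A2, Ai and B as they are defined in the text, not necessarily the exact layout of the original equation):

$$
P(A_i \mid B) = \frac{P(B \mid A_i)\, P(A_i)}{P(B \mid A_1)\, P(A_1) + P(B \mid A_2)\, P(A_2)}
$$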

In the equation,

A1 and A2 are the events. They are mutually exclusive and collectively exhaustive. Let A1 mean that the disease is present in the subject and A2 mean that the disease is absent.

Let Ai refer to either one of the events A1 or A2.

B is the test devised to detect the disease; in the equation, B stands for that test showing a positive result. (Alternatively, B could be a test devised to prove that the disease does not exist in the subject. Again, note that these are two completely different tests.)

Let us say there is a population of people (in a random city) where there is a prior belief (based on some random observation, which may or may not be subjective) that 5% of the population “has the disease”. So, for any given subject in the population, the prior probability P(A1) “has the disease” is 5% and the prior probability P(A2) “does not have the disease” is 95%.

Let’s say the test ‘B’, which is devised to “detect” the presence of the disease, has a reliability of 90% (in other words, it detects the disease in only 9 out of 10 patients who truly have it). Written mathematically, the probability that the test detects the disease when the disease is truly present is P(B|A1) = 0.9.

Unfortunately, the test ‘B’ also has a flaw: it sometimes shows that the patient has the disease even when the disease is truly not present. Let us say that 2 out of 10 patients who really do not have the disease are falsely detected as having it. Mathematically, P(B|A2) = 0.2.

Now, if you randomly select a subject from the population and conduct the test, and the result is positive (that is, the test says the patient has the disease), can we calculate the “event probability” (or posterior probability) that the person truly has the disease?

Mathematically, we want to calculate P(A1|B), which can be read as the probability of A1 (presence of the disease) given B (a positive test result).

So let’s assign values to each of the probabilities.

Prior Probability of person having the disease = P(A1) = 0.05

Prior Probability of person NOT having disease = P(A2) = 0.95

Conditional Probability that the test shows positive, given that the person truly does have a disease = P(B|A1) = 0.9

Conditional Probability that the test shows positive, even if the person truly does NOT have a disease = P(B|A2) = 0.2

What is the “event probability” that a randomly selected person from the population, who took the test and tested positive, truly has the disease? In other words, what is P(truly has disease given test is positive) = P(A1|B)?

The posterior probability can be calculated based on Bayes’s Theorem as follows:
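Substituting the values listed above into the two-event form of Bayes’s Theorem gives the following (a worked reconstruction of the arithmetic from the stated numbers):

$$
P(A_1 \mid B) = \frac{P(B \mid A_1)\, P(A_1)}{P(B \mid A_1)\, P(A_1) + P(B \mid A_2)\, P(A_2)} = \frac{0.9 \times 0.05}{0.9 \times 0.05 + 0.2 \times 0.95} = \frac{0.045}{0.235} \approx 0.19
$$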

So the posterior probability of the person truly having the disease, given that the test result is positive, is only about 19%! Note the stark difference in the corrected probability, even though the test detects the disease 90% of the time. Why do you think this is the case?

The answer lies in the ‘priors’. The “belief” that only 5% of the population has the disease is the main reason for the 19% posterior probability. It’s easy to show this. Change your prior belief (all else being equal) from 5% to, let’s say, 30%. Then you get the following results.
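With a 30% prior, P(A1) = 0.30 and P(A2) = 0.70, while P(B|A1) = 0.9 and P(B|A2) = 0.2 stay the same. The calculation then works out as follows (again a worked reconstruction from the stated numbers):

$$
P(A_1 \mid B) = \frac{0.9 \times 0.30}{0.9 \times 0.30 + 0.2 \times 0.70} = \frac{0.27}{0.41} \approx 0.659
$$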

Note that the posterior probability for the same test with a higher prior jumped significantly, to about 66%.
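The two calculations can also be checked with a few lines of Python. This is only a minimal sketch: the function name `posterior` and its parameter names are made up for illustration, and it assumes exactly two events (disease present, disease absent), as in the example.

```python
def posterior(prior, p_pos_given_disease, p_pos_given_healthy):
    """Return P(A1|B) for the two-event case used in this example.

    prior               -- P(A1), the belief that the disease is present
    p_pos_given_disease -- P(B|A1), positive test when the disease is present
    p_pos_given_healthy -- P(B|A2), positive test when the disease is absent
    """
    true_pos = p_pos_given_disease * prior            # P(B|A1) * P(A1)
    false_pos = p_pos_given_healthy * (1.0 - prior)   # P(B|A2) * P(A2)
    return true_pos / (true_pos + false_pos)


low_prior = posterior(0.05, 0.9, 0.2)
high_prior = posterior(0.30, 0.9, 0.2)

print(f"prior 5%  -> posterior {low_prior:.3f}")   # prints 0.191
print(f"prior 30% -> posterior {high_prior:.3f}")  # prints 0.659
```

Only the first argument changes between the two calls, so the whole difference in the result comes from the prior.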

Hence, with all evidence and tests being equal, the result of Bayes’s Theorem is strongly influenced by the prior. If you start with a very low prior, the posterior probability will remain closer to the prior (that is, lower) even in the face of strong evidence.
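One way to see why a low prior dominates is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical population of 10,000 (a number chosen only to make the counts round; it does not come from the example) and uses the same 5% prior, 90% detection rate and 20% false positive rate. The function and variable names are illustrative.

```python
def count_positives(population, prior, p_pos_given_disease, p_pos_given_healthy):
    """Split a population into true and false positives (rounded to whole people)."""
    sick = round(population * prior)
    healthy = population - sick
    true_positives = round(sick * p_pos_given_disease)
    false_positives = round(healthy * p_pos_given_healthy)
    return true_positives, false_positives


# Hypothetical population of 10,000 with the numbers from the example.
true_positives, false_positives = count_positives(10_000, 0.05, 0.9, 0.2)
share_truly_sick = true_positives / (true_positives + false_positives)

print(true_positives)    # 450 people have the disease and test positive
print(false_positives)   # 1900 people are healthy but test positive
print(f"{share_truly_sick:.3f}")  # 0.191
```

Because healthy people outnumber sick people so heavily, the 20% of them who test positive by mistake swamp the true positives, which is why the posterior stays close to 19%.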

A prior is not something you randomly make up. It should be based on observations, even if they are subjective. There should be some justification for why someone holds a belief before a percentage is assigned to it.