By definition, the probability of event A given event B is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

And so we can play the following game:

$$P(A \cap B) = P(B \cap A)$$

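(This just means that the probability of A and B is the same as the probability of B and A.) Applying the definition of conditional probability to each side, this is equivalent to

$$P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$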

Now we can divide both sides by P(A), and we get Bayes’ Theorem:

$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}$$

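Furthermore, if event A is partitioned into several parts by mutually exclusive, exhaustive events B_1, ..., B_n, then we can expand P(A) as a sum over those parts and write Bayes’ theorem this way:

$$P(B_j \mid A) = \frac{P(A \mid B_j)\,P(B_j)}{\sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)}$$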

In my experience, this is usually how Bayes’ theorem is presented.
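
The partitioned form is easy to sketch in code. Here P(A) is rebuilt by summing P(A|B_i) P(B_i) over every part, and that sum becomes the denominator. This is only a minimal sketch: the function name `bayes_partition` and the three-part numbers are made up for illustration.

```python
def bayes_partition(likelihoods, priors, j):
    """Return P(B_j | A), given P(A | B_i) and P(B_i) for every part B_i."""
    # The parts B_i must cover everything exactly once, so the priors sum to 1.
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1")
    # Law of total probability: P(A) = sum over i of P(A | B_i) * P(B_i)
    p_a = sum(lik * pri for lik, pri in zip(likelihoods, priors))
    return likelihoods[j] * priors[j] / p_a


# Hypothetical three-part example (numbers invented for illustration).
likelihoods = [0.9, 0.2, 0.1]   # P(A | B_1), P(A | B_2), P(A | B_3)
priors = [0.1, 0.3, 0.6]        # P(B_1), P(B_2), P(B_3)

posteriors = [bayes_partition(likelihoods, priors, j) for j in range(3)]
print(posteriors)
```

Because the same denominator is used for every part, the posteriors over all the B_i add up to 1.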

But let’s go back to the basic form of Bayes’ theorem. I’m going to write it slightly differently:

$$P(B \mid A) = P(A \mid B) \times \frac{P(B)}{P(A)}$$

This form makes it obvious that we simply have to multiply P(A|B) by the ratio P(B)/P(A) to get P(B|A). Let’s go through an example.

Suppose a patient shows up to the emergency room with a symptom: a migraine headache. We know that P( migraine | brain bleed ) is very high; let’s give it a fake number: 0.98. That seems bad. Basically, almost anyone who has a brain bleed has a migraine.

But what we really want to know is P( brain bleed | migraine ). How do we get this number? We simply multiply P( migraine | brain bleed ) by the ratio of brain bleeds to migraines in the population. That’s what Bayes’ rule says! With A as the migraine and B as the brain bleed, you take the likelihood P(A|B) and multiply it by P(B)/P(A)!

Let’s say that a migraine shows up on any given day in 1% of the population. But a brain bleed shows up on any given day in 0.000001% of the population. (I’m totally making these numbers up.) Then we would multiply 98% by 0.000001/1 to get the probability that it’s a brain bleed, which would yield a probability of 0.000098%, or roughly one in a million.
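
To double-check the arithmetic, here is a small sketch of the same calculation. The helper name `posterior` is hypothetical, and the inputs are the same made-up numbers as in the example, with the percentages converted to plain probabilities.

```python
def posterior(likelihood, prior_b, prior_a):
    """Bayes' theorem in ratio form: P(B|A) = P(A|B) * (P(B) / P(A))."""
    return likelihood * (prior_b / prior_a)


# Made-up numbers from the example, written as probabilities rather than percents.
p_migraine_given_bleed = 0.98   # P(migraine | brain bleed)
p_migraine = 1 / 100            # 1% of the population on any given day
p_bleed = 0.000001 / 100        # 0.000001% of the population on any given day

p_bleed_given_migraine = posterior(p_migraine_given_bleed, p_bleed, p_migraine)

print(f"P(brain bleed | migraine) = {p_bleed_given_migraine:.2e}")
print(f"as a percentage: {p_bleed_given_migraine * 100:.6f}%")
```

With these inputs the ratio P(B)/P(A) is one in a million, so even a likelihood of 0.98 leaves the posterior at about one in a million.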