A student recently asked me if I could show them why Bayes Theorem looks the way it does. If we are trying to determine the probability of A given B, how did we arrive at the theorem in the title?
Let’s create some data and see if we can sort it out. We will simulate a 2×2 table, similar to one we might find in sports medicine journals when looking at some type of test (positive or negative) and some type of outcome (disease or no disease).
```r
dat_tbl <- data.frame(
  test = c("positive", "negative", "total"),
  disease = c(25, 5, 30),
  no_disease = c(3, 40, 43),
  total = c(28, 45, 73)
)

dat_tbl
```
A logical question we’d like to answer here is, “What is the probability of disease given a positive test?” Written in probability notation, we are asking for p(Disease | Positive).
Using the data in the table, we can quickly compute this as the 25 people who tested positive and had the disease divided by the 28 total positive tests: 25/28 = 89.3%.
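As a quick sketch, we could also pull those counts directly out of the `dat_tbl` data frame we built above, rather than typing the numbers in by hand (the variable names `true_pos` and `all_pos` here are just my own labels):

```r
# Rebuild the 2x2 table from above
dat_tbl <- data.frame(
  test = c("positive", "negative", "total"),
  disease = c(25, 5, 30),
  no_disease = c(3, 40, 43),
  total = c(28, 45, 73)
)

# p(Disease | Positive) = true positives / all positive tests
true_pos <- dat_tbl$disease[dat_tbl$test == "positive"]  # 25
all_pos <- dat_tbl$total[dat_tbl$test == "positive"]     # 28
true_pos / all_pos
```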
Of course, we could also compute this using Bayes Theorem:
p(Disease | Positive) = p(Disease) * p(Positive | Disease) / p(Positive)
We store the necessary values in R objects and then compute the result:
```r
### What we want to know: p(Disease | Positive)
# p(Disease | Positive) = p(Disease) * p(Positive | Disease) / p(Positive)
# p(A | B) = p(A) * p(B | A) / p(B)

p_disease <- 30/73
p_positive_given_disease <- 25/30
p_positive <- 28/73
p_no_disease <- 43/73
p_positive_given_no_disease <- 3/43

# The denominator expands p(Positive) via the law of total probability:
# p(Positive) = p(Disease) * p(Positive | Disease) +
#               p(No Disease) * p(Positive | No Disease)
p_disease_given_positive <- (p_disease * p_positive_given_disease) /
  (p_disease * p_positive_given_disease +
     p_no_disease * p_positive_given_no_disease)

p_disease_given_positive
```
This checks out. We get the exact same result as when we did 25/28.
Okay, how did we get here? Why does it work out like that?
The math works out because we start with two different joint probabilities, p(A ∩ B) and p(B ∩ A). Or, in our case, p(Disease ∩ Positive) and p(Positive ∩ Disease). Formally, we can write these as follows:

p(Positive ∩ Disease) = p(Positive | Disease) * p(Disease)

p(Disease ∩ Positive) = p(Disease | Positive) * p(Positive)
We’ve already stored the necessary probabilities in R objects above, but here is what they both look like using our 2×2 table. First I’ll calculate each one with the counts directly from the table, and then with the R objects that we stored. You’ll see the answers are the same.
```r
## Joint Probability 1: p(Positive ∩ Disease) = p(Positive | Disease) * p(Disease)
25/30 * 30/73
p_positive_given_disease * p_disease

## Joint Probability 2: p(Disease ∩ Positive) = p(Disease | Positive) * p(Positive)
25/28 * 28/73
p_disease_given_positive * p_positive
```
Well, would you look at that! The two joint probabilities are equal to each other. We can formally test that they are equal to each other by setting up a logical equation in R.
```r
## These two joint probabilities are equal!
# p(Positive ∩ Disease) = p(Disease ∩ Positive)
(p_positive_given_disease * p_disease) == (p_disease_given_positive * p_positive)

# Exact == comparisons on floating point values can be fragile;
# all.equal() is the safer check:
isTRUE(all.equal(p_positive_given_disease * p_disease,
                 p_disease_given_positive * p_positive))
```
So, if they are equal, what we are saying is this:
p(Disease | Positive) * p(Positive) = p(Positive | Disease) * p(Disease)
Now, with some algebra, we can divide both sides of the equation by p(Positive) and we are left with Bayes Theorem for our problem:
p(Disease | Positive) = p(Disease) * p(Positive | Disease) / p(Positive)
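We can verify numerically that this rearranged form lands on the same answer as the direct 25/28 calculation. This is a quick sketch using the probabilities from our table:

```r
p_disease <- 30/73
p_positive_given_disease <- 25/30
p_positive <- 28/73

# Bayes Theorem form vs. direct calculation from the table
bayes_result <- p_disease * p_positive_given_disease / p_positive
direct_result <- 25/28

isTRUE(all.equal(bayes_result, direct_result))  # TRUE
```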
Putting it all together, it looks like this:

p(Disease | Positive) * p(Positive) = p(Positive | Disease) * p(Disease)

p(Disease | Positive) = p(Positive | Disease) * p(Disease) / p(Positive)

p(Disease | Positive) = (25/30 * 30/73) / (28/73) = (25/73) / (28/73) = 25/28 ≈ 89.3%
So, all we’ve done is take two joint probabilities and use some algebra to rearrange the terms, get to the conditional probability we were interested in, p(Disease | Positive), and end up with Bayes Theorem.
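To close the loop, the whole calculation can be wrapped in a small helper function. This is my own sketch, not something from a package; the function name and arguments are hypothetical, and the denominator expands p(B) via the law of total probability, just as our earlier code did:

```r
# Hypothetical helper: p(A | B) via Bayes Theorem, with the denominator
# expanded using the law of total probability:
# p(B) = p(A) * p(B | A) + p(not A) * p(B | not A)
bayes <- function(p_a, p_b_given_a, p_b_given_not_a) {
  (p_a * p_b_given_a) /
    (p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a)
}

# Our example: p(Disease | Positive)
bayes(p_a = 30/73, p_b_given_a = 25/30, p_b_given_not_a = 3/43)  # ~0.893
```

Because 1 - 30/73 = 43/73, this reproduces the same denominator we built by hand, and the result matches the 25/28 we computed straight from the table.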