A student recently asked me if I could show them why Bayes' Theorem looks the way it does. If we are trying to determine the probability of A given B, how do we arrive at the theorem in the title?

Let’s create some data and see if we can sort it out. We will simulate a 2×2 table, similar to one we might find in sports medicine journals when looking at some type of test (positive or negative) and some type of outcome (disease or no disease).

```r
dat_tbl <- data.frame(
  test = c("positive", "negative", "total"),
  disease = c(25, 5, 30),
  no_disease = c(3, 40, 43),
  total = c(28, 45, 73)
)

dat_tbl
```

A logical question we'd like to answer here is, **"What is the probability of disease given a positive test?"** Written in probability notation, we are asking:


**p(Disease | Positive)**

Using the data in the table, we can quickly compute this as the 25 people who tested positive and had the disease divided by the 28 total positive tests: **25/28 = 89.3%**.

Of course, we could also compute this using Bayes' Theorem:

**p(Disease | Positive) = p(Disease) * p(Positive | Disease) / p(Positive)**

We can store the necessary values in R objects and then compute the result:

```r
### What we want to know: p(Disease | Positive)
# p(Disease | Positive) = p(Disease) * p(Positive | Disease) / p(Positive)
# p(A | B) = p(A) * p(B | A) / p(B)

p_disease <- 30/73
p_positive_given_disease <- 25/30
p_positive <- 28/73
p_no_disease <- 43/73
p_positive_given_no_disease <- 3/43

p_disease_given_positive <- (p_disease * p_positive_given_disease) /
  (p_disease * p_positive_given_disease + p_no_disease * p_positive_given_no_disease)

p_disease_given_positive
```

This checks out. We get the exact same result as when we did 25/28.
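As a quick sanity check, we can recompute both answers and compare them directly in R. This is a minimal sketch using only numbers already in the 2×2 table above; note that `all.equal()` compares doubles with a tolerance, which is the safer idiom for floating-point results.

```r
# Minimal sketch: recompute both answers and confirm they agree.
# The hard-coded counts (25, 28, 30, 73) come from the 2x2 table above.
p_direct <- 25 / 28                    # straight from the table
p_bayes  <- (30/73 * 25/30) / (28/73)  # via Bayes' Theorem
isTRUE(all.equal(p_direct, p_bayes))   # TRUE
```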

**Okay, how did we get here? Why does it work out like that?**

The math works out because we start with two joint probabilities, p(A n B) and p(B n A), which describe the same event. Or, in our case, p(Disease n Positive) and p(Positive n Disease). Formally, we can write these as follows:

**p(Positive n Disease) = p(Positive | Disease) * p(Disease)**

**p(Disease n Positive) = p(Disease | Positive) * p(Positive)**

We've already stored the necessary probabilities in R objects above. But here is what they both look like using our 2×2 table. First I'll calculate them with the counts directly from the table and then with the R objects that we stored. You'll see the answers are the same.

```r
## Joint Probability 1: p(Positive n Disease) = p(Positive | Disease) * p(Disease)
25/30 * 30/73
p_positive_given_disease * p_disease

## Joint Probability 2: p(Disease n Positive) = p(Disease | Positive) * p(Positive)
25/28 * 28/73
p_disease_given_positive * p_positive
```

*Well, would you look at that!* The two joint probabilities are equal to each other. We can formally test that they are equal by setting up a logical comparison in R.

```r
## These two joint probabilities are equal!
# p(Positive n Disease) = p(Disease n Positive)
(p_positive_given_disease * p_disease) == (p_disease_given_positive * p_positive)

# Note: `==` on floating-point numbers can be fragile;
# all.equal() compares with a tolerance and is the safer idiom
isTRUE(all.equal(p_positive_given_disease * p_disease,
                 p_disease_given_positive * p_positive))
```

So, if they are equal, what we are saying is this:

**p(Disease | Positive) * p(Positive) = p(Positive | Disease) * p(Disease)**

Now, with some algebra, we can divide both sides of the equation by *p(Positive)* and we are left with Bayes' Theorem for our problem:

**p(Disease | Positive) = p(Disease) * p(Positive | Disease) / p(Positive)**

Putting it all together, it looks like this:

**p(Disease | Positive) * p(Positive) = p(Positive | Disease) * p(Disease)**

**p(Disease | Positive) = p(Positive | Disease) * p(Disease) / p(Positive)**

**p(Disease | Positive) = (25/30 * 30/73) / (28/73) = 25/28 = 89.3%**

So, all we've done is take two joint probabilities and use some algebra to rearrange the terms in order to get to the conditional probability we were interested in, *p(Disease | Positive)*, and we end up with Bayes' Theorem.
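To close the loop, the whole calculation can be wrapped in a small helper function. This is a sketch of my own (`bayes_posterior` is a made-up name, not from any package); it assumes you supply the prior p(A), the likelihood p(B | A), and p(B | not A), and it expands the denominator p(B) using the law of total probability, just as we did in the code above.

```r
# Hypothetical helper applying Bayes' Theorem:
# p(A | B) = p(A) * p(B | A) / p(B),
# where p(B) = p(A) * p(B | A) + p(not A) * p(B | not A)
bayes_posterior <- function(p_a, p_b_given_a, p_b_given_not_a) {
  p_b <- p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a
  p_a * p_b_given_a / p_b
}

# Our 2x2 example: prior = 30/73, p(Positive | Disease) = 25/30,
# p(Positive | No Disease) = 3/43
bayes_posterior(30/73, 25/30, 3/43)  # 25/28, or about 0.893
```

With the counts from our table, the denominator works out to 25/73 + 3/73 = 28/73, so the function reproduces the 25/28 we computed directly.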