A student recently asked me if I could show them why Bayes' Theorem looks the way it does. If we are trying to determine the probability of A given B, how did we arrive at the theorem in the title?

Let’s create some data and see if we can sort it out. We will simulate a 2×2 table, similar to one we might find in sports medicine journals when looking at some type of test (positive or negative) and some type of outcome (disease or no disease).

## Simulated 2x2 table: test result (rows) by disease status (columns)
dat_tbl <- data.frame(
  test = c("positive", "negative", "total"),
  disease = c(25, 5, 30),
  no_disease = c(3, 40, 43),
  total = c(28, 45, 73)
)
dat_tbl

A logical question we’d like to answer here is, *“What is the probability of disease given a positive test?”* Written in probability notation, we are asking for **p(Disease | Positive)**.

Using the data in the table, we can quickly compute this as 25 people who were positive and had the disease divided by 28 total positive tests. **25/28 = 89.3%**
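As a quick sanity check, we can pull those counts straight out of the table instead of typing them by hand. This is just a sketch; the `positive_row` and `p_d_given_p` names are mine, for illustration only.

```r
# Rebuild the 2x2 table from above so this snippet stands on its own
dat_tbl <- data.frame(
  test = c("positive", "negative", "total"),
  disease = c(25, 5, 30),
  no_disease = c(3, 40, 43),
  total = c(28, 45, 73)
)

# Grab the "positive" row and compute p(Disease | Positive)
positive_row <- dat_tbl[dat_tbl$test == "positive", ]
p_d_given_p <- positive_row$disease / positive_row$total
p_d_given_p   # 25/28, about 0.893
```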

Of course, we could also compute this using Bayes' Theorem:

**p(Disease | Positive) = p(Disease) * p(Positive | Disease) / p(Positive)**

We store the necessary values in R objects and then compute the result.

### What we want to know: p(Disease | Positive)
# p(Disease | Positive) = p(Disease) * p(Positive | Disease) / p(Positive)
# p(A | B) = p(A) * p(B | A) / p(B)
p_disease <- 30/73
p_positive_given_disease <- 25/30
p_positive <- 28/73
p_no_disease <- 43/73
p_positive_given_no_disease <- 3/43

# The denominator is p(Positive) expanded with the law of total probability:
# p(Positive) = p(Disease) * p(Positive | Disease) + p(No Disease) * p(Positive | No Disease)
p_disease_given_positive <- (p_disease * p_positive_given_disease) /
  (p_disease * p_positive_given_disease + p_no_disease * p_positive_given_no_disease)
p_disease_given_positive

This checks out. We get the exact same result as when we did 25/28.

**Okay, how did we get here? Why does it work out like that?**

The math works out because we start with two different joint probabilities, p(A ∩ B) and p(B ∩ A). Or, in our case, p(Disease ∩ Positive) and p(Positive ∩ Disease). Formally, we can write these as follows:

**p(Positive ∩ Disease) = p(Positive | Disease) * p(Disease)**

**p(Disease ∩ Positive) = p(Disease | Positive) * p(Positive)**

We’ve already stored the necessary probabilities in R objects, above. But here is what they both look like using our 2×2 table. First I’ll calculate each one with the counts directly from the table and then calculate it with the R objects that we stored. You’ll see the answers are the same.

## Joint Probability 1: p(Positive ∩ Disease) = p(Positive | Disease) * p(Disease)
25/30 * 30/73
p_positive_given_disease * p_disease

## Joint Probability 2: p(Disease ∩ Positive) = p(Disease | Positive) * p(Positive)
25/28 * 28/73
p_disease_given_positive * p_positive
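Notice what both products reduce to: the joint probability is just the upper-left cell of the table over the grand total, 25 out of 73. A minimal sketch of that collapse:

```r
# Both factorizations collapse to the same cell count over the grand total
25 / 73               # p(Positive ∩ Disease) read directly from the 2x2 table
(25/30) * (30/73)     # p(Positive | Disease) * p(Disease)
(25/28) * (28/73)     # p(Disease | Positive) * p(Positive)
```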

**Well, would you look at that!** The two joint probabilities are equal to each other. We can formally test that they are equal by setting up a logical comparison in R.

## These two joint probabilities are equal!
# p(Positive ∩ Disease) = p(Disease ∩ Positive)
(p_positive_given_disease * p_disease) == (p_disease_given_positive * p_positive)

# Note: == can be fooled by floating point rounding; all.equal() is safer
isTRUE(all.equal(p_positive_given_disease * p_disease,
                 p_disease_given_positive * p_positive))

So, if they are equal, what we are saying is this:

**p(Disease | Positive) * p(Positive) = p(Positive | Disease) * p(Disease)**

Now, with some algebra, we can divide both sides of the equation by **p(Positive)** and we are left with Bayes' Theorem for our problem:

**p(Disease | Positive) = p(Disease) * p(Positive | Disease) / p(Positive)**

Putting it all together, it looks like this:

**p(Disease ∩ Positive) = p(Positive ∩ Disease)**

**p(Disease | Positive) * p(Positive) = p(Positive | Disease) * p(Disease)**

**p(Disease | Positive) = p(Disease) * p(Positive | Disease) / p(Positive)**

So, all we’ve done is take two joint probabilities and use some algebra to rearrange the terms into the conditional probability we were interested in, **p(Disease | Positive)**, and we end up with Bayes' Theorem.
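To close the loop, here is the same rearrangement written as a small general-purpose function. The name `bayes_rule` is my own, not a standard R function, and this is just a sketch of the formula we derived.

```r
# Bayes' Theorem for any events A and B:
# p(A | B) = p(A) * p(B | A) / p(B)
# (bayes_rule is a hypothetical helper name, for illustration only)
bayes_rule <- function(p_a, p_b_given_a, p_b) {
  p_a * p_b_given_a / p_b
}

# Our example: p(Disease | Positive)
bayes_rule(p_a = 30/73, p_b_given_a = 25/30, p_b = 28/73)   # 25/28, about 0.893
```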