A favorite paper of mine is Gardner and Altman's 1986 article arguing that confidence intervals and estimation are a more useful way of reporting data than a dichotomous p-value:
Gardner, MJ. Altman, DG. (1986). Confidence intervals rather than P values: Estimation rather than hypothesis testing. Brit Med J; 292:746-750.
In this paper, Gardner and Altman make three main points for either moving away from p-values or supplementing them in statistical reporting:
- Research often focuses on null hypothesis significance testing with the goal being to identify statistically significant results.
- However, we are often interested in the magnitude of the factor of interest.
- Given that research deals with samples from a broader population, readers are interested not only in the observed magnitude of the estimate but also in its variability and the plausible range of values for the population. This variability can be quantified using confidence intervals.
Aside from the paper providing a clear explanation of the issue at hand, their appendix offers the equations to calculate confidence intervals for means, differences in means, proportions, and differences in proportions. Thus, I decided to compile the appendix in an R script for those looking to code confidence intervals (and not have to rely on pre-built functions).
All of this code is available on my GITHUB page.
Confidence Intervals for Means and Differences
Single Sample
- Obtain the mean
- Calculate the standard error (SE) of the mean as SE = SD/sqrt(N)
- Multiply the SE by the t-critical value for the desired level of confidence and the degrees of freedom (DF) of the single sample, DF = N – 1
The confidence intervals are calculated as:
Low = mean – t_{crit} * SE
High = mean + t_{crit} * SE
Example
We collect data on 30 participants on a special test of strength and observe a mean of 40 and standard deviation of 10. We want to calculate the 90% confidence interval.
The steps above can easily be computed in R. First, we write down the known info (the data that is provided to us). We then calculate the standard error and degrees of freedom (N – 1). To obtain the critical value for our desired level of confidence, we use the t-distribution that is specific to the degrees of freedom of our data.
```r
# Known information from the sample
N <- 30
avg <- 40
SD <- 10

# Standard error and degrees of freedom
SE <- SD / sqrt(N)
DF <- N - 1

# t-critical value for the desired level of confidence
level_of_interest <- 0.90
tail_value <- (1 - level_of_interest) / 2
t_crit <- abs(qt(tail_value, df = DF))
t_crit

# Confidence interval
low90 <- round(avg - t_crit * SE, 1)
high90 <- round(avg + t_crit * SE, 1)
cat("The 90% Confidence Interval is:", low90, "to", high90)
```
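As a rough check on the single-sample steps, the same calculation can be wrapped in a small helper and compared with R's built-in `t.test()`. This is only a sketch: `ci_mean()` is a hypothetical helper name, not something from the paper, and the raw data are simulated and then rescaled so that they have exactly the mean (40) and SD (10) used in the example.

```r
# Hypothetical helper: confidence interval for a mean from summary statistics
ci_mean <- function(avg, SD, N, level = 0.95) {
  SE <- SD / sqrt(N)
  # Upper-tail quantile of the t-distribution with N - 1 degrees of freedom
  t_crit <- qt(1 - (1 - level) / 2, df = N - 1)
  c(low = avg - t_crit * SE, high = avg + t_crit * SE)
}

ci_summary <- ci_mean(avg = 40, SD = 10, N = 30, level = 0.90)

# Simulated raw data (an assumption for illustration), rescaled so that
# mean(x) is 40 and sd(x) is 10
set.seed(1)
x <- rnorm(30)
x <- 40 + 10 * (x - mean(x)) / sd(x)
ci_builtin <- t.test(x, conf.level = 0.90)$conf.int

round(ci_summary, 1)
round(as.numeric(ci_builtin), 1)
```

Both intervals should agree (about 36.9 to 43.1), which suggests the hand calculation and the pre-built function are doing the same thing under the hood.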

Two Samples
- Obtain the sample means and standard deviations for the two samples
- Pool the estimate of the standard deviation: s = sqrt(((n_1 – 1)*s^2_1 + (n_2 – 1)*s^2_2) / (n_1 + n_2 – 2))
- Calculate the SE for the difference: SE_{diff} = s * sqrt(1/n_1 + 1/n_2)
- Calculate the confidence interval, using a t-critical value with n_1 + n_2 – 2 degrees of freedom, as:
Low = (x_1 – x_2) – t_{crit} * SE_{diff}
High = (x_1 – x_2) + t_{crit} * SE_{diff}
Example
The example in the paper provides the following info:
- Blood pressure levels were measured in 100 diabetic and 100 non-diabetic men aged 40-49 years old.
- Mean systolic blood pressure was 146.4 mmHg (SD = 18.5) in diabetics and 140.4 mmHg (SD = 16.8) in non-diabetics.
Calculate the 95% CI.
```r
# Known information for the two groups
N_diabetic <- 100
N_non_diabetic <- 100
diabetic_avg <- 146.4
diabetic_sd <- 18.5
non_diabetic_avg <- 140.4
non_diabetic_sd <- 16.8

# Difference in means
group_diff <- diabetic_avg - non_diabetic_avg

# Pooled standard deviation and SE of the difference
pooled_sd <- sqrt(((N_diabetic - 1) * diabetic_sd^2 + (N_non_diabetic - 1) * non_diabetic_sd^2) / (N_diabetic + N_non_diabetic - 2))
se_diff <- pooled_sd * sqrt(1 / N_diabetic + 1 / N_non_diabetic)

# t-critical value for the desired level of confidence
level_of_interest <- 0.95
tail_value <- (1 - level_of_interest) / 2
t_crit <- abs(qt(tail_value, df = N_diabetic + N_non_diabetic - 2))
t_crit

# Confidence interval
low95 <- round(group_diff - t_crit * se_diff, 1)
high95 <- round(group_diff + t_crit * se_diff, 1)
cat("The 95% Confidence Interval is:", low95, "to", high95)
```
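If you find yourself repeating the two-sample steps, they can be collected into one function that works directly from summary statistics. The sketch below is one possible way to do it; `ci_mean_diff()` and its argument names are hypothetical, and like the pooled formula above it assumes the two groups share a common standard deviation.

```r
# Hypothetical helper: CI for a difference in means using a pooled SD
ci_mean_diff <- function(m1, s1, n1, m2, s2, n2, level = 0.95) {
  df <- n1 + n2 - 2
  pooled_sd <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df)
  se_diff <- pooled_sd * sqrt(1 / n1 + 1 / n2)
  t_crit <- qt(1 - (1 - level) / 2, df = df)
  diff <- m1 - m2
  c(diff = diff, low = diff - t_crit * se_diff, high = diff + t_crit * se_diff)
}

# Blood pressure example: diabetics vs non-diabetics
bp_ci <- ci_mean_diff(m1 = 146.4, s1 = 18.5, n1 = 100,
                      m2 = 140.4, s2 = 16.8, n2 = 100)
round(bp_ci, 1)
```

For the blood pressure data this gives a difference of 6.0 mmHg with a 95% interval of roughly 1.1 to 10.9 mmHg.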

Confidence Intervals for Proportions
Single Sample
- Obtain the proportion observed in the sample
- Calculate the SE of the proportion, SE = sqrt((p * (1-p)) / N)
- Obtain the z-critical value from a standard normal distribution for the level of confidence of interest (unlike the t-critical value used for means, the z-critical value does not depend on the sample size).
- Calculate the confidence interval:
low = p – z_{crit} * SE
high = p + z_{crit} * SE
Example
We observe a basketball player with 80 field goal attempts and a FG% of 39%. Calculate the 95% CI.
```r
# Known information
N <- 80
fg_pct <- 0.39

# Standard error of the proportion
se <- sqrt((fg_pct * (1 - fg_pct)) / N)

# z-critical value for the desired level of confidence
level_of_interest <- 0.95
tail_value <- (1 - level_of_interest) / 2
z_crit <- qnorm(p = tail_value, lower.tail = FALSE)

# Confidence interval
low95 <- round(fg_pct - z_crit * se, 3)
high95 <- round(fg_pct + z_crit * se, 3)
cat("The 95% Confidence Interval is:", low95, "to", high95)
```
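To see how the level of confidence changes the width of the interval, the single-proportion steps can be wrapped in a function and run at more than one level. This is a minimal sketch; `ci_prop()` is a hypothetical helper name and it uses the same normal approximation as the steps above.

```r
# Hypothetical helper: normal-approximation CI for a single proportion
ci_prop <- function(p, N, level = 0.95) {
  se <- sqrt(p * (1 - p) / N)
  z_crit <- qnorm(1 - (1 - level) / 2)
  c(low = p - z_crit * se, high = p + z_crit * se)
}

# Same basketball player, two levels of confidence
ci90 <- ci_prop(p = 0.39, N = 80, level = 0.90)
ci95 <- ci_prop(p = 0.39, N = 80, level = 0.95)

round(ci90, 3)
round(ci95, 3)
```

The 90% interval (roughly 0.300 to 0.480) is narrower than the 95% interval (roughly 0.283 to 0.497), since a smaller z-critical value multiplies the same SE.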

Two Samples
- Calculate the difference in proportions between the two groups
- Calculate the SE of the difference in proportions: SE_{diff} = sqrt(((p_1 * (1 – p_1)) / n_1) + ((p_2 * (1 – p_2)) / n_2))
- Calculate the z-critical value for the level of interest
- Calculate the confidence interval as:
low = (p_1 – p_2) – (z_{crit} * se_{diff})
high = (p_1 – p_2) + (z_{crit} * se_{diff})
Example of two unpaired samples
The study provides the following table of example data:
```r
data.frame(
  response = c("improvement", "no improvement", "total"),
  treatment_A = c(61, 19, 80),
  treatment_B = c(45, 35, 80)
)
```

The difference we are interested in is between the proportion who improved in treatment A and the proportion of those who improved in treatment B.
```r
# Proportion improved and sample size in each treatment
pr_A <- 61 / 80
n_A <- 80
pr_B <- 45 / 80
n_B <- 80

# Difference in proportions and its SE
diff_pr <- pr_A - pr_B
se_diff <- sqrt((pr_A * (1 - pr_A)) / n_A + (pr_B * (1 - pr_B)) / n_B)

# z-critical value for the desired level of confidence
level_of_interest <- 0.95
tail_value <- (1 - level_of_interest) / 2
z_crit <- qnorm(p = tail_value, lower.tail = FALSE)

# Confidence interval
low95 <- round(diff_pr - z_crit * se_diff, 3)
high95 <- round(diff_pr + z_crit * se_diff, 3)
cat("The 95% Confidence Interval is:", low95, "to", high95)
```
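As a sanity check on the unpaired calculation, the hand-coded interval can be compared with R's built-in `prop.test()`. The sketch below assumes that `prop.test()` with `correct = FALSE` (no continuity correction) uses the same normal-approximation formula for the difference between two groups, which is my understanding of its behavior; the variable names are illustrative.

```r
# Counts of improved subjects and group sizes from the example table
improved <- c(A = 61, B = 45)
n <- c(A = 80, B = 80)

# Manual interval, following the steps above
p <- improved / n
diff_pr <- p[["A"]] - p[["B"]]
se_diff <- sqrt(sum(p * (1 - p) / n))
z_crit <- qnorm(0.975)
manual <- c(diff_pr - z_crit * se_diff, diff_pr + z_crit * se_diff)

# Built-in interval without the continuity correction
builtin <- prop.test(x = improved, n = n, correct = FALSE)$conf.int

round(manual, 3)
round(as.numeric(builtin), 3)
```

Both should come out at about 0.057 to 0.343. Leaving `correct` at its default of `TRUE` would widen the built-in interval slightly, so the two would no longer match.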

Example for two paired samples
We can organize the data in a table like this:
```r
data.frame(
  test_1 = c("Present", "Present", "Absent", "Absent"),
  test_2 = c("Present", "Absent", "Present", "Absent"),
  number_of_subjects = c("a", "b", "c", "d")
)
```

Let’s say we tested a group of subjects for a specific disease twice in a study. At each of the two time points, a subject either has the disease (present) or does not (absent). We observe the following data:
```r
dat <- data.frame(
  test_1 = c("Present", "Present", "Absent", "Absent"),
  test_2 = c("Present", "Absent", "Present", "Absent"),
  number_of_subjects = c(10, 25, 45, 5)
)

dat

# Total number of subjects
N <- sum(dat$number_of_subjects)
```
To compare the proportion of subjects with the disease present at the first test (p_1) with the proportion with the disease present at the second test (p_2), we calculate:
p_1 = (a + b) / N
p_2 = (a + c) / N
Diff = p_1 – p_2
The SE of the difference is:
SE_{diff} = 1/N * sqrt(b + c – (b-c)^2/N)
```r
# Proportion with the disease present at each test
p1 <- (10 + 25) / N
p2 <- (10 + 45) / N

# Difference in proportions
diff_prop <- p1 - p2
diff_prop

# SE of the difference: 1/N * sqrt(b + c - (b - c)^2 / N), with b = 25 and c = 45
se_diff <- 1 / N * sqrt(25 + 45 - (25 - 45)^2 / N)

# z-critical value for the desired level of confidence
level_of_interest <- 0.95
tail_value <- (1 - level_of_interest) / 2
z_crit <- qnorm(p = tail_value, lower.tail = FALSE)

# Confidence interval
low95 <- round(diff_prop - z_crit * se_diff, 3)
high95 <- round(diff_prop + z_crit * se_diff, 3)
cat("The 95% Confidence Interval is:", low95, "to", high95)
```
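The paired calculation can also be written as a function of the four cell counts, which avoids hard-coding the numbers. This is a sketch under the formula SE_{diff} = 1/N * sqrt(b + c – (b-c)^2/N) given above; `ci_paired_prop()` and its argument names (`n_a` to `n_d`, standing for cells a to d of the table) are hypothetical.

```r
# Hypothetical helper: CI for a difference in paired proportions
# n_a: present at both tests, n_b: present at test 1 only,
# n_c: present at test 2 only, n_d: absent at both tests
ci_paired_prop <- function(n_a, n_b, n_c, n_d, level = 0.95) {
  N <- n_a + n_b + n_c + n_d
  p1 <- (n_a + n_b) / N
  p2 <- (n_a + n_c) / N
  diff_prop <- p1 - p2
  se_diff <- 1 / N * sqrt(n_b + n_c - (n_b - n_c)^2 / N)
  z_crit <- qnorm(1 - (1 - level) / 2)
  c(diff = diff_prop,
    low = diff_prop - z_crit * se_diff,
    high = diff_prop + z_crit * se_diff)
}

paired_ci <- ci_paired_prop(n_a = 10, n_b = 25, n_c = 45, n_d = 5)
round(paired_ci, 3)
```

With the counts from the example (10, 25, 45, 5) the difference is about -0.235, with a 95% interval of roughly -0.422 to -0.049. Note that only the discordant cells b and c drive the result: the difference simplifies to (b – c) / N.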

As always, all of this code is available on my GITHUB page.
And, if you’d like to read the full paper, you can find it here:
Gardner, MJ. Altman, DG. (1986). Confidence intervals rather than P values: Estimation rather than hypothesis testing. Brit Med J; 292:746-750.