Category Archives: Sports Analytics

If the examples we see in textbooks don’t represent the real world, what about the sports science papers we read?

I enjoyed this short article by Andrew Gelman on the blog he keeps with several of his colleagues. The main point, which I agree 100% with, is that the examples in our stats and data analysis textbooks never seem to match what we see in the real world. The examples always seem to work! The data is clean and looks perfectly manicured for the analysis. I get it! The idea is to convey the concept of how different analytical approaches work. The rub is that once you get to the real world and look at your data you end up being like, “Uh. Wait….what is this? What do I do now?!”

The blog got me thinking about something else, though. Something that really frustrates me. If the examples we see in textbooks don’t reflect the data problems we face in the real world, what about the examples we read about in applied sport science research? How much do those examples reflect what we see in the real world?

At the risk of upsetting some colleagues, I’ll go ahead and say it:

I’m not convinced that the research we read in publications completely represents the real world either!

How can that be? This is applied science! Isn’t the real world THE research?

Well, yes and no. Yes, the data was collected in the real world, with real athletes, and in a sport setting. But often, reading a paper, looking at the aim and the conclusions, and then parsing through the methods section to see how they handled the data leaves me scratching my head.

My entire day revolves around looking at data, and I can tell you: the real world is very messy.

  • How things get collected
  • How things get saved
  • How things get loaded
  • How things get logged

There are potential hiccups all along the way, no matter how stringent you are in trying to keep sound data collection practices!

In research, the problem is that the data is often touched up in some manner to create an analysis sufficient for publication. Missing values need to be handled a certain way (sometimes those rows get dropped, sometimes values get imputed), class imbalance can be an issue, errant values and outliers from technology flub-ups are a very real thing, data entry issues arise, etc. These things are all problematic and, if not identified prior to analysis, can be a major issue with the findings. I do believe that most people recognize these problems and would agree that they are very real issues. However, it is less about knowing that the problems exist and more about figuring out what to do about them. Often, the methods section glosses over these details (I get it, word counts for journals can be a pain in the butt) and simply produces a result that, on paper at least, seems overly optimistic. As I read through the results section, without details about data processing, I frequently say to myself, “No way. There is no way this effect is as real as they report. I can’t reproduce this without knowing how they cleaned up their data to observe this effect.”

Maybe someone should post a paper about how crappy their data is in the applied setting? Maybe we should just be more transparent about the data cleaning processes we go through so that we aren’t overly bullish on our findings and more realistic about the things we can say with confidence in the applied setting?

Does anyone else feel this way? Maybe I’m wrong and I’m being pessimistic, and this isn’t as big of an issue as I believe it to be? Is the data we see in publication truly representative of the real world?

Bayesian Updating of Reference Ranges for Serial Measurements

Introduction

The collection of serial measurements on athletes across a season (or multiple seasons) is one of the more common types of data being generated in the applied sport science environment. The question that coaches and practitioners often have is, “Is this player outside of their ‘normal range’?”

The best approach for establishing a reference range of ‘normal’ values is a frequently discussed topic in sport science. One common strategy is to use z-scores and represent the reference range as 1 standard deviation above or below the mean (Figure A) or, for practitioners who might have a difficult time understanding standardized scores, plot the raw values and set the reference range 1 standard deviation above or below the raw mean (Figure B). Of course, the mean and standard deviation are then computed from all prior values. As such, if the athletes go through a training phase with substantially higher values than other phases (e.g., training camp), it could skew your reference ranges. To alleviate this issue, some choose to use a rolling mean and standard deviation, representing the normal range of values relative to more recent training sessions (Figure C).
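As a rough sketch of the rolling approach (Figure C), assuming a hypothetical data frame named athlete_data with one value per session, ordered by date (the 10-session window is arbitrary):

## rolling 10-session reference range (sketch; athlete_data is hypothetical)
library(tidyverse)
library(zoo)

athlete_data %>%
  mutate(roll_mean = zoo::rollapplyr(value, width = 10, FUN = mean, fill = NA),
         roll_sd = zoo::rollapplyr(value, width = 10, FUN = sd, fill = NA),
         lower = roll_mean - roll_sd,
         upper = roll_mean + roll_sd)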

A problem with the approaches above is that they require a number of training sessions before a mean and standard deviation can be determined for the individual athlete. One solution to this issue is to base our initial normal reference ranges on prior knowledge from collecting data on players in previous seasons (or prior knowledge from research papers, if we don’t have data of our own yet). This type of Bayesian updating approach has been applied in WADA’s drug testing practices [1]. More recently, Hecksteden et al. used this approach to evaluate the CK levels of team-sport athletes in both fatigued and non-fatigued states [2].

The mathematics of the approach is presented in the paper but might look intimidating to those not used to reading equations in this form.
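For reference, the updating boils down to the standard conjugate-normal rules (written here in my own notation to match the R functions later in this post, with x as the newly observed log value and n as the number of tests observed so far):

$$\mu_{post} = \frac{\sigma_{RM}^2 \, \mu_{prior} + \tau^2 \, x}{\sigma_{RM}^2 + \tau^2} \qquad\qquad \sigma_{post}^2 = \frac{1}{n / \sigma_{RM}^2 + 1 / \tau^2}$$

After each test, the posterior mean and standard deviation become the prior for the next test.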

The authors provided a nice Excel sheet where you can input your own data and get the updated reference ranges. However, the sheet is protected, which doesn’t afford the opportunity of seeing how the underlying equations work, and you can’t alter the sheet to make it appropriate for your data (for example, the sheet automatically log transforms the raw data). Thus, I’ve decided to code the analysis out, both in Excel and R, to help practitioners looking to adopt this approach.

Setting Priors

To apply this type of analysis, we first need to establish prior values for three parameters: the prior mean (mu), the prior standard deviation (tau), and the prior repeated-measures standard deviation (sigmaRM). These values represent our current knowledge of the variable we are measuring before seeing any new data. As new data is collected, we update these priors to get an individual (posterior) estimate for the athlete. I’ll use the priors set by Hecksteden and colleagues for the CK levels of male athletes:

  • mu = 5.527
  • tau = 0.661
  • sigmaRM = 0.504

Once we have established our prior parameters, we are ready to update them, using the math equations above, as new data comes in.

Bayesian Updating in Excel

The Excel sheet is available on my GitHub page. It looks like this:

All of the heavy lifting occurs in the two columns under the header Bayesian Updating (Log Scale). The equation in the first row (Test 1) is different from the others below it because it requires the prior information to get going. After that first test, the updated data become the prior for the next test, and this continues for all tests going forward. You can download the Excel sheet and see how the equations work, so I won’t go through them here. Instead, I’ll show them more clearly in the R script, below.

Bayesian Updating in R

We first need to convert the math equations provided in the paper (posted above) into R code. Rather than leaving things in mathematical notation, I’ll plug in the variables in plain English. To be clear, here are the definitions for the variables used in the code below:

  • prior_mu – the prior mean (mu), on the log scale
  • prior_sd – the prior between-athlete standard deviation (tau)
  • prior_repeated_measure_sd – the prior repeated-measures, within-athlete standard deviation (sigmaRM)
  • obs_value – the newly observed (log-transformed) test value
  • test_num – the number of the current test

Now that we know the variables we need for each equation, we can begin the process of updating our reference ranges.

First create a data set of the test observations and their log values. This will be the same data we observed in our Excel sheet:
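The exact values from the sheet aren’t reproduced here, so here is a minimal sketch with made-up CK values (swap in your own observations; the rest of the code runs the same):

## hypothetical raw CK values (U/L); replace with the observations from your sheet
library(tidyverse)

df <- tibble(test = 1:5,
             value = c(345, 229, 288, 512, 268)) %>%
  mutate(ln_value = log(value))

df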

Then we set our priors (in log format):

## priors
prior_mu <- 5.527
prior_sd <- 0.661
prior_repeated_measure_sd <- 0.504

 

We will start by seeing how the updating works for the mean and standard deviation parameters after the first test. To do this, we will create a function for each parameter (mean and standard deviation) that updates the priors with the observed values based on the above equations:

 

## Posterior mean: variance-weighted combination of the prior mean and the new observation
posterior_mu <- function(prior_mu, prior_sd, prior_repeated_measure_sd, obs_value){
  
  numerator <- prior_repeated_measure_sd^2 * prior_mu + prior_sd^2 * obs_value
  denominator <- prior_repeated_measure_sd^2 + prior_sd^2
  
  post_mu <- numerator / denominator
  return(post_mu)
  
}

## Posterior SD: combine the precision from the repeated measures with the prior precision
## (test_num - 1 + 1 is kept as written to mirror the paper; it simplifies to test_num)
posterior_sd <- function(prior_repeated_measure_sd, prior_sd, test_num){
  
  post_var <- 1 / ((test_num - 1 + 1) * 1/prior_repeated_measure_sd^2 + 1/prior_sd^2) 
  post_sd <- sqrt(post_var)
  return(post_sd)
  
}

After running the functions on the observations of our first test, our updated mean and standard deviation are:
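With the example data from above, the calls look like this:

## updated parameters after the first test
posterior_mu(prior_mu, prior_sd, prior_repeated_measure_sd, obs_value = df$ln_value[1])
posterior_sd(prior_repeated_measure_sd, prior_sd, test_num = 1)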


Notice that we obtain the same values that we see following test_1 in our Excel workbook. We can also calculate 95% confidence intervals and take the exponent (since the data is on a log scale) to get the individual athlete’s updated reference range on the raw scale:
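For instance (again a sketch built on the example data):

## 95% reference range on the raw scale after the first test
test1_mu <- posterior_mu(prior_mu, prior_sd, prior_repeated_measure_sd, obs_value = df$ln_value[1])
test1_sd <- posterior_sd(prior_repeated_measure_sd, prior_sd, test_num = 1)

exp(c(low95 = test1_mu - 1.96 * test1_sd, high95 = test1_mu + 1.96 * test1_sd))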

Again, these results confirm the values we see in our Excel workbook.

That’s cool and all, but we need to be able to iteratively update the data set as new data comes in. Let’s write a for() loop!

First, we create a new column in the data that provides us with the updated standard deviation after observing the results in each test. This is a necessary first step as we will use this value to then update the mean value.

 

## Calculate the updated SD based on sample size
df2 <- df %>%
  mutate(bayes_sd = sqrt(1 / ((test - 1 + 1) * 1 / prior_repeated_measure_sd^2 + 1 / prior_sd^2))) 

df2

Next, we need a for() loop. This is a bit tricky because test_1 is updating based solely on the priors, while all other tests (test_2 to test_N) update based on the mean and standard deviation in the row above them. Thus, we need our for() loop to look back at the previous row once those values are calculated. This sort of thing is easy to set up in Excel, but in R (or Python) we need to think about how to code our row indexes. I covered this sort of iterative row computing in two previous articles HERE and HERE.

Within the loop we first set up vectors for the prior variance (sd^2), the denominator in our equation, and the log transformed observations from our data set. Then, we calculate the updated posterior for the mean (mu) on each pass through the loop, each time using the value preceding it in the vector, [i – 1], to allow us to iteratively update the data.

Once we run the loop, we add the results to our data set (removing the first observation in the vector since that was the original prior before seeing any data):

 

# Create a vector to store results
N <- length(df2$ln_value) + 1
bayes_mu <- c(prior_mu, rep(NA, N - 1))


## For loop
for(i in 2:N){
  
  ## Set up vectors for the variance, denominator, and newly observed values
  prior_var <- c(prior_sd^2, df2$bayes_sd^2)
  denominator <- prior_repeated_measure_sd^2 + prior_var
  vals <- df2$ln_value
  
  ## calculate bayesian updated mu
  bayes_mu[i] <- (prior_repeated_measure_sd^2 * bayes_mu[i-1] + prior_var[i-1] * vals[i-1]) / denominator[i-1]
    
}

df2$bayes_mean <- bayes_mu[-1]
df2

The two columns, bayes_sd and bayes_mean, contain our updated prior values, and they are the exact same results we obtained in our Excel workbook.

To use these updated parameters for creating individual athlete reference ranges, we calculate the 95% Confidence Intervals:

NOTE: I added a row at the start of data frame to establish the priors, before seeing the data, so that they could also be plotted as part of the reference ranges.

### Confidence Intervals
first_prior <- data.frame(test = 0, value = NA, ln_value = NA, bayes_sd = prior_sd, bayes_mean = prior_mu)

df2 <- df2 %>%
  bind_rows(first_prior) %>%
  arrange(test)

## Exponentiate back to get the reference range
df2$low95 <- exp(df2$bayes_mean - 1.96*df2$bayes_sd)
df2$high95 <- exp(df2$bayes_mean + 1.96*df2$bayes_sd)
df2

Finally, we plot the observations along with the continually updated reference ranges. You can clearly see how large the normal range is before seeing any data (test_0) and then how quickly this range begins to shrink once we start observing data from the individual.

 

To access the R code and the excel workbook please visit my GitHub page.

References

  1. Sottas PE, et al. (2007). Bayesian detection of abnormal values in longitudinal biomarkers with application to T/E ratio. Biostatistics; 8(2): 285-296.
  2. Hecksteden A, et al. (2017). A new method to individualize monitoring of muscle recovery in athletes. Int J Sport Phys Perf; 12: 1137-1142.

Force Decks – Force Plate Shiny Dashboard

Last week, two of the data scientists at Vald Performance, Josh Ruddy and Nick Murray, put out a free online tutorial on how to create force plate reports using R with data from their Force Decks software.

It was a nice tutorial giving an overview of some of the power behind ggplot2 and the suite of packages that come with the tidyverse. Since they made the data available (in the link above), I decided to pull it down and put together a quick shiny app for those who might be interested in extending the report to an interactive web app.

This isn’t the first time I’ve built a shiny app for the blog using force plate data. Interested readers might want to check out my post from a year ago where I built a shiny interactive report for force-velocity profiling.

You can watch a short preview of the end product in the video link below, and the screenshots beneath it show a static view of what the final shiny app will look like.

A few key features:

  1. The app always defaults to the most recent testing day on the testDay tab.
  2. The user can select the position group at the top and that position group will be maintained across all tabs. For example, if you select Forwards, when you switch between tabs one and two, forwards will always be there.
  3. The time series plots on the Player Time Series tab are done using plotly, so they are interactive, allowing the user to hover over each test session and see the change from week to week in the tooltip. When the change exceeds the meaningful change, the point turns red. Finally, because it is plotly, the user can slice out specific dates to look at (as you can see me do in the video example), which comes in handy when there are a large number of tests over time.

All code and data is accessible through my GitHub page.


Loading and preparing the data

  • I load the data in using read.csv() and file.choose(), so navigate to wherever you have the data on your computer and select it.
  • There is some light cleaning to change the date into a date variable. Additionally, there were no player positions in the original data set, so I just made some up and joined them in.

### packages ------------------------------------------------------------------
library(tidyverse)
library(lubridate)
library(psych)
library(shiny)
library(plotly)

theme_set(theme_light())

### load & clean data ---------------------------------------------------------
cmj <- read.csv(file.choose(), header = TRUE) %>%
  janitor::clean_names() %>%
  mutate(date = dmy(date))

player_positions <- data.frame(name = unique(cmj$name),
                               position = c(rep("Forwards", times = 15),
                                            rep("Mids", times = 15),
                                            rep("Backs", times = 15)))

# join position data with jump data
cmj <- cmj %>%
  inner_join(player_positions)

 

Determining Typical Error and Meaningful Change

  • In this example, I’ll just pretend as if the first 2 sessions represented our test-retest data and I’ll work from there.
  • The typical error of measurement (TEM) was calculated as the standard deviation of the differences between tests 1 and 2, divided by the square root of 2.
  • For the meaningful change, instead of using 0.2 (the commonly used smallest worthwhile change multiplier), I decided to use a moderate change (0.6), since 0.2 is such a small fraction of the between-subject SD.
  • For info on these two values, I covered them in a blog post last week using Python and a paper Anthony Turner and colleagues wrote.

change_standards <- cmj %>%
  group_by(name) %>%
  mutate(test_id = row_number()) %>%
  filter(test_id < 3) %>%
  select(name, test_id, rel_con_peak_power) %>%
  pivot_wider(names_from = test_id,
              names_prefix = "test_",
              values_from = rel_con_peak_power) %>%
  mutate(diff = test_2 - test_1) %>%
  ungroup() %>%
  summarize(TEM = sd(diff) / sqrt(2),
            moderate_change = 0.6 * sd(c(test_1, test_2)))

Building the Shiny App

  • In the user interface, I first create my sidebar panel, allowing the user to select the position group of interest. You’ll notice that this sidebar panel is not within the tab panels, which is why it stands alone and allows us to select a position group that will be retained across all tabs.
  • Next, I set up 2 tabs. Notice that in the first tab (testDay) I include a select input, to allow the user to select the date of interest. In the selected argument I tell shiny to always select the max(cmj$date) so that the most recent session is always shown to the user.
  • The server is pretty straightforward. I commented where each tab’s data is built. Basically, it just takes the user-specified information, performs simple data filtering, and then builds ggplot2 charts to provide the relevant information.
  • On the testDay plot, we use the meaningful change to shade the region around 0 in grey and we use the TEM around the athlete’s observed performance on a given day to specify the amount of error that we might expect for the test.
  • On the Player Time Series plot we have the athlete’s average line and ±1 SD lines to accompany their data, with points changing color when the week-to-week change exceeds our meaningful change.
### Shiny App -----------------------------------------------------------------------------

## Set up user interface

ui <- fluidPage(
  
  ## set title of the app
  titlePanel("Team CMJ Analysis"),
  
  ## create a selection bar for position group that works across all tabs
  sidebarPanel(
    selectInput(inputId = "position",
                label = "Select Position Group:",
                choices = unique(cmj$position),
                selected = "Backs",
                multiple = FALSE),
    width = 2
  ),
  
  ## set up 2 tabs: One for team daily analysis and one for player time series
  tabsetPanel(
    
    tabPanel(title = "testDay",
             
             selectInput(inputId = "date",
                         label = "Select Date:",
                         choices = unique(cmj$date)[-1],
                         selected = max(cmj$date),
                         multiple = FALSE),
             
             mainPanel(plotOutput(outputId = "day_plt", width = "100%", height = "650px"),
                       width = 12)),
    
    tabPanel(title = "Player Time Series",
             
             mainPanel(plotlyOutput(outputId = "player_plt", width = "100%", height = "700px"),
                       width = 12))
  )
  
)


server <- function(input, output){
  
  ##### Day plot tab ####
  ## day plot data
  day_dat <- reactive({
    
    d <- cmj %>%
      group_by(name) %>%
      mutate(change_power = rel_con_peak_power - lag(rel_con_peak_power)) %>%
      filter(date == input$date,
             position == input$position)
    
    d
    
  })
  
  ## day plot
  output$day_plt <- renderPlot({ day_dat() %>%
      ggplot(aes(x = reorder(name, change_power), y = change_power)) +
      geom_rect(aes(ymin = -change_standards$moderate_change, ymax = change_standards$moderate_change),
                xmin = 0,
                xmax = Inf,
                fill = "light grey",
                alpha = 0.6) +
      geom_hline(yintercept = 0) +
      geom_point(size = 4) +
      geom_errorbar(aes(ymin = change_power - change_standards$TEM, ymax = change_power + change_standards$TEM),
                    width = 0.2,
                    size = 1.2) +
      theme(axis.text.x = element_text(angle = 60, vjust = 1, hjust = 1),
            axis.text = element_text(size = 16, face = "bold"),
            axis.title = element_text(size = 18, face = "bold"),
            plot.title = element_text(size = 22)) +
      labs(x = NULL,
           y = "Weekly Change",
           title = "Week-to-Week Change in Realtive Concentric Peak Power")
    
  })
  
  ##### Player plot tab ####
  ## player plot data
  
  player_dat <- reactive({
    
    d <- cmj %>%
      group_by(name) %>%
      mutate(avg = mean(rel_con_peak_power),
             sd = sd(rel_con_peak_power),
             change = rel_con_peak_power - lag(rel_con_peak_power),
             change_flag = ifelse(change >= change_standards$moderate_change | change <= -change_standards$moderate_change, "Flag", "No Flag")) %>%
      filter(position == input$position)
    
    d
  })
  
  ## player plot
  output$player_plt <- renderPlotly({
    
    plt <- player_dat() %>%
      ggplot(aes(x = date, y = rel_con_peak_power, label = change)) +
      geom_rect(aes(ymin = avg - sd, ymax = avg + sd),
                xmin = 0,
                xmax = Inf,
                fill = "light grey",
                alpha = 0.6) +
      geom_hline(aes(yintercept = avg - sd),
                 color = "black",
                 linetype = "dashed",
                 size = 1.2) +
      geom_hline(aes(yintercept = avg + sd),
                 color = "black",
                 linetype = "dashed",
                 size = 1.2) +
      geom_hline(aes(yintercept = avg), size = 1) +
      geom_line(size = 1) +
      geom_point(shape = 21,
                 size = 3,
                 aes(fill = change_flag)) +
      facet_wrap(~name) +
      scale_fill_manual(values = c("red", "black", "black")) +
      theme(axis.text = element_text(size = 13, face = "bold"),
            axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
            plot.title = element_text(size = 18),
            strip.background = element_rect(fill = "black"),
            strip.text = element_text(size = 13, face = "bold"),
            legend.position = "none") +
      labs(x = NULL,
           y = NULL,
           title = "Relative Concentric Peak Power")
    
    ggplotly(plt)
    
  })
  
  
}



shinyApp(ui, server)

TidyTuesday — Powerlifting Performance & Age

TidyTuesday is a really neat project where every week a new data set is provided (for free) and anyone can download the data and share their findings. The basic idea is to get people to trade ideas on how to arrange, summarize, and visualize data within R (primarily using the suite of data science packages that make up the tidyverse).

I’ve enjoyed seeing what people share on Twitter and my friend Ellis Hughes suggested that I join in the fun. As such, I found a data set from an earlier week that was sports related (to keep the analysis relevant with the theme of my blog).

The data set comes from the TidyTuesday on 10/8/2019 (free to download HERE). Briefly, the data set contains outcomes from International Powerlifting Federation (IPF) competitions from 1973 up through 2019. Each row represents an individual athlete’s best lift in the squat, bench press, and deadlift for a given competition. In total, the data set contains 38,244 rows and 15,996 unique lifters. (NOTE: There is a much larger data set linked on the GitHub page, but I did not use that one.)

I’ll use the Data Analysis Template I discussed in a previous blog article. The only difference between the template from the prior article and the approach I’ll take here is that I have no prior knowledge of the data set. The template works well when we have a specific question to answer as it helps to guide the process from data collection to analysis. However, in this case, as is sometimes common in the real world, people may provide you with a data set without a specific question. As such, some level of data exploration is required to understand the data set and what type of questions may be interesting. Therefore, I’ll begin with just familiarizing myself with the data before developing a question I may want to answer.

Loading Data & Cleaning Data

  • Read in the data from the TidyTuesday GitHub page.
  • Notice that I added a cleaning step when reading in the data: I filter out any age class of 5-12 and I also remove any NA values in the age column (which happened because sometimes exact age wasn’t recorded). I added this step when importing the data, after I had worked through my analysis, because I felt it was better to do this right away and save space in the code.
  • In the second step, I ordered the data set by athlete name and date of competition.
  • Finally, I created a long format of the data frame (since it is originally in a wide format) to assist with building data visualizations, and I removed any NAs present in the data set (e.g., if a lifter bombs out on their squat in a competition then they have no value for the squat).


df <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-10-08/ipf_lifts.csv") %>%
  filter(age_class != "5-12", !is.na(age))

# order the data by lifter and date
df <- df %>%
  arrange(name, date)

# create a long format of the data
df_long <- df %>%
  reshape2::melt(., id = c("name", "date", "age", "age_class", "weight_class_kg", "sex"), measure.vars = c("best3squat_kg", "best3bench_kg", "best3deadlift_kg")) %>%
  na.omit()

Data Exploration

Since I don’t really know anything about the data set provided, it is hard to have a question to answer. Thus, I created some basic plots to help orient myself to the data we are working with.

First, I wanted to see the athletes who have competed in the most competitions in this data set:
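A quick way to get at this (each row in the data is one lifter-competition, so a simple count works; this is a sketch of the summary behind the plot):

df %>%
  count(name, sort = TRUE) %>%
  slice_head(n = 10)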

I know that lifters in the IPF have a choice of wearing different types of lifting equipment so I wanted to see what sort of competition gear the athletes in this data set wore:

I was curious about the age class and the actual age at which athletes, on average, achieve their best lift:

We can also look at this by male and female:

Finally, I wanted to explore the distribution of powerlifting totals between men and women:
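A sketch of that comparison, assuming the tidyverse is loaded and computing each total as the sum of the three best lifts (my assumption, since I work from the three individual lift columns):

df %>%
  mutate(total = best3squat_kg + best3bench_kg + best3deadlift_kg) %>%
  filter(!is.na(total)) %>%
  ggplot(aes(x = total, fill = sex)) +
  geom_density(alpha = 0.5) +
  labs(x = "Powerlifting Total (kg)", y = "Density")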

Research Question

After exploring the data a little bit, some things stand out:

1) The data set contains primarily lifters wearing single-ply lifting gear.

2) The boxplots use the ‘age_class’ variable, so everyone within an ‘age_class’ is treated the same and the age bins appear to be rather large (e.g., 24 – 34). I prefer not to think of age data this way since such large groupings can have a lot of variability within them.

3) Looking at the dot plots, which reflect age as a continuous variable, athletes tend to peak in all three of the lifts around their early 30’s.

4) The trend for peaking in performance seems to be consistent between men and women (which is interesting given that I would have suspected women to peak later, since they might be less inclined to take up serious weight training until later in life, whereas males tend to start lifting around their high school years).

5) The distribution of powerlifting totals appears to be relatively normally distributed for both men and women, with more variability in the distribution for men than women.

The beauty of graphing your data is that it often reveals underlying patterns that help you get a sense for what is going on. It is instances like this where a statistical model can serve as a gut check to confirm what you can already clearly see.

In looking at the data, the two questions I’ll explore are:

1) At what age do powerlifters peak for the 3 competition lifts?

2) How many competitions do lifters compete in until they finally total elite?

I’ll keep these rather simple and brief, as a means of sharing some ideas. These models can (and should) be more thorough and account for things like sex (in the aging curve model, for example) and other variables that may be relevant to how powerlifters progress across their career. What is presented below is just a simple jumping-off point for where I might begin when working with data like this to answer a question before extending the model (for example, creating a mixed model to account for individual lifters).

Models

Powerlifter Aging Curve

To develop a simple aging curve model I built a polynomial regression for each of the 3 lifts (again, to keep things simple, I did not include sex in these models). Before building the models, we noticed from our data exploration that most of the lifters in this data set are single-ply lifters, so I’m going to limit the analysis to them, since changing competition gear can influence performance (I’m not going to get into the philosophical debate about which one is “better” than the other; I’ll leave that to the lifters). Additionally, since I’m interested in how lifters perform across their career and when they tend to “peak”, I’m going to limit my analysis to only those lifters who have competed in at least 10 competitions. After cleaning up the data specific to the above inclusion criteria, we are left with 6,169 rows of data and 426 unique athletes.


# Data clean up for aging curve model
sply <- df %>%
  filter(equipment == "Single-ply") %>%
  group_by(name) %>%
  filter(n() >= 10)

nrow(sply)
nrow(distinct(sply, name))

# 6169
# 426 athletes

 

Now that the data is in the format we’d like, we can build some simple models for each of the three lifts:


squat_age_fit <- lm(best3squat_kg ~ age + I(age^2), data = sply)
bench_age_fit <- lm(best3bench_kg ~ age + I(age^2), data = sply)
deadlift_age_fit <- lm(best3deadlift_kg ~ age + I(age^2), data = sply)

The summary of the three models can be found on my GitHub page. Here is an example of the squat model output:

We see that the coefficient for age is positive while the coefficient for the polynomial term (age^2) is negative. This shouldn’t come as a surprise given that we observed an upside-down “U” in our plots during the data exploration phase of our analysis. We can use these two coefficients to calculate the peak age from our regression equation. I’ve written a function to do that:


## peak of a quadratic y = b0 + b1*x + b2*x^2 occurs at x = -b1 / (2 * b2)
peak_age <- function(coef1, coef2){
  -(coef1) / (2 * (coef2))
}

 

By supplying the custom function with the two coefficients (age and age^2), we can obtain the peak age from each of our models:
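For example:

peak_age(coef1 = coef(squat_age_fit)["age"], coef2 = coef(squat_age_fit)["I(age^2)"])
peak_age(coef1 = coef(bench_age_fit)["age"], coef2 = coef(bench_age_fit)["I(age^2)"])
peak_age(coef1 = coef(deadlift_age_fit)["age"], coef2 = coef(deadlift_age_fit)["I(age^2)"])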


Just as suggested in our data visualizations, the peak age is around the early to mid 30’s with the squat peaking earlier and the bench press peaking later. As an example, we can plot the actual data along with a prediction line and 95% Confidence Interval for the bench press, where the peak age is around 36 years old:
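A sketch of how such a plot can be built, letting geom_smooth() refit the quadratic with its default 95% confidence band:

sply %>%
  ggplot(aes(x = age, y = best3bench_kg)) +
  geom_point(alpha = 0.2) +
  geom_smooth(method = "lm", formula = y ~ x + I(x^2)) +
  labs(x = "Age", y = "Best Bench Press (kg)")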

Number of Competitions Until Totaling Elite

To try and answer this question I built a simple time-to-event (survival) model. In this case, the event of interest is the individual achieving an elite total, coded as 1, and any competition where they do not achieve an elite total is coded as 0. I’m only calculating time to the first elite total for each lifter, so some lifters achieve elite status while others never do (the latter are censored).

I wasn’t sure where to obtain elite total criteria, so I used the criteria I found on THIS WEBSITE. However, I’m not certain that these criteria carry over to single-ply lifters (i.e., perhaps they are only specific to raw lifters?). I also wasn’t able to locate elite total criteria for female lifters, so the analysis below is specific to male lifters. Finally, not all of the weight classes observed in the data were available on the referenced website. So, this analysis is far from perfect given the data, but it will suffice for a simple example.

After adding in the elite total criteria and removing the athletes who were not in a weight class covered by the criteria presented on the website, I was left with 6,074 male lifters, of which 22% (1,335) achieved an elite total during their career:

In looking at the number of competitions until a lifter totals elite (plot below), it appears that many of them achieve that status in their first competition. This makes me skeptical of the data, as I feel like most lifters would require a number of competitions to achieve an elite total. This may be a function of either (a) the subset of data provided by TidyTuesday or (b) my using the wrong elite total criteria for single-ply lifters.

The data was fit with a Kaplan-Meier curve in order to create a simple model and a nice visual of the data. Below is the summary table produced from the model, followed by the time-to-event curve (the event being an elite total).
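For those following along in code, the fit itself is only a couple of lines with the survival package (a sketch; elite_df, comps_to_elite, and elite are hypothetical names for the prepared data, the competition count, and the 0/1 event flag):

library(survival)

## Kaplan-Meier fit: time = competitions until first elite total (or censoring),
## event = 1 if an elite total was achieved, 0 if censored
km_fit <- survfit(Surv(comps_to_elite, elite) ~ 1, data = elite_df)
summary(km_fit)

plot(km_fit, xlab = "Number of Competitions", ylab = "Proportion Yet to Total Elite")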

 

Conclusions

The TidyTuesday project is a great way to get access to data sets and share ideas. This was a fun one to do given that it is specific to sport, and I had the opportunity to try a few different models while also showing different ways of graphing the data. Finally, there are a bunch of different coding approaches I used to clean up the data, which you can check out on my GitHub page.

Data Analysis Template in R Markdown & Jupyter Notebook

The nice thing about working on a team with other analysts, working as part of a research group, or working on your PhD is the ability to share analysis with other colleagues, get feedback, and learn new ways of thinking about things.

Interestingly, when I’ve asked colleagues at some teams how they share their analysis with their group, they often say that “people do their analysis and just present the results”. I think this is a big miss in terms of transparency in the research process, sharing so that others can learn or provide constructive feedback, and walking through the steps you went through (data retrieval, data cleaning, analysis, model testing, etc.) to ensure that things make sense to the group before being shared with the end user.

For the PhD student, a more streamlined approach to the entire analysis can help them talk through what they did with their advisors, ensure that all the correct steps were taken during the analysis, and have greater confidence about what their data is and is not saying (which can really come in handy when it is time to defend the thesis!). When I was doing my PhD I would often try to put all my steps into a PowerPoint presentation to walk through with my supervisors. I never liked that, however, because it always felt clumsy; I was never really getting to the guts of the analysis as much as I was sharing the outcomes and talking through what I did and why. A template that allows for a clear presentation would have made things much easier for both myself and my supervisors.

In my last post, I used R Markdown to create a report that allows the sport scientist to share some basic data analysis with the strength and conditioning staff and other coaches. As I said in that post, R Markdown is a wonderful resource for creating reports where you can hide your code and simply show visualizations of the data and model outputs. But what if we don’t want to hide our code?! In this sense, R Markdown is extremely useful for setting up a data analysis template that allows you to walk through all the steps in your project and share the results with your colleagues or PhD supervisors. Additionally, you can keep RStudio open when presenting your findings and address any changes or suggestions people may have in real time, before “knitting” the markdown file into the final HTML or PDF document. This last part allows the analysis to come to life, letting you make direct changes and immediately show how they impact the outcome of the analysis!

Data Analysis Templates

There are a number of different data analysis frameworks one could follow. Two that immediately come to mind are the Cross Industry Standard Process for Data Mining (CRISP-DM) and the Problem, Plan, Data, Analysis, and Conclusion (PPDAC) Cycle.

Although they come from different industries (CRISP-DM from the business world and PPDAC from the statistics education world), there is considerable overlap, and both aim to provide the analyst with a clear path to answering their research question.

The objectives of each phase within these two frameworks are shown below. Briefly, CRISP-DM moves through Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment, while PPDAC moves through Problem, Plan, Data, Analysis, and Conclusion.

 

As you can see, the end goal of the analysis is different between the two frameworks: CRISP-DM being targeted at deploying a model specific to business use cases and PPDAC providing more of a runway for scientific publication. However, both can provide us with an appreciation for creating a systematic process around data analysis, allowing for a clear explanation of our approach when discussing with colleagues or PhD supervisors.

In an attempt to create something more generic and less specific to a certain industry or field, I came up with my own framework:

The framework is freely available on my GitHub page in both R Markdown and Jupyter Notebook (if you prefer Python) formats. If you’d like to see what the R Markdown HTML looks like, click here >> PWard_-_Data_Analysis_Framework.

All you have to do is take the template (either R Markdown or Jupyter Notebook), delete the comments that I have under each section and fill in your own comments and your R or Python script, where applicable, to conduct your analysis. Once complete, you will have a clean looking file that details your entire approach.

I’ve made a simple example of what using the template could look like. If you are interested in seeing the result in R Markdown, CLICK HERE >> Data_Analysis_Framework_Example_–_MLB_Hitting. If you are interested in seeing the result in a Python Jupyter Notebook, CLICK HERE >> Data Analysis Framework Example — MLB Hitting (Jupyter).

All of the code and the templates for use are available on my GitHub page.