Category Archives: Sports Science

Force Decks – Force Plate Shiny Dashboard

Last week, two of the data scientists at Vald Performance, Josh Ruddy and Nick Murray, put out a free online tutorial on how to create force plate reports using R with data from their Force Decks software.

It was a nice tutorial giving an overview of some of the power behind ggplot2 and the suite of packages that make up the tidyverse. Since they made the data available (in the link above), I decided to pull it down and put together a quick shiny app for those who might be interested in extending the report into an interactive web app.

This isn’t the first time I’ve built a shiny app for the blog using force plate data. Interested readers might want to check out my post from a year ago, where I built a shiny interactive report for force-velocity profiling.

You can watch a short preview of the end product in the video link below, and the screenshots beneath the link show a static view of what the final shiny app will look like.

A few key features:

  1. The app always defaults to the most recent testing day on the testDay tab.
  2. The user can select the position group at the top, and that position group will be maintained across all tabs. For example, if you select Forwards, then Forwards will still be selected when you switch between tabs one and two.
  3. The time series plots on the Player Time Series tab are built with plotly, so they are interactive, allowing the user to hover over each test session and see the change from week to week in the tooltip. When the change exceeds the meaningful change, the point turns red. Finally, because it is plotly, the user can slice out specific dates to look at (as you can see me do in the video example), which comes in handy when there are a large number of tests over time.

All code and data is accessible through my GitHub page.

vald_shiny_app

Loading and preparing the data

  • I load the data in using read.csv() and file.choose(), so navigate to wherever you have the data on your computer and select it.
  • There is some light cleaning to change the date into a date variable. Additionally, there were no player positions in the original data set, so I made some up and joined those in.

### packages ------------------------------------------------------------------
library(tidyverse)
library(lubridate)
library(psych)
library(shiny)
library(plotly)

theme_set(theme_light())

### load & clean data ---------------------------------------------------------
cmj <- read.csv(file.choose(), header = TRUE) %>%
  janitor::clean_names() %>%
  mutate(date = dmy(date))

player_positions <- data.frame(name = unique(cmj$name),
                               position = c(rep("Forwards", times = 15),
                                            rep("Mids", times = 15),
                                            rep("Backs", times = 15)))

# join position data with jump data
cmj <- cmj %>%
  inner_join(player_positions)


Determining Typical Error and Meaningful Change

  • In this example, I’ll just pretend that the first 2 sessions represent our test-retest data and work from there.
  • The typical error of measurement (TEM) was calculated as the standard deviation of the differences between test 1 and test 2, divided by the square root of 2.
  • For the meaningful change, instead of using 0.2 (the commonly used smallest worthwhile change multiplier), I decided to use a moderate change (0.6), since 0.2 is such a small fraction of the between-subject SD.
  • For more info on these two values, see my blog post from last week, where I covered them using Python, along with a paper Anthony Turner and colleagues wrote.
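In equation form, the two standards computed in the code below are:

TEM = SD(test 2 − test 1) / √2
moderate change = 0.6 × SD(all test 1 and test 2 values)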

change_standards <- cmj %>%
  group_by(name) %>%
  mutate(test_id = row_number()) %>%
  filter(test_id < 3) %>%
  select(name, test_id, rel_con_peak_power) %>%
  pivot_wider(names_from = test_id,
              names_prefix = "test_",
              values_from = rel_con_peak_power) %>%
  mutate(diff = test_2 - test_1) %>%
  ungroup() %>%
  summarize(TEM = sd(diff) / sqrt(2),
            moderate_change = 0.6 * sd(c(test_1, test_2)))

Building the Shiny App

  • In the user interface, I first create my sidebar panel, allowing the user to select the position group of interest. You’ll notice that this sidebar panel is not within the tab panels, which is why it stands alone and allows us to select a position group that will be retained across all tabs.
  • Next, I set up 2 tabs. Notice that in the first tab (testDay) I include a select input to allow the user to select the date of interest. In the selected argument, I tell shiny to always select max(cmj$date) so that the most recent session is always shown to the user.
  • The server is pretty straightforward. I’ve added comments indicating where each tab’s data is built. Basically, it just takes the user-specified information, performs simple data filtering, and then builds ggplot2 charts to provide us with the relevant information.
  • On the testDay plot, we use the meaningful change to shade the region around 0 in grey, and we use the TEM around the athlete’s observed performance on a given day to indicate the amount of error that we might expect for the test.
  • On the Player Time Series plot, we have the athlete’s average line and ±1 SD lines to accompany their data, with points changing color when the week-to-week change exceeds our meaningful change.
### Shiny App -----------------------------------------------------------------------------

## Set up user interface

ui <- fluidPage(
  
  ## set title of the app
  titlePanel("Team CMJ Analysis"),
  
  ## create a selection bar for position group that works across all tabs
  sidebarPanel(
    selectInput(inputId = "position",
                label = "Select Position Group:",
                choices = unique(cmj$position),
                selected = "Backs",
                multiple = FALSE),
    width = 2
  ),
  
  ## set up 2 tabs: One for team daily analysis and one for player time series
  tabsetPanel(
    
    tabPanel(title = "testDay",
             
             selectInput(inputId = "date",
                         label = "Select Date:",
                         choices = unique(cmj$date)[-1],
                         selected = max(cmj$date),
                         multiple = FALSE),
             
             mainPanel(plotOutput(outputId = "day_plt", width = "100%", height = "650px"),
                       width = 12)),
    
    tabPanel(title = "Player Time Series",
             
             mainPanel(plotlyOutput(outputId = "player_plt", width = "100%", height = "700px"),
                       width = 12))
  )
  
)


server <- function(input, output){
  
  ##### Day plot tab ####
  ## day plot data
  day_dat <- reactive({
    
    d <- cmj %>%
      group_by(name) %>%
      mutate(change_power = rel_con_peak_power - lag(rel_con_peak_power)) %>%
      filter(date == input$date,
             position == input$position)
    
    d
    
  })
  
  ## day plot
  output$day_plt <- renderPlot({ day_dat() %>%
      ggplot(aes(x = reorder(name, change_power), y = change_power)) +
      geom_rect(aes(ymin = -change_standards$moderate_change, ymax = change_standards$moderate_change),
                xmin = 0,
                xmax = Inf,
                fill = "light grey",
                alpha = 0.6) +
      geom_hline(yintercept = 0) +
      geom_point(size = 4) +
      geom_errorbar(aes(ymin = change_power - change_standards$TEM, ymax = change_power + change_standards$TEM),
                    width = 0.2,
                    size = 1.2) +
      theme(axis.text.x = element_text(angle = 60, vjust = 1, hjust = 1),
            axis.text = element_text(size = 16, face = "bold"),
            axis.title = element_text(size = 18, face = "bold"),
            plot.title = element_text(size = 22)) +
      labs(x = NULL,
           y = "Weekly Change",
           title = "Week-to-Week Change in Realtive Concentric Peak Power")
    
  })
  
  ##### Player plot tab ####
  ## player plot data
  
  player_dat <- reactive({
    
    d <- cmj %>%
      group_by(name) %>%
      mutate(avg = mean(rel_con_peak_power),
             sd = sd(rel_con_peak_power),
             change = rel_con_peak_power - lag(rel_con_peak_power),
             change_flag = ifelse(change >= change_standards$moderate_change | change <= -change_standards$moderate_change, "Flag", "No Flag")) %>%
      filter(position == input$position)
    
    d
  })
  
  ## player plot
  output$player_plt <- renderPlotly({
    
    plt <- player_dat() %>%
      ggplot(aes(x = date, y = rel_con_peak_power, label = change)) +
      geom_rect(aes(ymin = avg - sd, ymax = avg + sd),
                xmin = 0,
                xmax = Inf,
                fill = "light grey",
                alpha = 0.6) +
      geom_hline(aes(yintercept = avg - sd),
                 color = "black",
                 linetype = "dashed",
                 size = 1.2) +
      geom_hline(aes(yintercept = avg + sd),
                 color = "black",
                 linetype = "dashed",
                 size = 1.2) +
      geom_hline(aes(yintercept = avg), size = 1) +
      geom_line(size = 1) +
      geom_point(shape = 21,
                 size = 3,
                 aes(fill = change_flag)) +
      facet_wrap(~name) +
      scale_fill_manual(values = c("red", "black", "black")) +
      theme(axis.text = element_text(size = 13, face = "bold"),
            axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
            plot.title = element_text(size = 18),
            strip.background = element_rect(fill = "black"),
            strip.text = element_text(size = 13, face = "bold"),
            legend.position = "none") +
      labs(x = NULL,
           y = NULL,
           title = "Relative Concentric Peak Power")
    
    ggplotly(plt)
    
  })
  
  
}



shinyApp(ui, server)

Accessing Garmin Data For Building Your Own Running Report

In our most recent screencast (TidyX 34), Ellis Hughes and I discussed how to create your own Garmin running report from the raw data collected on your watch.

You might be asking yourself, “Where did they get the raw data from?”

Therefore, I figured I’d put together a short tutorial on accessing raw Garmin data from your account.

  1. First log into your Garmin Connect account. Go HERE.
  2. Once you are signed in, you’ll see a dashboard.
  3. Click the Activities drop down and select All Activities to see your workouts.
  4. Click on the activity that you want to obtain raw data for.
  5. Once inside the activity, click the gear in the upper right of the screen and select Export to TCX.

That’s it! Now you have the raw data for that session and you are ready to create your own running reports!
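If you want to pull that TCX file into R, one option is the {trackeR} package, which includes a TCX reader. Below is a minimal sketch (assuming {trackeR} is installed; "activity.tcx" is a placeholder for whatever file name you exported):

# sketch: read an exported Garmin TCX file into R
# assumes the {trackeR} package; "activity.tcx" is a placeholder path
library(trackeR)

run_raw <- readTCX(file = "activity.tcx")
head(run_raw)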

Acute:Chronic Workload & Our Research

Some research that a few colleagues and I have worked on for the past year (and discussed for far longer than that), regarding our critiques of the acute:chronic workload model for sports injury, has recently been published.

It was a pleasure to collaborate with this group of researchers and I learned a lot throughout the process and hopefully others will learn a lot when they read our work.

Below are the papers that I’ve been a part of:

  1. Bornn L, Ward P, Norman D. (2019). Training schedule confounds the relationship between Acute:Chronic Workload Ratio and Injury. A causal analysis in professional soccer and American football. Sloan Sports Analytics Conference Paper.
  2. Impellizzeri F, Woodcock S, McCall A, Ward P, Coutts AJ. (2020). The acute-chronic workload ratio-injury figure and its ‘sweet spot’ are flawed. SportRxiv Preprints.
  3. Impellizzeri FM, Ward P, Coutts AJ, Bornn L, McCall A. (2020). Training Load and Injury Part 1: The devil is in the detail – Challenges to applying the current research in the training load and injury field. Journal of Orthopedic and Sports Physical Therapy, 50(10): 577-584.
  4. Impellizzeri FM, Ward P, Coutts AJ, Bornn L, McCall A. (2020). Training Load and Injury Part 2: Questionable research practices hijack the truth and mislead well-intentioned clinicians. Journal of Orthopedic and Sports Physical Therapy, 50(10): 577-584.
  5. Impellizzeri FM, McCall A, Ward P, Bornn L, Coutts AC. (2020). Training load and its role in injury prevention, Part 2: Conceptual and Methodologic Pitfalls. Journal of Athletic Training, 55(9): 893-901.

Many will argue and say, “Who cares? What’s the big deal if there are issues with this research? People are using it and it is making them think about training load and it is increasing the conversations about training load within sports teams.”

I understand this argument to a point. Having been in pro sport for 7 years now, I can say that anything which increases conversations about training loads, how players are (or are not) adapting, and the potential role this all plays in non-contact injury and game day performance is incredibly useful. That being said, to make decisions we need to have good/accurate measures. Simply doing something for the sake of increasing the potential for conversation is silly to me. It is the same argument that gets made for wellness questionnaires (which I have also found little utility for in the practical environment).

When we measure something, it means we are assigning a level of value to it. There is some amount of weighting we apply to that measurement within our decision-making process, even if we believe that collecting the metric is solely for the purpose of increasing the opportunity to have a conversation with a player or coach. In the back of our minds we are still thinking, “Jeez, but his acute-chronic workload ratio was 2.2 today” or “Gosh, I don’t know. He did put an 8 down (out of 10) for soreness this morning”.

Of course, challenging these ideas doesn’t mean we sit on our hands and do nothing. Taking simple training load measures (GPS, accelerometers, heart rate, etc.), applying basic logic about reasonably safe week-to-week training load increases, doing some basic analysis to understand injury risks and rates within your sport (how they differ by position, age, etc.), and identifying players that might be at higher risk of injury to begin with (i.e., higher than the baseline risk) so that you can take a more conservative approach with their progression in training can go a long way. Doing something simple like that, doing it well, and creating an easy way to report said information to the coach and player can help increase the chance for more meaningful conversation without using measures that might otherwise give a flawed sense of certainty around injury risk.

Regardless of our work, people will use what they want to use and what they are (or have been) most comfortable with in practice. However, that shouldn’t deter us from challenging our processes, critiquing methodologies, and trying to better understand sport, training, physiology, and athlete health and wellness.

R Tips & Tricks: Building a Shiny Training Load Dashboard

In TidyX Episode 19 we discussed a way of building a dashboard with the {formattable} package. The dashboard included both a data table and a small visual using {sparkline}. Such a table is great when you need to make a report for a presentation, but there are times when you might want something more interactive and flowing. In the sports science setting, this often comes in the form of evaluating athlete training loads. As such, I decided to spin up a quick {shiny} web app to show how easy it is to create whatever you want without sinking a ton of money into an athlete management system.

The code is available on my GITHUB page.

Packages & Custom Functions

Before doing anything, always start by loading the packages you will need for your work. In this tutorial, we will be using the {tidyverse} and {shiny} packages, so be sure to install them if you haven’t already. I also like to set my plot theme to classic so that I get rid of grid lines in the {ggplot2} figures that I create.
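As a sketch, that setup looks like this:

# packages & plot theme
library(tidyverse)
library(shiny)

# the classic theme removes the default grid lines
theme_set(theme_classic())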

Finally, I also wrote a custom function for calculating a z-score. This will come in handy when we go to visualize our data.

# custom function for calculating z-score
z_score <- function(x){
  z = (x - mean(x, na.rm = T)) / sd(x, na.rm = T)
  return(z)
}


Data

Next, we simulate a bunch of fake data for a basketball team so that we have something to build our {shiny} app against. We will simulate total weekly training loads for 10 athletes across three different positions (fwd, guard, center) for a 3 week pre-season and a 12 week in-season.


set.seed(55)
athlete <- rep(LETTERS[1:10], each = 15)
position <- rep(c("fwd", "fwd", "fwd", "fwd", "guard", "guard", "guard", "center", "center", "center"), each = 15)
week <- rep(c("pre_1", "pre_2", "pre_3", 1:12), times = 10)
training_load <- round(rnorm(n = length(athlete), mean = 1500, sd = 350), 1)

df <- data.frame(athlete, position, week, training_load)

# flag any week that is more than 1 SD above or below the athlete's own average
df <- df %>%
  group_by(athlete) %>%
  mutate(flag = case_when(z_score(training_load) > 1 ~ "high",
                          z_score(training_load) < -1 ~ "low",
                          TRUE ~ "normal")) %>%
  ungroup()

df$flag <- factor(df$flag, levels = c("high", "normal", "low"))
df$week <- factor(df$week, levels = c("pre_1", "pre_2", "pre_3", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"))

df %>% head()

The first few rows of the data look like this:

Shiny App

There are two components to building a {shiny} app:

1) The user interface, which defines the input that the user specifies on the web app.
2) The server, which reacts to what the user does and then produces the output that the user sees.

I define the user interface as tl_ui (training load user interface). I start by creating a fluid page (since the user is going to be defining what they want) and set the parameters of what the user can control. Here, I provide them with the ability to select the position group of interest and then the week(s) of training they would like to see in their table and plot.

The server, which I call tl_server (training load server), is a little more involved, as we need it to take the inputs from the user interface, get the correct data from our stored data frame, and then produce an output. Below you can see the code, and I’ll try to articulate a few of the main things that are taking place.

1) First, I get the data for the visualization the user wants. I’m using our simulated data, and I calculate the z-score for each individual athlete using the custom function I wrote. After that, I create my flag (±1 SD from the individual athlete’s mean) and then finally filter the data down to the inputs provided in the user interface. This last part is critical: if I filter before applying my z-score and flags, then the mean and SD used to calculate the z-score will only be relative to the data that has been filtered down. That probably isn’t what we want. Rather, we want to calculate our z-score using all data, season-to-date. So we retain everything, perform our calculation, and then filter the data.

2) After getting the data, I build my plot using {ggplot2}.

3) Then I create another data set specific to the weeks of training selected by the user. This time, we retain the raw (non-standardized) values for each athlete, as the user may want to see raw values in a table. Notice that I also pivot the data using the pivot_wider() function, as it is currently stored in long format, with each row representing a new week for the athlete. Instead, I’d like the user to see this data across their screen, with each week representing a new column. As the user selects new weeks of data to visualize, our server will tell {shiny} to populate the next column for that athlete.

4) Once I have the data I need, I simply render the table.

Those two components (the user interface and the server) complete the building of our {shiny} app!

Finally, the last step is to run the {shiny} app using the shinyApp() function, which takes two arguments: the name of your user interface (tl_ui) and the name of your server (tl_server). This will open a shiny app in a new window within RStudio that you can then open in your browser.
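To make that structure concrete, here is a minimal sketch of what the tl_ui/tl_server pairing could look like (the input names, plot, and table details here are illustrative; see my GITHUB page for the full version):

## user interface: select a position group and week(s) of training
tl_ui <- fluidPage(
  sidebarPanel(
    selectInput(inputId = "position",
                label = "Position Group:",
                choices = unique(df$position)),
    selectInput(inputId = "week",
                label = "Training Week(s):",
                choices = levels(df$week),
                multiple = TRUE)
  ),
  mainPanel(
    plotOutput(outputId = "tl_plot"),
    tableOutput(outputId = "tl_table")
  )
)

## server: calculate z-scores on ALL of the data first, then filter to the user's inputs
tl_server <- function(input, output){
  
  tl_dat <- reactive({
    df %>%
      group_by(athlete) %>%
      mutate(training_load_z = z_score(training_load)) %>%
      ungroup() %>%
      filter(position == input$position,
             week %in% input$week)
  })
  
  ## plot of standardized weekly loads, with flagged weeks colored
  output$tl_plot <- renderPlot({
    tl_dat() %>%
      ggplot(aes(x = week, y = training_load_z, fill = flag)) +
      geom_col() +
      facet_wrap(~athlete)
  })
  
  ## raw weekly loads, pivoted wide so each selected week becomes a column
  output$tl_table <- renderTable({
    tl_dat() %>%
      select(athlete, week, training_load) %>%
      pivot_wider(names_from = week,
                  values_from = training_load)
  })
}

shinyApp(tl_ui, tl_server)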

Visually, you can see what the app looks like below. If you want to see it in action, aside from getting my code and running it yourself, I put together this short video >> R Tips & Tricks – Training Load Dashboard

Conclusion

As you can tell, it is pretty easy to build and customize your own {shiny} app. I’ll admit, the syntax for {shiny} can get a bit frustrating, but once you get the hang of it, like most things, it isn’t that bad. Also, unless you are doing really complex tasks, most of the syntax for something as simple as a training load dashboard won’t be too challenging. Finally, such a quick and easy approach not only allows you to customize things exactly as you want them, but it also saves you the money of having to purchase an expensive athlete management system.

As always, you can obtain the code for this tutorial on my GITHUB page.

R Tips & Tricks: Force-Velocity-Power Profile Graphs in R Shiny

A colleague of mine was asking how he could produce plots of force-velocity-power profiles for his coaches based on their Gymaware testing. Rather than producing static plots, it is sometimes easier to build out a nice shiny app so that the coaches can interact with it, or so the practitioner can quickly change between players or position groups when giving a presentation to the staff.

All code and data is available at my GITHUB page (R Script and Data).

This tutorial will cover:

1) Building a polynomial regression to represent the average team trend
2) Iterative approaches to building static plots
3) Iterative approaches to building Shiny web apps

This tutorial has a number of different iterations of plots and web apps, so working through the R code on your own is advised so you can see how each step is performed.

Data

After loading the two required packages, {tidyverse} and {shiny}, we load in the data set and see that it consists of Force, Power, and Velocity values across 5 different external loads for 3 different athletes:

If you want to use the R script to produce reports for yourself going forward, just ensure that your data has the above columns, named the same way (since R is case-sensitive). If you have data with different column names, your two choices are: (1) change my code to match the column names of your data; or (2) once you pull the data into R, rename your columns to match mine, as sketched below.
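For option (2), a sketch might look like this (the original column names on the right are hypothetical; substitute whatever your export actually uses):

# rename your columns (hypothetical names on the right) to match the script
fv_profile <- fv_profile %>%
  rename(Force = force_n,
         Velocity = velocity_ms,
         Power = power_w)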

Average Trend Line (2nd Order Polynomial)

Eventually, we are going to want to compare our athletes to the team average (or the position group average, if your sport is more heterogeneous). This type of data is often modeled as a 2nd order polynomial regression. Thus, we will build this type of regression to predict both Velocity and Power from Force. Once I have these two regressions built, I can create a data frame that consists of a sequence of Force values (from the minimum to the maximum observed in the team’s data) and predict the velocity and power at each unit of force.

fit_velo <- lm(Velocity ~ poly(Force, 2), data = fv_profile)
fit_power <- lm(Power ~ poly(Force, 2), data = fv_profile)

avg_tbl <- data.frame(Force = seq(from = min(fv_profile$Force),
                                  to = max(fv_profile$Force),
                                  by = 0.5))

avg_tbl$Velocity_Avg <- predict(fit_velo, newdata = avg_tbl)
avg_tbl$Power_Avg <- predict(fit_power, newdata = avg_tbl)
colnames(avg_tbl)[1] <- "Force_Grp"


Static Plots

I’ll walk through a few iterations of static plots. See the GITHUB code to walk through how these were produced.

1) All athletes with an average team trend line

This plot gives us a sense of the trend in Velocity as it relates to increasing amounts of force. Below you will see one version with a solid trend line and one with a dashed trend line. Feel free to use whichever one you prefer.

2) Team trend for Velocity and Power

The way this type of data is commonly visualized in the sports science literature is a bit tricky in R because it requires a dual y-axis. To obtain this, within my ggplot call I shrink Power down to a scale more similar to Velocity (the main y-axis) by dividing it by 1000. Then, when I call the second y-axis with the sec_axis() function, I multiply Power by 1000 to put it back on its normal scale.
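As a minimal sketch of that trick (the smoothing and styling here are illustrative; see the GITHUB script for the exact plot):

# dual y-axis: Power is divided by 1000 inside the plot,
# and the secondary axis multiplies by 1000 to restore its scale
fv_profile %>%
  ggplot(aes(x = Force)) +
  geom_smooth(aes(y = Velocity),
              method = "lm",
              formula = y ~ poly(x, 2),
              se = FALSE) +
  geom_smooth(aes(y = Power / 1000),
              method = "lm",
              formula = y ~ poly(x, 2),
              se = FALSE,
              linetype = "dashed") +
  scale_y_continuous(name = "Velocity",
                     sec.axis = sec_axis(~ . * 1000, name = "Power"))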

3) Accounting for individuals

The above plots look at the entire team (in this case, only 3 athletes). However, we may want to look at the individuals more explicitly. As such, we build four plots to show individual differences:

1) All athletes all on the same plot with their corresponding individualized trend lines (NOTE: if you have a lot of athletes, this plot can get pretty busy and ultimately become useless).
2) Plot each individual as a facet to look at the athletes separately.
3) Create the same plot as #2 but add in the group average trend line (which we created using our 2nd order polynomial regression) to allow us to compare each athlete to the group.
4) Plot each individual as a facet with velocity and power on separate y-axes.

Shiny App Development

The still figures above are nice, but making things interactive is much more useful. I have four {shiny} app iterations that we can go through. Again, following the R code while reading this will help you understand what is going on under the hood. Additionally, you’ll want to run these in R so that R can open up the webpage on your computer and allow you to play with the app. Below are just still shots of each app iteration.

1) Version 1: Independent Player Plots

This version allows you to select one player at a time.
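As a minimal sketch of this first version (assuming an Athlete column identifies each player; the full script on my GITHUB page may differ):

# Version 1 sketch: select one athlete and plot their force-velocity profile
ui <- fluidPage(
  selectInput(inputId = "athlete",
              label = "Select Athlete:",
              choices = unique(fv_profile$Athlete),
              multiple = FALSE),
  plotOutput(outputId = "fv_plt")
)

server <- function(input, output){
  output$fv_plt <- renderPlot({
    fv_profile %>%
      filter(Athlete == input$athlete) %>%
      ggplot(aes(x = Force, y = Velocity)) +
      geom_point(size = 3) +
      geom_smooth(method = "lm",
                  formula = y ~ poly(x, 2),
                  se = FALSE)
  })
}

shinyApp(ui, server)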

2) Version 2: Add Players to Facets

This version lets you select as many players as you want, and it will add them in as facets. This is a useful app if you are presenting to the staff and want to select or de-select several players to show how they compare to each other.

3) Version 3: Same as Version 2, but with Power added to the plot on a second y-axis

4) Version 4: Combine all plot details

Version 4 combines everything from this tutorial into one single web app. You can select the force-velocity-power profile for individual athletes (added to the plot as facets) and see the team average trend line (added to the plot as a dashed line) for velocity and power, allowing you to make comparisons between each player and to the group average.

Conclusion

{tidyverse} makes it incredibly easy to manipulate data and quickly iterate plots to our liking, while {shiny} offers an easy way for us to turn our plots into an interactive webpage.

Again, for the code and data see my GITHUB page (R Script and Data).