
A Simple Approach to Analyzing Athlete Data in Applied Sports Science

Intro

Evaluating whether an athlete has or has not improved in some key performance indicator is critical to understanding the success of a prescribed training or rehabilitation program. In the applied setting, practitioners are faced with N = 1 decisions as they are training or rehabilitating individual athletes, each of whom is unique in their own way. As such, tests that allow practitioners to understand these individual improvements are imperative to quantifying the training process.

The analysis of athlete testing data first requires an understanding of what the test is measuring (whether it is valid or not) and the amount of noise/error within the test (whether the test is reliable or not). Tests that are overly noisy make it challenging for practitioners to reliably know whether changes exhibited by the athlete are due to real performance improvement, measurement error (e.g., issues with the test itself), or biological variation. Approaches to analyzing test-retest data to evaluate the typical error of measurement (TEM) and smallest worthwhile change (SWC) have been previously discussed by authors such as Hopkins1, Swinton2, and Turner3. Recently, my friend and colleague, Shaun McLaren, wrote a blog post on understanding statistics when interpreting individualized testing data. Such approaches are important in the applied setting, as the last thing a practitioner or clinician wants to do is report inaccurate information regarding an athlete’s current physical state to the coach or management. From a medical/return-to-play standpoint, such information is important for confirming that the athlete is making progress and meeting certain benchmarks to ensure a safe return from injury.

The analytical approaches Shaun discussed are relatively easy to perform, and interested readers can download Excel sheets that will automatically calculate these measures and only require the practitioner to provide test-retest data. My aim in this blog post is to walk through similar statistical approaches using the coding language R and build a function that will automatically calculate these metrics once the practitioner provides their data (the analysis for this blog post was built off of the methods proposed by Swinton and colleagues2, who provide similar methods in Excel on the article’s webpage).

Simulating Data

First, we need to create some data to play with. I’ll simulate two different data sets:

      • Data Set 1: Test-Retest Data
        • This data set will serve as our test-retest trial data. We will use this data set to calculate measures to get a sense for how noisy the test is and calculate measures such as TEM and SWC. For example, let’s say that this test-retest trial is something like a simple vertical jump test. We want to have the athletes perform the test, take a rest period, and then perform the test again. We will then calculate how much error there is in the test.
      • Data Set 2: Training Intervention Data
        • Once we’ve established TEM and SWC, we will simulate a second data set that represents a group of athletes performing an experimental training intervention (strength training only) and another group of athletes performing a control condition (endurance training only). We will write a function to evaluate the responses from this data to understand how successful the intervention truly was.

Test-Retest Data Simulation

Our test-retest data will be a simulation of a vertical jump test for 20 athletes.


### Load packages

library(dplyr)
library(ggplot2)
library(reshape)

### Simulate data

set.seed(2018)
subject <- LETTERS[1:20]
group <- rep(c("experimental", "control"), each = 10)
test <- c(round(rnorm(n = 10, mean = 25, sd = 4),1), round(rnorm(n = 10, mean = 25, sd = 3), 1))
retest <- c(round(rnorm(n = 10, mean = 25, sd = 5),1), round(rnorm(n = 10, mean = 24, sd = 4), 1))

reliability.data <- data.frame(subject, test, retest)
head(reliability.data)

  subject test retest
1       A 23.3   31.3
2       B 18.8   26.3
3       C 24.7   26.3
4       D 26.1   33.9
5       E 31.9   18.9
6       F 23.9   23.8

Two metrics we are interested in obtaining from the data are TEM and SWC.

  • Typical error of measurement (TEM) is calculated as the standard deviation of the difference between test-retest scores divided by the square root of 2.
    • TEM = sd(Difference) / sqrt(2)
  • Smallest worthwhile change (SWC) is calculated as the standard deviation of Test 1 multiplied by an effect size of interest. Hopkins and Batterham4 recommend this effect size to be 0.2, as 0.2 represents the “smallest worthwhile effect” according to Jacob Cohen.
    • SWC = sd(Test1 Scores) * magnitude threshold

Note on the magnitude threshold: With a very homogeneous group of athletes, the standard deviation, and the ensuing SWC, can be very small, perhaps so small that it is almost meaningless (Buchheit5). However, I encourage practitioners to determine the effect size of interest based on the magnitude of change that they feel would be meaningful to worry about or meaningful to report to a coach. This might come down to the type of test being performed or the age/experience of the athlete. I don’t think it is as easy as simply saying “0.2 is always our benchmark.” Sometimes we may want to have a larger magnitude of interest (perhaps 0.8, 1.0, or 1.2). To be consistent with the scientific literature, I’ll use 0.2 for this example; however, in the test-retest function below, I allow the practitioner to choose the magnitude threshold that is most important to them.
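Before wrapping these calculations into a function, here is what the two formulas look like applied directly to the simulated test-retest data:


### Quick check of the two formulas on the simulated data

jump.diff <- with(reliability.data, retest - test)
TEM <- sd(jump.diff) / sqrt(2)             # typical error of measurement
swc <- 0.2 * sd(reliability.data$test)     # smallest worthwhile change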


######### Test-Retest Function ######################
#####################################################

Test_Retest <- function(test1, test2, magnitude.threshold){
	
	require(dplyr)
	
	# combine the vectors into a dataset
	dataset <- data.frame(test1, test2)
	
	# calculate difference
	dataset$Diff <- with(dataset, test2-test1)
	
	# Calculate Mean & SD
	stats <- as.data.frame(dataset %>%
	summarize(PreTest.Mean = round(mean(test1, na.rm =T),2),
		PreTest.SD = round(sd(test1, na.rm = T),2),
		PostTest.Mean = round(mean(test2, na.rm = T), 2),
		PostTest.SD = round(sd(test2, na.rm = T), 2),
		Mean.Difference = round(mean(Diff, na.rm = T), 2),
		SD.Difference = round(sd(Diff, na.rm = T), 2)))
		
	# Calculate TEM
	TEM <- sd(dataset$Diff, na.rm = TRUE)/sqrt(2)
	
	# Calculate SWC
	swc <- magnitude.threshold*sd(test1, na.rm = TRUE)
	
	# Function output
	list(SummaryStats = stats, TEM = round(TEM,2), SWC = round(swc, 3))
		
}

With the function loaded, we can now supply it with the data from our simulated test-retest trial. All that is required are three inputs:

  1. A vector representing the scores for test 1.
  2. A vector representing the scores for test 2 (the re-test).
  3. The magnitude threshold of interest. (Again, in this example I’ll use 0.2 to represent the smallest worthwhile change. Feel free to change this to a different magnitude threshold, such as 0.8 or 1.2, and see how it affects the results.)

test.retest.results <- Test_Retest(test1 = reliability.data$test, test2 = reliability.data$retest, magnitude.threshold = 0.2)

test.retest.results

Looking at the output, we see that the results are returned as a list with three elements:

  1. Summary statistics of both tests and the difference between tests
  2. The TEM
  3. The SWC

This type of list format is useful if you want to call specific parts of the analysis. For example, if I need the TEM downstream in a later analysis, I can simply call it by typing:


test.retest.results$TEM
[1] 4.83

One thing we may notice from the output of our function is that the error for this test is rather large relative to the SWC. This could potentially be an issue when attempting to interpret future results for this test, given that the error is so large. In this case, we may want to go back to the drawing board with our test and try to figure out a way to minimize the test error (or potentially consider using a different test). Alternatively, using this test would mean that we need to see a rather large change in the athlete’s performance to be certain that the improvement the athlete made was “real.”
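If you want this comparison as a single number, the ratio of the two list elements makes it explicit (values well above 1 indicate the test error dwarfs the smallest change we care about):


# ratio of test error to the smallest worthwhile change
test.retest.results$TEM / test.retest.results$SWC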

Training Intervention Simulation Data

The training data that we’ll simulate will have baseline vertical jump scores and follow-up vertical jump scores at 8 weeks. Group 1 will only perform strength training while Group 2 will only perform endurance training.


### Simulate data

set.seed(2018)
subject <- LETTERS[1:20]
group <- rep(c("experimental", "control"), each = 10)
baseline <- c(round(rnorm(n = 10, mean = 24, sd = 3),1), round(rnorm(n = 10, mean = 24, sd = 3), 1))
post.intervention <- c(round(baseline[1:10] + rnorm(n = 10, mean = 8, sd = 5), 1), round(rnorm(n = 10, mean = 27, sd = 5), 1))

study.data <- data.frame(subject, group, baseline, post.intervention)
head(study.data)

  subject        group baseline post.intervention
1       A experimental     22.9              35.1
2       B experimental     20.1              31.0
3       C experimental     21.4              30.3
4       D experimental     27.2              33.0
5       E experimental     23.6              31.2
6       F experimental     27.1              30.5

tail(study.data)

   subject   group baseline post.intervention
15       O control     23.4              30.4
16       P control     25.4              24.5
17       Q control     21.2              17.7
18       R control     32.2              30.7
19       S control     25.0              26.6
20       T control     22.1              32.4

Next, we create a function called outcome, which takes the following eight inputs:

  1. A vector of baseline scores
  2. A vector of post-test scores (follow-up scores)
  3. A vector denoting which subjects belong to each of the groups
  4. A vector of subject IDs
  5. The TEM established from our test-retest trial above
  6. The SWC established from our test-retest trial above
  7. The number of samples in our test-retest trial (Reliability.N = 20)
  8. The confidence interval we are interested in. For this example I’ll use 95%. However, feel free to change this value and see how it influences the results

outcome <- function(baseline.test, post.test, groups, subject.IDs, TEM, SWC, Reliability.N, Conf.Level){
	
	# Combine the vectors into a data set
	df <- data.frame(subject.IDs, groups, baseline.test, post.test)
	
	# True Baseline Score Calculation
	
	df$LowCI.baseline.true <- round(baseline.test - qt(p = (1-Conf.Level)/2, df = Reliability.N - 1, lower.tail = F)*TEM, 2)
	
	df$HighCI.baseline.true <- round(baseline.test + qt(p = (1-Conf.Level)/2, df = Reliability.N - 1, lower.tail = F)*TEM, 2)
	
	# create a difference score
	df$Pre.Post.Diff <- with(df, post.test - baseline.test)
	
	# create confidence intervals around the difference score
	df$LowCI.Diff <-  round(df$Pre.Post.Diff - qt(p = (1-Conf.Level)/2, df = Reliability.N - 1, lower.tail = F) * sqrt(2) * TEM, 2)
	df$HighCI.Diff <-  round(df$Pre.Post.Diff + qt(p = (1-Conf.Level)/2, df = Reliability.N - 1, lower.tail = F) * sqrt(2) * TEM, 2)
	
	# Summary Stats of Change
	
	Pre.Post.Summary.Stats <- df %>% 
					group_by(groups) %>%
					summarize(
						Mean = mean(Pre.Post.Diff),
						SD = sd(Pre.Post.Diff))
					
	# SD of the response
	diff <- as.data.frame(Pre.Post.Summary.Stats)
	sd.response <- sqrt(abs(diff[1,3]^2 - diff[2,3]^2))
	
	# Proportion of Response
	
	mean1 <- diff[1,2]
	mean2 <- diff[2,2]
	
	prop.response.group.1 <- ifelse(SWC > 0, 100-pnorm(q = SWC,
		mean = mean1,
		sd = sd.response)*100, pnorm(q = SWC,
		mean = mean1,
		sd = sd.response)*100)
	
	prop.response.group.2 <- ifelse(SWC > 0, 100-pnorm(q = SWC,
		mean = mean2,
		sd = sd.response)*100, pnorm(q = SWC,
		mean = mean2,
		sd = sd.response)*100)
	
	list(Data = df, 
		Outcome.Stats = Pre.Post.Summary.Stats, 
		Stdev.Response = sd.response, 
		Perct.Responders.Group1 = paste(round(prop.response.group.1, 2), "%", sep = ""),
		Perct.Responders.Group2 = paste(round(prop.response.group.2, 2), "%", sep = "")
	)
	
}

Now we are ready to use the outcome function on our simulated intervention data set.


outcome(baseline.test = study.data$baseline, 
		post.test = study.data$post.intervention, 
		groups = study.data$group,
		subject.IDs = study.data$subject, 
		TEM = 4.83, 
		SWC = 0.756,
		Reliability.N = 20, 
		Conf.Level = 0.95)

Similar to our test-retest function, the results are returned as a list. Let’s look at the results in more detail:

  • The first element of the list provides a table of our original data, except we have a few new columns. First we see that we have Low and High Confidence Interval columns (in this case, these columns represent 95% CI, since that is what I specified when I ran the function). These confidence intervals are specific to the baseline test score. They are important for us to consider because when measuring an athlete we can never be truly confident that the performance they produced is their true performance (due to a variety of factors and, in particular, biological variability). Thus, the function uses the TEM from the test-retest trial to calculate the confidence interval around the athletes’ observed baseline scores. Finally, the last three columns provide us with the post-pre score differences and 95% CI around those difference scores for each individual athlete.
  • The second element of the list gives us the summary statistics for each group based on how they performed in the trial. In this element, we can see that the experimental group (Group 1, the strength training-only group) observed a larger improvement in vertical jump height, on average, following 8 weeks of training, compared to the control group (Group 2, the endurance training-only group). TECHNICAL NOTE: R automatically sorted the two groups alphabetically. As such, even though Group 1 (the experimental group) was first in the original data set, it comes out as being “Group 2” in the output.
  • The third element is the standard deviation of individual responses. Hopkins6 suggests that this standard deviation represents the amount that the mean effect of the intervention is seen to vary between individuals. This standard deviation will be used in the fourth and fifth elements to help understand the individual responses observed within groups.
  • The fourth and fifth elements of the list display the percentage of responders to the treatment. This proportion of response is calculated by evaluating the variability in change scores from the intervention (the standard deviation of individual responses) and the specified SWC (from our test-retest trial)2. In the case of our simulated data set, we see that Group 1 (remember, this is the endurance group, since R organized the data by group alphabetically) had a lower proportion of responders than Group 2 (the strength training group). A standalone sketch of these calculations is shown below.
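To make the arithmetic behind the last three list elements explicit, here is a standalone sketch. The group summary numbers below are hypothetical placeholders rather than the simulated output; substitute the change-score statistics from your own data.


### Standalone sketch of the individual-response calculations

TEM <- 4.83        # from the test-retest trial above
swc <- 0.756       # from the test-retest trial above
rel.n <- 20        # sample size of the test-retest trial
conf.level <- 0.95

# CI around a single observed baseline score of, say, 25
t.crit <- qt(p = (1 - conf.level)/2, df = rel.n - 1, lower.tail = FALSE)
c(25 - t.crit * TEM, 25 + t.crit * TEM)

# SD of individual responses from hypothetical change-score SDs
sd.control <- 4.0          # SD of change scores, control group (hypothetical)
sd.experimental <- 6.0     # SD of change scores, experimental group (hypothetical)
sd.response <- sqrt(abs(sd.experimental^2 - sd.control^2))

# proportion of a group with a hypothetical mean change of 8 expected to exceed the SWC
prop.responders <- (1 - pnorm(q = swc, mean = 8, sd = sd.response)) * 100
round(prop.responders, 1)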

Wrapping Up

When analyzing data in the applied sport science setting, it is important to establish measures such as TEM and SWC so that you can have greater certainty that athletes are progressing and making true performance improvements. In this blog post, I showed a very simple way to analyze such data while also showing that some basic R coding can be used to produce functions that make our job easier and provide quick results (and quick results are important in the applied setting, where decisions between games need to be made in a timely fashion).

Two future considerations:

  1. The training intervention example I provided may not be terribly realistic in many applied sports settings. For example, rarely will a coach allow the staff to separate players into two groups that train in different ways. In a future blog post, I hope to provide some code for analyzing individuals when serial measurements are taken across a season.
  2. I didn’t provide any visualization of the data. Data visualization is not only critical to understanding the data you are analyzing but also important for presenting your data to coaches, managers, and other practitioners. I hope to address data visualization approaches in a future blog post.

References

  1. Hopkins WG, Marshall SW, Batterham AM, Hanin J. (2009). Progressive statistics for studies in sports medicine and exercise science. Med Sci Sports Exerc, 41(1): 3-12.
  2. Swinton PA, Hemingway BS, Saunders B, Gualano B, Dolan E. (2018). A statistical framework to interpret individual response to intervention: Paving the way for personalized nutrition and exercise prescription. Front Nutr, 5(41): 1-14.
  3. Turner A, Brazier J, Bishop C, Chavda S, Cree J, Read P. (2015). Data analysis for strength and conditioning coaches: Using excel to analyze reliability, differences, and relationships. Strength Cond J, 31(1): 76-83.
  4. Hopkins WG, Batterham AM. (2016). Error rates, decisive outcomes and publication bias with several inferential methods. Sports Med, 46(10): 1563-1573.
  5. Buchheit M. (2014). Monitoring training status with HR measures: Do all roads lead to Rome? Front Physiol, 5(73): 1-19.
  6. Hopkins WG. (2015). Individual responses made easy. J Appl Physiol, 118: 1444-1446.

Back to Blogging!

I’m generally not a fan of new year’s resolutions, but I decided to make one this year: get back to blogging.

It’s been a while (a few years, actually!), and I’m excited to put some content down on paper (um, screen). Future blog posts will be directed toward my thoughts on sports science, discussion of research I’ve published, and discussion on research from my colleagues.

I hope to keep these posts relatively short and provide some usable content (such as code for analysis).

Concurrent Training – The Effect of Intensity Distribution

Periodization and planning of training is a topic that fascinates me as I enjoy studying how good coaches structure training and develop athletes. Lots of thoughts exist regarding the best periodization strategy to use (e.g., Linear, Block, Conjugate, Vertical Integration, Undulating, Daily Undulating, Fluid, etc.).

Concurrent training is one approach to structuring a training program where multiple qualities are trained within the same session. Of course, this may present problems where one quality (e.g., strength) may interfere with another quality (e.g., aerobic training) that you are looking to also develop in that session. For more on this issue, referred to as the interference phenomenon, see THIS blog post I wrote about 4 years ago.

A new study by Varela-Sanz and colleagues evaluated the effect of concurrent training between two programs that had equivalent external loads (volume x intensity) but differed in training intensity distribution. This evaluation may provide practitioners with a better understanding of the optimal dose and intensity needed to minimize the interference phenomenon. In team sport athletes, this may be essential as training and developing multiple qualities needed for sport is crucial and the shortened offseason periods can make program planning a challenge.

Study Overview

Subjects: 35 sport science students (30 men / 5 women)
Duration: 8 weeks
Independent Variable: External training load
Dependent Variables:

  • Counter Movement Jump
  • Bench Press (7 – 10 RM was performed and used to estimate 1 RM)
  • Half Squat (7 – 10 RM was performed and used to estimate 1 RM)
  • Max Aerobic Speed (Université de Montréal Track Test)
  • Body Composition (body weight & skinfold measurements)
  • HRV
  • RPE
  • Feeling Scale
  • Training Impulse (TRIMP)

Training Groups

  • Traditional Training Group
    • N = 12
    • This group followed the exercise guidelines recommended by the American College of Sports Medicine (ACSM), which suggest performing moderate-to-vigorous intensity aerobic exercise on most days of the week.
  • Polarized Training Group
    • N =12
    • This group followed a polarized training program. Polarized training programs have been recommended for endurance athletes as a method of distributing training intensity. Despite this polarized approach, external load was matched to the Traditional Training Group.
  • Control Group
    • N = 11

Training Program

  • Training Frequency: 3x/week (Mon, Wed, Fri)
    • Monday & Friday sessions were ~120min
    • Wednesday’s session was ~60min
  • Training Set Up
    • Monday/Friday Training
      • Cardiovascular Training
      • Resistance Training
    • Wednesday Training
      • Cardiovascular Training

      [Image: detailed training program]

Results

  • No differences in total workload, RPE, TRIMP, or Feeling Scale scores were found between groups over the 8-week period.
  • The traditional training group was the only group to see a decrease in resting HR (both supine and standing) following the training program. No changes in HRV were seen for any group.
  • Both training groups saw improvements in 1RM for the bench press, half squat, and Max Aerobic Speed.
  • The polarized group saw an increase in body weight (without a change in body fat) following the 8-week training program and was still able to maintain their vertical jump abilities.

Practical Applications

I don’t know that this study moves us any closer to understanding the optimal distribution of training intensity when performing a concurrent training program. The polarized group performed easier cardiovascular training on days when they performed resistance training (Monday & Friday), and on Wednesdays they performed easy cardiovascular training followed by high intensity interval training. The traditional group performed the same training session each day, with the same intensities, for the duration of the 8-week program. Despite the differences in intensity distribution, both groups appeared to make improvements, so it is really difficult to tell which method may be more beneficial (or perhaps they really are just the same).

There are a number of things to consider when reading this study:

  • The subjects are not high-level athletes and it is possible that any form of training is going to provide a positive training effect.
  • Resistance training volume was low (they only used two exercises – Bench Press and Half Squat) so we don’t know what would happen if there were more resistance training in the program.
  • The polarized training group trained opposite qualities during their training sessions, which is interesting given that a commonly held belief amongst coaches is to try and group similar qualities together in one session rather than mix them (i.e., sprinting + heavy strength training or aerobic training + lower intensity resistance training).

Probably the most important thing that I think about with papers like this is that we need to begin to dig down into understanding individual differences. Comparing group means doesn’t really tell us how the individuals responded, nor does it allow us to make better inferences about what sort of outcome we might expect when we write a training program for our own athletes. Training is a very individualized process, and how someone responds to the program we apply to them is dependent on a number of factors – some that we might be able to measure and quantify, others that we might not be able to measure and quantify, and a few others that we might not even be aware of yet. In the process of evaluating individual differences, we may find that some athletes in each group got better, a few stayed the same, and some may have gotten worse. Without understanding these individual differences and then attempting to unpack the deeper question of “why,” it will be hard to plan individualized training programs in the future. If we can get to the bottom of how people respond to training, and we can start to go down the road of figuring out the factors that influence that response, we will start to have a better idea of the impact our training program will have for a given athlete, allowing us to make individual adjustments that may lead to more favorable outcomes.


2016 NBA Draft – How do you put together a winning team?

The 2015-2016 NBA playoffs have just begun meaning 16 fortunate teams are still playing ball while 14 others are preparing for the 2016 Draft and beginning to set up the structure of their team for next season (“There’s always next season”).

The concept of drafting players is an interesting one. So much goes into it – athleticism, physical stature, game smarts, college performance, and the player’s mentality (i.e., will they be able to handle the pressure, will they fit in with the guys and have good team chemistry, etc.). Recently, Motomura and colleagues (2016) discussed the role the draft can play in building an NBA franchise. More importantly, they set out to understand whether having more or higher draft picks actually made an NBA team better. They concluded,

“We find that the draft is not necessarily the best road to success. An excellent organization and General Manager better enable teams to succeed even without high draft picks.”

This got me thinking – could we potentially try and understand which teams are “excellent” organizations in terms of selecting players that enjoy success in the NBA? Additionally, I am really interested in the Philadelphia 76ers. Year after year they always seem to be in the conversation of tanking at the end of the season in order to increase their chances of obtaining higher draft picks in the NBA Draft Lottery. In fact, they have been so good at this over the past few seasons that the 2016 season is supposed to be the final season of the tanking era in Philadelphia. Unfortunately, their efforts to tank and stockpile great players have not paid off. They seem to have a hard time either:

  1. Selecting good players. If you are going to tank you better not miss on your draft picks!
  2. Developing players or bringing in veteran players who can surround the young stars so that they don’t have to play a high number of minutes their rookie season and carry the team (something also addressed in the Motomura paper above).

The Data

2011 – 2015 NBA Draft data was obtained from basketball-reference.com.

Aims

  • With 60 picks in the NBA Draft (300 total over the 5 year period) how many players, on average, do teams pick up?
  • What is the average value of players selected in each of the draft number spots?
  • Which teams have been most successful at picking players that added a high amount of value to their team?
  • What is going on in Philly?

Number of Draft Picks

Across the 2011 – 2015 NBA Drafts, 300 total players were chosen, with teams averaging 9 players drafted during that period. The 76ers certainly are leading the way, selecting 21 players over this 5 year stretch. (NOTE: You will notice there are 34 teams in the table below. This is because I left in expansion teams and teams that moved from one city to another during this 5 year period. I did this to represent exactly what took place in the draft between 2011 – 2015.)

[Table: number of players drafted per team, 2011 – 2015]
What is the value of a draft pick?

Value or success metrics are often one of the more difficult things to pin down when studying team sport athletes. Lots of things players do can add value to a team without ever making it into the box score (which primarily consists of count metrics). The writers at basketball-reference.com display two metrics which I used to quantify a player’s value – Win Shares and Value Over Replacement Player. Both of these metrics are the type of metrics that were born out of baseball’s Sabermetrics as a way of trying to provide more context to the box score metrics presented to fans every day on websites or in newspapers. Win Shares is a metric that takes the team’s success and divides up credit for that success among the participating players. Value Over Replacement Player is a metric which projects the player’s value versus a fictitious replacement player. Both of these metrics have limitations, and people argue frequently over which is more useful or whether we should use a different metric to represent value (e.g., Player Efficiency Rating, or something like +/- or Adjusted +/-, both of which have their own limitations). I simply chose these metrics because they were readily available and they would provide me with a quick way to represent player value. Any metric one deems important would suffice, though.

To reflect value per pick I summarized the data in a few ways:

  • I binned the picks into groups of ten (Picks 1-10, 11-20, 21-30, 31-40, 41-50, and 51-60). Because I was dealing with a five year period it meant that there would only be 5 picks for each selection (1-60), which wouldn’t provide enough data. Thus, binning it this way helped me group more players together.
  • Since I am using 5 years of data it isn’t really fair to look at something like Win Shares for all of the players, since players who were drafted in 2011 have had a much longer time to contribute to their win shares compared to a player drafted in 2015 (a rookie). Thus, I reflected Win Shares over Games Played, to attempt to look at each player’s contribution to their team’s success relative to the number of games they participated in.
  • Finally, I added in Minutes Per Game, simply because I wanted to see what the participation differences were between the bins of draft picks. (A rough sketch of this binning and summarization is shown after this list.)
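As a rough sketch of how this summarization could be done in R (continuing the dplyr style used earlier), assuming a data frame named draft with hypothetical columns Pick, WS (win shares), G (games played), VORP (value over replacement player), and MPG (minutes per game):


### Bin picks into groups of ten and summarize value per bin
library(dplyr)

draft.summary <- draft %>%
	mutate(Pick.Bin = cut(Pick, breaks = seq(0, 60, by = 10),
			labels = c("1-10", "11-20", "21-30", "31-40", "41-50", "51-60")),
		WS.per.Game = WS / G) %>%
	group_by(Pick.Bin) %>%
	summarize(Avg.WS.per.Game = mean(WS.per.Game, na.rm = TRUE),
		Avg.VORP = mean(VORP, na.rm = TRUE),
		Avg.MPG = mean(MPG, na.rm = TRUE))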

The data in the below table is the average of each metric for the six different draft pick bins.

[Table: average Win Shares per Game, Value Over Replacement Player, and Minutes Per Game by draft pick bin]

As we would expect (or should expect) there is a monotonic decrease in each of the three metrics as we move from Pick 1-10 to Pick 51-60. This is to be expected and tells us that the quality of player begins to decrease as we move down the draft board (better players are being selected higher up). The only place this doesn’t seem to happen is in Pick 41-50 for the Average Value Over Replacement Player. I’m not really certain why this is. It could be that during this five year stretch there were a lot of players selected from those picks that had minimal to no contribution to their team.

Draft Pick Value Per Team

First, we look at the sum of Win Shares Per Game for each draft pick bin. I added up the win shares per game for each player the team selected in each of the draft pick bins and then summed those up to obtain a 5 year “Value Add”. I then standardized the scores in order to see how each team did relative to the average Value Add during this 5 year stretch.
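The standardization step could look something like this, assuming a data frame named team.value with one row per team and a hypothetical Value.Add column holding each team’s 5 year sum:


### Standardize each team's Value Add relative to the league
team.value$Value.Add.Z <- as.numeric(scale(team.value$Value.Add))  # (x - mean) / sd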

[Table: standardized 5 year Value Add by team and draft pick bin]
NOTE:
There is a limitation with this analysis in that I didn’t have a way of going through each player to see if they played for their draft team over the entire 5 year period. It is entirely possible that some players moved on or maybe got drafted and immediately traded and never had a chance to play with their draft team (as we will see when we discuss Philadelphia). That being said, what quickly jumps out is that 6 teams appear to be very good at identifying those who will be valuable NBA players, whether they still play on their draft team or not – Houston, Cleveland, Detroit, Denver, Minnesota, and Utah. It is important to keep in mind, however, that some of these scores might be coming from one or two players during this five year period. For example, guys like Karl-Anthony Towns (Minnesota) and Kyrie Irving (Cleveland) make significant contributions to their teams in terms of Value Add. Both players were also #1 draft picks.

Another interesting observation is the value Houston, Cleveland, and Detroit were able to find in Picks 31-40. Those three teams stand alone in that draft pick bin as all of the other teams seem to lack the ability to find valuable players. Houston looks to be pretty incredible at identifying talented players as they are green in several of the draft bins and have had the most success in drafting (using Wins Shares as the metric of success) compared to other teams over this period. Houston also happens to be a team that is praised for their analytic savviness and perhaps this helps contribute to their ability to scout talent.

In looking at this chart, Philadelphia doesn’t appear to be doing too badly (7th ranked team). However, it is important to keep in mind the limitation of this chart in that some players might be adding value for teams other than the team which drafted them. I do give Philly credit for identifying some potentially successful players, but trading them away doesn’t help. This will be discussed later in the article.

Next, we turn our attention to the Value Over Replacement Player metric. For this analysis I took the average Value Over Replacement Player for each of the draft pick bins for each team. I then took the average across every draft pick bin for each team and created a 5 Year Average Value Over Replacement Player. This metric was then standardized for all teams to investigate how they did relative to the rest of the league.

[Table: standardized 5 Year Average Value Over Replacement Player by team]

Now we get a little bit of a different look at the league and how successful teams draft players. As in the above analysis, there is a similar limitation in that players may have moved on from the team that drafted them; however, the main goal is to understand who is good at identifying talent.

We still see Houston in the top 6. Not only are they selecting players that are adding win value, but these players are also contributing more than a replacement player would. Golden State, who was in the top 10 on the previous chart, looks to steal the show here with players well above replacement level. Philadelphia takes a bit of a hit in this chart.

So What is Going on in Philly?

This is a tough one to sort out. As I alluded to above, sometimes teams draft players and then move those players on to other teams. Philly has been accused of tanking in order to get better draft picks, and if you are going to go out of your way to get better draft picks, then you need to ensure those picks actually turn into great players. Otherwise, you just end up in the same position next year. Philly drafted 21 players over the past 5 years – well above the norm for an NBA team during this time.

  • Of the 21 players drafted only 7 of those players actually ended up playing for the team in some capacity.
  • Of those 7 players, only 4 of them remain with the team.
  • Of those 4 players, one is Joel Embiid, who has not played a game in his first 2 seasons with the team due to injury. Embiid was the 3rd overall pick in the 2014 draft and has proven, thus far, to be a very costly selection for the franchise.

Here is an overview of the 21 players Philly has selected in the past 5 years:

[Table: the 21 players Philadelphia selected, 2011 – 2015]

Players in red are players that are no longer in the NBA or never even made it into an NBA game. That is 10 out of Philadelphia’s 21 picks (48%) who either don’t play in the NBA anymore or never made it in the first place. Stockpiling picks in the hope that a few of them turn into something valuable might not be a horrible idea, but when almost 50% of the players have washed out of the league it may be hard to justify this strategy. Moreover, 33% of the players drafted no longer play on the team. This includes the former Rookie of the Year, Michael Carter-Williams, and Maurice Harkless (8.5 win shares and a value over replacement player of 1.9), who was traded for Andrew Bynum (who turned out to be an NBA bust). With only 19% (4 out of 21) of the drafted players still on the team (counting Embiid, who has made no contribution at all due to injury), it appears to have been a pretty unsuccessful 5 years of drafting. The team was 10-72 this season and didn’t show much improvement over years past. Perhaps the tanking era isn’t over yet in Philly?

Conclusion

Drafting players is really difficult. There are a lot of things that go into it, and some may say it is a lot of luck. That being said, there are some teams that seem to come out on top, or near the top, year after year. You can have those big luck years where you snag a lot of great talent and hit a home run, but I think, more importantly, you just need to be consistent. The big luck years are good, but the years where you are consistently bad end up setting you back. As discussed in the Motomura paper, having a well-run organization that understands how to not only develop talent but also bring in veteran players to surround the younger players and take some of the pressure off might be the most important thing. Too often, I think, teams try and tank with the idea that their first round pick is going to save the franchise next season. Instead, they should consider the things they need to do to help that first round pick develop into the player they need him to be, down the road, in order to save the franchise.

References

Motomura A, Roberts KV, Leeds DM, Leeds MA. (2016). Does it Pay to Build Through the Draft in the National Basketball Association? J Sports Economics: 1-16.


Daily Undulating Periodization & Performance Improvements in Powerlifters

Dr. Mike Zourdos and colleagues just published a new paper on Daily Undulating Periodization (Zourdos MC, et al. Modified Daily Undulating Periodization Model Produces Greater Performance Than a Traditional Configuration in Powerlifters. J Strength Cond Res 2015. Published Ahead of Print). Being a fan of the Daily Undulating Periodization approach to training structure, I thought I would summarize the paper and share some of my thoughts.

Subjects

  • 18 Male, college-aged powerlifters
  • Subjects were assigned to one of two groups: Hypertrophy, Strength, & Power (HSP) or Hypertrophy, Power, & Strength (HPS)
  • The groups were balanced to ensure that relative and absolute strength were similar

Training Programs

  • Hypertrophy, Strength, & Power: This group performed three sessions per week, on non-consecutive days. Day 1 had a primary emphasis of hypertrophy, day 2 had an emphasis of strength, and day 3 had an emphasis of power.
  • Hypertrophy, Power, & Strength: This group performed three sessions per week, on non-consecutive days. Day 1 had a primary emphasis of hypertrophy, day 2 had an emphasis of power, and day 3 had an emphasis of strength.
  • The rationale for testing the outcome between these two weekly training schemes is that in the former, which is a common weekly set up for Daily Undulating Periodization in research, the strength session takes place ~48 hours following the hypertrophy session, which is the higher volume training session of the three. This may create an issue with the subject’s ability to perform their strength session due to the lack of recovery from the high volume hypertrophy session.
  • The variables for each of the training days are described in the chart below:

[Table: training variables for each training day]

Summary of Strength Results

The strength change results from both of the 6-week training programs are summarized as follows:

[Table: strength changes for the HSP and HPS groups]

  • No statistical differences in the squat and deadlift were found between groups; however, a statistical improvement was seen in the bench press for the HPS group compared to the HSP group.
  • No statistical difference was found between groups for powerlifting total.
  • Effect sizes greater than 0.5 were noted for the squat, bench press, and powerlifting total in favor of HPS, which may suggest a practically significant improvement in HPS versus HSP when developing training programs for powerlifters.

Comments & Thoughts

This was an interesting study, and I like the approach of trying to find an optimal scheme within the training week. Perhaps someday we may find that the optimal scheme for the Daily Undulating Periodization model (or any training model!) is one where the emphasis of training on a given day is dictated by how the athlete reports and what they are able to tolerate. This very fluid approach to programming – where we are attempting to strike a balance between training variety, to prevent monotony, and a concentrated dose of training, to increase fitness in a certain capacity – has been suggested by John Kiely’s work on periodization. In the paper by Zourdos and colleagues, they used an autoregulation approach on the hypertrophy day to dictate the training load/intensity for that session (an approach discussed by Mel Siff in Supertraining and researched by Bryan Mann). Perhaps, in a practical setting, we could extend this a bit further and utilize a linear position transducer or some other form of velocity based approach (the folks at PUSH have come up with an affordable and easy to use solution) to dictate the load/intensity on the power and strength training days. If the athlete is sluggish and moving the bar slowly, then lower the load to stay within a desired range of bar velocity. Additionally, because training takes place on non-consecutive days in this type of framework (e.g., 3 sessions over 7 days), it may be possible to utilize monitoring strategies (bar velocity, daily wellness, RPE training loads, HRV, etc.) to suggest that the athlete take a rest day instead of performing the scheduled training session, and then see whether their body is prepared to tolerate the load the following day.
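To make the velocity based idea concrete, here is a toy sketch in R. The velocity thresholds and adjustment percentages are hypothetical illustrations, not values from the paper or from PUSH:


### Toy velocity-based load adjustment rule (hypothetical thresholds)
adjust.load <- function(load, bar.velocity, v.min = 0.50, v.max = 0.75){
	if(bar.velocity < v.min){
		load * 0.95        # bar moving slowly: reduce load ~5%
	} else if(bar.velocity > v.max){
		load * 1.05        # bar moving quickly: increase load ~5%
	} else {
		load               # within the target velocity range: keep load
	}
}

adjust.load(load = 150, bar.velocity = 0.45)   # returns 142.5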

The use of effect sizes in this paper allows us to get a better understanding of whether or not the average difference between groups is of practical significance. One of the things that I find critical when looking at research on training interventions is the understanding of inter-individual differences. It is very possible that some athletes in this study responded favorably to either of the training approaches while others had no result or a poor result. The paper also looks at things like changes in total volume and some hormonal measures. When it comes to understanding responders and non-responders in training, it isn’t good enough to just say, “Some people get better and others don’t.” At some point, we need to figure out who doesn’t respond and why. Perhaps there is something additional to look at in this paper regarding the hormonal changes and each individual’s ability to increase training volume, or their stagnation during certain periods of the training program.

Hopefully this group continues to do more research on the topic of Daily Undulating Periodization, because I find it to be a practical method of programming training, and they have done some good work thus far that they can certainly follow up on. While Mike Zourdos tends to aim his approach at powerlifters (I believe because he is a competitive lifter himself), there are concepts within this framework that can easily be extended to training team sport athletes, as well as concepts that could be used by sport coaches when establishing the weekly practice structure.