Category Archives: TidyX Screen Cast

TidyX 56: XGBoost for pitchf/x classification

Building on the previous weeks, Ellis Hughes and I work on pitchf/x classification using the popular XGBoost algorithm.

We discuss:

The basics of XGBoost
Training an XGBoost model
Evaluating the variables of importance within the model
Tuning model parameters of an XGBoost model using the {caret} package
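
For a feel of the training and variable-importance steps before watching, here is a minimal sketch in R. It is not the code from the screen cast: it assumes the {xgboost} package is installed, uses the built-in iris data as a stand-in for the pitchf/x data, and the parameter values are arbitrary, untuned choices.

```r
library(xgboost)

# Stand-in data (assumption): iris replaces the pitchf/x data from the episode
x <- as.matrix(iris[, 1:4])
y <- as.integer(iris$Species) - 1   # xgboost expects class labels starting at 0

# Train a multi-class model; parameter values are illustrative only
dtrain <- xgb.DMatrix(data = x, label = y)
fit <- xgb.train(
  params = list(objective = "multi:softmax", num_class = 3,
                max_depth = 3, eta = 0.3),
  data = dtrain,
  nrounds = 25
)

# Variables of importance within the model
importance <- xgb.importance(model = fit)
print(importance)

# In-sample predictions come back as 0-based class indices
pred <- predict(fit, dtrain)
mean(pred == y)
```

The accuracy here is computed on the training data, so it is optimistic. Tuning with {caret} would replace the fixed parameter list with a cross-validated search over a grid of values.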

To watch our screen cast, CLICK HERE.

To access our code, CLICK HERE.

For previous episodes in this classification series:

TidyX 55: Decision Trees, Random Forest, & Optimization

Ellis Hughes and I continue our series on classification of MLB pitch types by working with Decision Trees and Random Forests.

We discuss:

Building a decision tree
Building a random forest
The advantage of random forests over decision trees
Tuning the random forest with the {caret} package and parallel processing
Evaluating the model’s classification accuracy overall and within each pitch type
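
The steps listed above could look roughly like the following sketch in R. It is not the code from the screen cast: it assumes the {rpart}, {randomForest}, {caret}, and {doParallel} packages are installed, uses the built-in iris data as a stand-in for the pitchf/x data, and the worker count and tuning grid are arbitrary choices.

```r
library(rpart)
library(randomForest)
library(caret)
library(doParallel)

set.seed(55)

# Stand-in data (assumption): iris replaces the pitchf/x data from the episode
idx <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
train_df <- iris[idx, ]
test_df  <- iris[-idx, ]

# A single decision tree
tree_fit  <- rpart(Species ~ ., data = train_df, method = "class")
tree_pred <- predict(tree_fit, newdata = test_df, type = "class")

# A random forest: many trees grown on bootstrap samples, each split
# considering a random subset of predictors, combined by majority vote
rf_fit <- randomForest(Species ~ ., data = train_df, ntree = 500)

# Tune mtry with {caret}, running the cross-validation folds in parallel
cl <- makePSOCKcluster(2)   # 2 workers is an arbitrary choice
registerDoParallel(cl)
rf_tuned <- train(
  Species ~ ., data = train_df,
  method = "rf",
  trControl = trainControl(method = "cv", number = 5, allowParallel = TRUE),
  tuneGrid = expand.grid(mtry = 1:4)
)
stopCluster(cl)
registerDoSEQ()             # return to sequential processing

# Accuracy overall and within each class on the held-out data
rf_pred <- predict(rf_tuned, newdata = test_df)
cm <- confusionMatrix(rf_pred, test_df$Species)
cm$overall["Accuracy"]
cm$byClass[, "Sensitivity"]
mean(tree_pred == test_df$Species)   # single-tree accuracy for comparison
```

On a data set this small the single tree and the forest may score about the same. The forest's advantage tends to show on noisier problems, where averaging many trees reduces variance.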

To watch our screen cast, CLICK HERE.

To access our code, CLICK HERE.

Previous screen casts in this series:

TidyX 53: MLB Pitch Classification Series — EDA & Hierarchical Clustering

This week, Ellis Hughes and I begin a multi-week series on MLB pitch classification. Each week we will go over a different classification model using pitchf/x data, which we accessed via the {mlbgameday} R package.

In this first video of the series, we cover the following steps:

Loading the data
Data cleaning
Exploratory data analysis
Hierarchical cluster analysis
Interactive plotting of the dendrogram in {plotly}
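
The clustering step above can be sketched with base R alone. This is not the code from the screen cast: it uses the built-in iris data as a stand-in for the pitchf/x data, the Ward linkage and the choice of three clusters are assumptions made for illustration, and the dendrogram is a static plot rather than the interactive {plotly} version.

```r
# Stand-in data (assumption): iris replaces the pitchf/x data from the episode
x <- scale(iris[, 1:4])   # standardize so no single variable dominates the distances

# Hierarchical clustering on Euclidean distances with Ward linkage
hc <- hclust(dist(x), method = "ward.D2")

# Static dendrogram
plot(hc, labels = FALSE, main = "Cluster dendrogram")

# Cut the tree into 3 groups and compare them with the known labels
clusters <- cutree(hc, k = 3)
table(cluster = clusters, species = iris$Species)
```

The cross-tabulation shows how well the unsupervised groups line up with the true classes, which is the same check one would make between clusters and labeled pitch types.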

To watch our screen cast, CLICK HERE.

To access our code, CLICK HERE.

TidyX 52: Creating Presentations in R with {Xaringan}

Tired of making all your nice tables and plots in R and then having to copy and paste them into PowerPoint for your next presentation?

If you answered “yes” to that question then you are going to love TidyX 52. This week, Ellis Hughes and I discuss how to make reproducible presentations in R using the {Xaringan} (pronounced cha-reen-gan) package.

Using R Markdown, you can build an entire presentation deck without having to copy and paste your work into PowerPoint. Moreover, {Xaringan} makes reproducing your work and updating your slides as simple as changing a few lines of code and hitting Knit.

To watch our screen cast, CLICK HERE.

To access our code, CLICK HERE.

TidyX 51: Building Statistical Models into Shiny Apps

We’ve done a fair bit of {shiny} over the past year, detailing different approaches to provide an interactive data environment for end users. However, one thing we haven’t done yet is talk about embedding statistical models into your {shiny} apps. This week, Ellis Hughes and I do just that.

We create a random forest classifier, which uses body size features to predict the probability that a penguin is one of three different species, and embed it in a {shiny} web app. For this, we use the penguins data set from the {palmerpenguins} R package.

In addition to making an interactive statistical app, we cover a few other nuances of {shiny} app development:

Organizing the user components horizontally across the top of the app versus the commonly used sidebar panel.
How to increase the size and width of the plot in the main panel.
How to center the plot within the main panel.
How to add a title to the top of your {shiny} app.
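
One way the pieces above might fit together is sketched below. It is not the app from the screen cast: it assumes the {shiny}, {randomForest}, and {palmerpenguins} packages are installed, and the input ids, default values, and layout widths are hypothetical choices made for illustration.

```r
library(shiny)
library(randomForest)

# Fit the classifier once, outside the server function
dat <- na.omit(palmerpenguins::penguins[, c("species", "bill_length_mm",
                                            "bill_depth_mm",
                                            "flipper_length_mm",
                                            "body_mass_g")])
rf_fit <- randomForest(species ~ ., data = dat)

ui <- fluidPage(
  # Title at the top of the app
  titlePanel("Penguin species classifier"),
  # Inputs laid out horizontally across the top instead of in a sidebar
  fluidRow(
    column(3, numericInput("bill_length", "Bill length (mm)", 45)),
    column(3, numericInput("bill_depth", "Bill depth (mm)", 17)),
    column(3, numericInput("flipper", "Flipper length (mm)", 200)),
    column(3, numericInput("mass", "Body mass (g)", 4200))
  ),
  # A tall plot, centered by offsetting an 8-wide column by 2
  fluidRow(
    column(8, offset = 2, plotOutput("prob_plot", height = "500px"))
  )
)

server <- function(input, output) {
  output$prob_plot <- renderPlot({
    new_obs <- data.frame(
      bill_length_mm    = input$bill_length,
      bill_depth_mm     = input$bill_depth,
      flipper_length_mm = input$flipper,
      body_mass_g       = input$mass
    )
    # Predicted probability of each species for the entered measurements
    probs <- predict(rf_fit, newdata = new_obs, type = "prob")
    barplot(probs[1, ], ylim = c(0, 1), ylab = "Predicted probability")
  })
}

app <- shinyApp(ui, server)
# Launch with: runApp(app)
```

Fitting the model outside `server` means it is trained once when the app starts rather than on every input change.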

To watch our screen cast, CLICK HERE.

To access our code, CLICK HERE.

Patrick Ward, PhD

Patrick Ward, PhD | Sports Science & Analytics
