TidyX Episode 16: Web Scraping & NBA Shot Charts

This week, Ellis Hughes and I start by breaking down the code that Jihong Zhang wrote to visualize caribou movements in Canada, using data provided by the TidyTuesday Project. The data is spatial tracking data, which Jihong plotted on top of a Google map. Since spatial data is currently very popular in sport, we decided to create our own NBA shot charts using three different approaches (scatter plots, hexbins, and heat maps). To obtain the shot data, we walk through our web scraping code.

This screencast covers a number of key topics in data science:

1. Obtaining data via web scraping.
2. Dealing with regular expressions.
3. Visualizing data.
4. Some things to consider when joining tables (NOTE: I wrote a blog article a few months ago that details the various JOIN functions in {tidyverse}, so it may be worthwhile to check that out).
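
To make topics 2 and 4 a little more concrete, here is a minimal sketch. The episode's own code is written in R; this version uses plain Python only so that it runs without any packages. The shot descriptions, player names, team codes, and the helpers `parse_shot` and `left_join` are all made up for illustration and are not taken from the episode. It shows a regular expression pulling fields out of scraped text, and a left join that returns more rows than the left table because the lookup table has a duplicated key.

```python
import re

# Hypothetical scraped shot descriptions (invented for illustration).
raw_shots = [
    "J. Doe makes 3-pt jump shot from 25 ft",
    "A. Smith misses 2-pt layup from 2 ft",
    "J. Doe misses 2-pt jump shot from 14 ft",
]

# Named groups keep the extraction readable.
SHOT_PATTERN = re.compile(
    r"^(?P<player>.+?) (?P<result>makes|misses) "
    r"(?P<points>\d)-pt .* from (?P<distance>\d+) ft$"
)


def parse_shot(text):
    """Return a dict of shot fields, or None if the text does not match."""
    m = SHOT_PATTERN.match(text)
    if m is None:
        return None
    return {
        "player": m.group("player"),
        "made": m.group("result") == "makes",
        "points": int(m.group("points")),
        "distance_ft": int(m.group("distance")),
    }


shots = [parse_shot(s) for s in raw_shots]

# Hypothetical lookup table. "J. Doe" appears twice (say, a mid-season trade)
# and "A. Smith" is missing entirely.
teams = [
    {"player": "J. Doe", "team": "AAA"},
    {"player": "J. Doe", "team": "BBB"},
]


def left_join(left, right, key):
    """Keep every left row; emit one output row per matching right row."""
    right_cols = {col for row in right for col in row} - {key}
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        matches = index.get(row[key])
        if not matches:
            # No match: keep the left row and fill the right columns with None.
            joined.append({**row, **dict.fromkeys(right_cols)})
        else:
            for match in matches:
                joined.append({**row, **match})
    return joined


joined = left_join(shots, teams, "player")
print(len(shots), "shots before the join,", len(joined), "rows after")
```

Comparing row counts before and after a join is a cheap way to catch duplicated keys, and counting the rows with missing values in the joined columns catches keys that failed to match.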

TidyX Episode 15: Juneteenth & Census Tables

In TidyX Episode 15, Ellis Hughes shows us how to quickly build a report of conditionally formatted data tables using the {colortable} and {knitr} packages. He also shares a quick tip on how to create a table of contents within your {Rmarkdown} reports to give your readers an easy way to navigate the data. This week's data comes from the TidyTuesday Project and uses census data to show America’s history with slavery in different regions across the country.

TidyX Episode 14: Strip Plots & Dotplots

This week, Ellis Hughes and I go over strip plots and dotplots using the code provided by Catriona Cunningham. Catriona used these two data visualization approaches to create some really nice-looking plots detailing African American achievements from a data set provided by the TidyTuesday Project.

TidyX Episode 13: Marble Races & Bump Plots

This week on TidyX, Ellis Hughes and I discuss bump plots built by Cedric Scherer using data on Jelle’s Marble Race results, provided by TidyTuesday. If you’ve never seen Cedric’s work, he is a great follow on Twitter, as he makes some very interesting data visualizations.

We follow up Cedric’s code by building a bump plot of our own using NFL Combine 40yd sprint times. We go over different ways of scraping and organizing the data, using both base R and tidyverse approaches.

TidyX Episode 12: Data Cleaning & Thomas Mock

This week, Ellis Hughes and I welcome our first guest to the episode, TidyTuesday creator, Thomas Mock!

Together, we go over a code submission by Joshua de la Bruere, whose code provides a good lesson in data cleaning for a rather messy data set. We then have a discussion with Thomas regarding all things R, TidyTuesday, and a bit about his PhD in Neuroscience!

