This week, Ellis Hughes and I are joined by someone who has had a big influence on both of us, David Robinson!
For those who don’t know, David is the developer of several R packages, including {broom}, {tidytext}, {fuzzyjoin}, and {widyr}. Each week he also does a live screen cast in which he works through the TidyTuesday data set from scratch. If you have never watched one, they are excellent: in 60 minutes, David covers a lot of ground and shows how he quickly extracts as much information as possible from a data set he is seeing for the first time.
Today, we walk through one of David’s TidyTuesday screen casts, in which he does some text analysis of a data set of cocktail ingredients. The screen cast features his newest R package, {widyr}. After some exploratory data analysis and data cleaning, David calculates correlations between categorical variables (the phi coefficient) and shows us how to plot the results as a network graph. The screen cast wraps up with David showing us how {widyr} can be used for Principal Components Analysis, followed by a short discussion of David’s journey into data science and of how blogging and public work can be incredibly valuable for developing your professional network and career.
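If you are curious what the phi coefficient actually measures, here is a minimal sketch. It is written in plain Python with made-up cocktail data, not the R and {widyr} code David uses in the screen cast, and the `drinks` dictionary and `phi` helper are hypothetical names introduced only for illustration. For two ingredients, it counts the drinks that contain both, only one, or neither, and combines those four counts into a correlation between -1 and 1.

```python
from math import sqrt

# Hypothetical toy data (not from the TidyTuesday data set):
# each drink maps to the set of ingredients it contains.
drinks = {
    "margarita": {"tequila", "lime juice", "triple sec"},
    "daiquiri": {"rum", "lime juice", "sugar"},
    "mojito": {"rum", "lime juice", "sugar", "mint"},
    "old fashioned": {"whiskey", "sugar", "bitters"},
    "manhattan": {"whiskey", "vermouth", "bitters"},
    "paloma": {"tequila", "lime juice", "grapefruit soda"},
}


def phi(a, b, drinks):
    """Phi coefficient between the presence of ingredients a and b."""
    recipes = list(drinks.values())
    n11 = sum(1 for r in recipes if a in r and b in r)          # both present
    n10 = sum(1 for r in recipes if a in r and b not in r)      # only a
    n01 = sum(1 for r in recipes if a not in r and b in r)      # only b
    n00 = sum(1 for r in recipes if a not in r and b not in r)  # neither

    # Product of the four marginal totals of the 2x2 table.
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # Undefined when an ingredient is in every drink or in none.
        return float("nan")
    return (n11 * n00 - n10 * n01) / denom


print(phi("whiskey", "bitters", drinks))  # always appear together: 1.0
print(phi("rum", "tequila", drinks))      # never appear together: -0.5
```

A value near 1 means two ingredients tend to show up in the same drinks, and a negative value means they tend to avoid each other. Pairs with high values are the kind of thing you would connect with an edge in a network graph.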
To check out the screen cast, CLICK HERE.
To check out David’s blog, CLICK HERE.