To work on improving my Python skills, I decided to re-create different elements from some of our TidyX Screen Casts.
This past week, we did an episode on building a random forest classifier for coffee ratings (CLICK HERE). I’ve re-created almost all of the steps that we did in R using Python:
1) Loading the data from the TidyTuesday GitHub page
2) Data pre-processing
3) Exploratory data analysis
4) Random Forest classifier development and model testing
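For anyone who wants the general shape of those four steps before opening the notebook, here is a minimal sketch using pandas and scikit-learn. It is not the notebook's code. The CSV URL is my assumption about where the TidyTuesday coffee ratings file lives, and the column names (`aroma`, `flavor`, `rating_class`) and the helper functions are hypothetical placeholders. The demo runs on a small synthetic data frame so that it works offline.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumed location of the TidyTuesday coffee ratings CSV (not stated in the post).
DATA_URL = (
    "https://raw.githubusercontent.com/rfordatascience/tidytuesday/"
    "master/data/2020/2020-07-07/coffee_ratings.csv"
)


def load_data(url=DATA_URL):
    """Step 1: read the CSV straight from GitHub into a DataFrame."""
    return pd.read_csv(url)


def preprocess(df, feature_cols, target_col):
    """Step 2: keep the chosen columns and drop rows with missing values."""
    return df[feature_cols + [target_col]].dropna().reset_index(drop=True)


def summarize(df, target_col):
    """Step 3: a very small EDA step, the mean of each feature per class."""
    return df.groupby(target_col).mean()


def fit_and_score(df, feature_cols, target_col, seed=42):
    """Step 4: fit a random forest and report accuracy on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[feature_cols],
        df[target_col],
        test_size=0.3,
        random_state=seed,
        stratify=df[target_col],
    )
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    return model, accuracy_score(y_test, model.predict(X_test))


def make_demo_frame(n=200, seed=0):
    """Synthetic stand-in for the coffee data (hypothetical columns)."""
    rng = np.random.default_rng(seed)
    aroma = rng.normal(7.5, 0.4, n)
    flavor = rng.normal(7.5, 0.4, n)
    label = np.where(aroma + flavor > 15.0, "high", "low")
    df = pd.DataFrame({"aroma": aroma, "flavor": flavor, "rating_class": label})
    df.loc[:4, "aroma"] = np.nan  # blank out the first five rows to mimic missing data
    return df


if __name__ == "__main__":
    features = ["aroma", "flavor"]
    raw = make_demo_frame()  # swap in load_data() to use the real file
    clean = preprocess(raw, features, "rating_class")
    print(summarize(clean, "rating_class"))
    model, accuracy = fit_and_score(clean, features, "rating_class")
    print(f"Held-out accuracy: {accuracy:.2f}")
```

To run this against the real data, replace `make_demo_frame()` with `load_data()` and pick feature and target columns that actually exist in the file.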
You can access the full Jupyter Notebook on my GITHUB page. I’m still trying to get the hang of Python, so if any Pythonistas out there have feedback or see errors in my code, I’m all ears!