{"id":3376,"date":"2024-03-16T00:32:09","date_gmt":"2024-03-16T00:32:09","guid":{"rendered":"http:\/\/optimumsportsperformance.com\/blog\/?p=3376"},"modified":"2024-03-16T00:32:09","modified_gmt":"2024-03-16T00:32:09","slug":"tidyx-episode-175-predicting-hall-of-fame-pitchers-using-random-forests","status":"publish","type":"post","link":"https:\/\/optimumsportsperformance.com\/blog\/tidyx-episode-175-predicting-hall-of-fame-pitchers-using-random-forests\/","title":{"rendered":"TidyX Episode 175: Predicting Hall of Fame Pitchers using Random Forests"},"content":{"rendered":"<p><span style=\"color: #0000ff;\"><strong><a style=\"color: #0000ff;\" href=\"https:\/\/twitter.com\/ellis_hughes\">Ellis Hughes<\/a><\/strong><\/span> and I continue to work with the MLB pitcher data, courtesy of the {<strong>Lahman<\/strong>} baseball package.<\/p>\n<p>This week we walk through using a random forest model to calculate the probability a pitcher will make it to the Hall of Fame given several different performance stats.<\/p>\n<p>In this episode we cover:<\/p>\n<ul>\n<li>Splitting data into training and testing sets<\/li>\n<li>Splitting the training set into cross-validated folds<\/li>\n<li>Using {<strong>tidyverse<\/strong>} and {<strong>purrr<\/strong>} to construct a tuning grid and tune the random forest models to identify the optimal <em>mtry<\/em> and <em>ntrees<\/em> for the prediction task<\/li>\n<li>Fitting a final model with the optimized parameters and exploring predictions<\/li>\n<\/ul>\n<p>To watch our screen cast, <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/www.youtube.com\/watch?v=qZHPdMOHYT0\">CLICK HERE<\/a><\/span><\/strong>.<\/p>\n<p>To access our code, <span style=\"color: #0000ff;\"><strong><a style=\"color: #0000ff;\" href=\"https:\/\/github.com\/thebioengineer\/TidyX\/blob\/master\/TidyTuesday_Explained\/175-Pitchers_HOF_in_20_Random_Forest\/2024_Pitcher_HOF_RandomForest.R\">CLICK 
HERE<\/a><\/strong><\/span>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Ellis Hughes and I continue to work with the MLB pitcher data, courtesy of the {Lahman} baseball package. This week we walk through using a random forest model to calculate the probability a pitcher will make it to the Hall of Fame given several different performance stats. In this episode we cover: Splitting data into [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[44],"tags":[],"class_list":["post-3376","post","type-post","status-publish","format-standard","hentry","category-tidyx-screen-cast"],"_links":{"self":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/3376","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/comments?post=3376"}],"version-history":[{"count":1,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/3376\/revisions"}],"predecessor-version":[{"id":3377,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/3376\/revisions\/3377"}],"wp:attachment":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/media?parent=3376"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/categories?post=3376"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/tags?post=3376"}],"curies":[{"name":"wp","href":"https:\/\/api.w.
org\/{rel}","templated":true}]}}