{"id":3464,"date":"2024-09-30T15:46:35","date_gmt":"2024-09-30T15:46:35","guid":{"rendered":"http:\/\/optimumsportsperformance.com\/blog\/?p=3464"},"modified":"2024-09-30T15:46:35","modified_gmt":"2024-09-30T15:46:35","slug":"k-nearest-neighbor-tidymodels-tutorial","status":"publish","type":"post","link":"https:\/\/optimumsportsperformance.com\/blog\/k-nearest-neighbor-tidymodels-tutorial\/","title":{"rendered":"K-Nearest Neighbor: {tidymodels} tutorial"},"content":{"rendered":"<p>When working on data science or research teams, it often helps to have a workflow that makes it easy for teammates to review your work and add components to the data cleaning or model structure. Additionally, it ensures that the steps in the process are clear so that debugging is easy.<\/p>\n<p>In R, the {<strong>tidymodels<\/strong>} package offers one such workflow and in Python, folks seem to prefer <strong>scikit-learn<\/strong>. This week, I&#8217;m going to walk through a full workflow in tidymodels using the K-nearest neighbor algorithm. Some of the things I&#8217;ll cover include:<\/p>\n<ul>\n<li>Splitting the data into test\/train sets and cross validation folds<\/li>\n<li>Setting up the model specification<\/li>\n<li>Creating preprocessing steps (what tidymodels refers to as a recipe)<\/li>\n<li>Compiling the preprocessing and model structure together into a single workflow<\/li>\n<li>Tuning the KNN model<\/li>\n<li>Identifying the best tuned model<\/li>\n<li>Evaluating the model on the test set<\/li>\n<li>Saving the entire workflow and model to be used later on with new data<\/li>\n<li>Making predictions with the saved workflow and model on new data<\/li>\n<\/ul>\n<p>I&#8217;ve provided a number of tidymodels tutorials on this blog so feel free to search the blog for those. 
Additionally, all of my tidymodels tutorials and templates are available in a <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/github.com\/pw2\/tidymodels_template\">GitHub Repo<\/a><\/span><\/strong>. Alternatively, if Python is more your jam, I did do a previous blog comparing the workflow in tidymodels to Scikit Learn, which can be found <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/optimumsportsperformance.com\/blog\/comparing-tidymodels-in-r-to-scikit-learn-in-python\/\">HERE<\/a><\/span><\/strong>.<\/p>\n<p>This entire tutorial and data are available on my <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/github.com\/pw2\/tidymodels_template\/blob\/main\/KNN%20tidymodels%20tutorial.R\">GitHub page<\/a><\/span><\/strong> if you&#8217;d like to code along.<\/p>\n<p><strong>Load &amp; Clean Data<\/strong><\/p>\n<p>For this tutorial I&#8217;m going to use the 2021 FIFA Soccer Ratings, which are available on <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/www.kaggle.com\/datasets\/stefanoleone992\/fifa-21-complete-player-dataset\">Kaggle<\/a><\/span><\/strong>. There are several years of ratings there, but we will concentrate on 2021 with the goal being to estimate a player&#8217;s contract value based on the various FIFA ratings provided and the position they play.<\/p>\n<p>First, I load the tidyverse and tidymodels libraries and read in the data. 
I&#8217;m going to drop goalies so that we only focus on players who play the field and, since the data has a large number of columns, I&#8217;m going to get only the columns we care about: Information about the player (playerID, name, height, weight, club, league, position, and contract value) and then the various FIFA ratings.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.51.00\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3465\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.51.00\u202fAM-1024x584.png\" alt=\"\" width=\"625\" height=\"356\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.51.00\u202fAM-1024x584.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.51.00\u202fAM-300x171.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.51.00\u202fAM-768x438.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.51.00\u202fAM-624x356.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.51.00\u202fAM.png 1630w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p>The only column with missing data that impacts our analysis is the team position column, as this is a feature in the data set. Additionally, there are 207 players with a contract value of 0 (all 195 players missing a team position have a 0 contract value). 
So, perhaps these are players who weren&#8217;t on a club at the time the ratings were assigned.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.53.31\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3466\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.53.31\u202fAM-1024x561.png\" alt=\"\" width=\"625\" height=\"342\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.53.31\u202fAM-1024x561.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.53.31\u202fAM-300x164.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.53.31\u202fAM-768x421.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.53.31\u202fAM-624x342.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.53.31\u202fAM.png 1198w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p>I&#8217;m going to remove these players from our analysis data set, but I&#8217;m going to place them in their own data set because we will use them later to make estimates of their contract value.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.56.37\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3467\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.56.37\u202fAM-1024x163.png\" alt=\"\" width=\"625\" height=\"99\" 
srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.56.37\u202fAM-1024x163.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.56.37\u202fAM-300x48.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.56.37\u202fAM-768x122.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.56.37\u202fAM-624x100.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.56.37\u202fAM.png 1530w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p><strong>Visualizing the Data<\/strong><\/p>\n<p>Let&#8217;s take a look at a visual of the contract value for all of the players in our data set. Since this is severely right skewed, we will plot it on the log scale.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.58.06\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3468\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.58.06\u202fAM.png\" alt=\"\" width=\"459\" height=\"387\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.58.06\u202fAM.png 1006w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.58.06\u202fAM-300x253.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.58.06\u202fAM-768x647.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.58.06\u202fAM-624x526.png 624w\" sizes=\"auto, (max-width: 459px) 100vw, 459px\" \/><\/a><\/p>\n<p>We also notice there are a 
variety of positions (way more than I would have guessed in soccer!).<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.59.12\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3469\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.59.12\u202fAM-811x1024.png\" alt=\"\" width=\"326\" height=\"412\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.59.12\u202fAM-811x1024.png 811w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.59.12\u202fAM-237x300.png 237w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.59.12\u202fAM-768x970.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.59.12\u202fAM-624x788.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-7.59.12\u202fAM.png 1056w\" sizes=\"auto, (max-width: 326px) 100vw, 326px\" \/><\/a><\/p>\n<p>Finally, we want to explore how playing position might influence contract value.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.00.43\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3470\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.00.43\u202fAM-770x1024.png\" alt=\"\" width=\"363\" height=\"483\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.00.43\u202fAM-770x1024.png 770w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.00.43\u202fAM-226x300.png 226w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.00.43\u202fAM-768x1022.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.00.43\u202fAM-624x830.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.00.43\u202fAM.png 1072w\" sizes=\"auto, (max-width: 363px) 100vw, 363px\" \/><\/a><\/p>\n<p><strong>tidymodels set up<\/strong><\/p>\n<p>Now that we have our data organized, we want to set up a tidymodels workflow to estimate the contract value of every player using a K-nearest neighbor model.<\/p>\n<p>First, we split the data into train and test sets and then further split the training data into 5 cross validation folds so that we can tune the model to find the best number of K-neighbors.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.04.47\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3472\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.04.47\u202fAM.png\" alt=\"\" width=\"398\" height=\"312\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.04.47\u202fAM.png 544w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.04.47\u202fAM-300x235.png 300w\" sizes=\"auto, (max-width: 398px) 100vw, 398px\" \/><\/a><\/p>\n<p>Next, we specify that we want to use a KNN model and set the mode to regression, since the outcome variable is continuous (as opposed to classification).<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.06.17\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3473\" 
src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.06.17\u202fAM.png\" alt=\"\" width=\"471\" height=\"92\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.06.17\u202fAM.png 884w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.06.17\u202fAM-300x58.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.06.17\u202fAM-768x149.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.06.17\u202fAM-624x121.png 624w\" sizes=\"auto, (max-width: 471px) 100vw, 471px\" \/><\/a><\/p>\n<p>Now that the model is specified we need to set up our preprocessing steps. This is one thing that tidymodels is super useful for. Normally, we&#8217;d need to do all this preprocessing before fitting the model and, as in the case of KNN and most machine learning models, we&#8217;d need to standardize the variables in the training set and then store the means and standard deviations of those variables so that we can use them to standardize the test set or future data. In tidymodels, we set up our preprocessing workflow and store it and it contains all that information for us! We can simply load it and use it for any downstream analysis we want to do. 
There are three preprocessing steps we want to add to our workflow:<\/p>\n<ol>\n<li>Log transform the dependent variable (contract value)<\/li>\n<li>Dummy code the positions<\/li>\n<li>Standardize (z-score) all of the numeric variables in the model (the ratings)<\/li>\n<\/ol>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.07.02\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3474\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.07.02\u202fAM-1024x221.png\" alt=\"\" width=\"625\" height=\"135\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.07.02\u202fAM-1024x221.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.07.02\u202fAM-300x65.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.07.02\u202fAM-768x166.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.07.02\u202fAM-624x135.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.07.02\u202fAM.png 1462w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p>We can view the preprocessing steps using the <strong>prep()<\/strong> function.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.12.32\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3475\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.12.32\u202fAM-1024x504.png\" alt=\"\" width=\"625\" height=\"308\" 
srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.12.32\u202fAM-1024x504.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.12.32\u202fAM-300x148.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.12.32\u202fAM-768x378.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.12.32\u202fAM-624x307.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.12.32\u202fAM.png 1414w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>With the model specified and the preprocessing recipe set up we are ready to compile everything into a single workflow.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.14.27\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3476\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.14.27\u202fAM.png\" alt=\"\" width=\"383\" height=\"127\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.14.27\u202fAM.png 524w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.14.27\u202fAM-300x100.png 300w\" sizes=\"auto, (max-width: 383px) 100vw, 383px\" \/><\/a><\/p>\n<p>We can look at the workflow to make sure everything makes sense before we start fitting the model.<a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.15.15\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3477\" 
src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.15.15\u202fAM.png\" alt=\"\" width=\"465\" height=\"416\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.15.15\u202fAM.png 836w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.15.15\u202fAM-300x268.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.15.15\u202fAM-768x687.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.15.15\u202fAM-624x558.png 624w\" sizes=\"auto, (max-width: 465px) 100vw, 465px\" \/><\/a><\/p>\n<p>Next, we tune the model using the cross validated folds that we set up on our training data set. We are trying to find the optimal number of k-neighbors that minimizes the RMSE of the outcome variable (the log of contract value).<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.10\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3478\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.10\u202fAM-1024x771.png\" alt=\"\" width=\"625\" height=\"471\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.10\u202fAM-1024x771.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.10\u202fAM-300x226.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.10\u202fAM-768x578.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.10\u202fAM-624x470.png 624w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.10\u202fAM.png 1342w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p>We see the results are stored in a list for every one of our 5 cross validation splits. We can view the model metrics for each of the folds.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.54\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3479\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.54\u202fAM-1024x420.png\" alt=\"\" width=\"625\" height=\"256\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.54\u202fAM-1024x420.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.54\u202fAM-300x123.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.54\u202fAM-768x315.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.54\u202fAM-624x256.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.17.54\u202fAM.png 1156w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p>The model with 10 neighbors appears to have the lowest RMSE. 
So, we want to pull that value directly from this table so that we can store it and use it later when we go to fit the final model.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.19.25\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3480\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.19.25\u202fAM.png\" alt=\"\" width=\"502\" height=\"98\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.19.25\u202fAM.png 862w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.19.25\u202fAM-300x58.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.19.25\u202fAM-768x150.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.19.25\u202fAM-624x122.png 624w\" sizes=\"auto, (max-width: 502px) 100vw, 502px\" \/><\/a><\/p>\n<p><strong>Fitting the Final Model Version 1: Using finalize_workflow()<\/strong><\/p>\n<p>Version 1 that I&#8217;ll show for fitting the final model uses the <strong>finalize_workflow()<\/strong> function. I like this approach if I&#8217;m building my model in a local session and then want to see how well it performs on the test set (which is what this function does). 
This isn&#8217;t the approach I take when I need to save everything for downstream use (which we will cover in Version 2).<\/p>\n<p>First, fit the best model using the <strong>finalize_workflow()<\/strong> function.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.22.32\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3481\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.22.32\u202fAM-1024x171.png\" alt=\"\" width=\"625\" height=\"104\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.22.32\u202fAM-1024x171.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.22.32\u202fAM-300x50.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.22.32\u202fAM-768x128.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.22.32\u202fAM-624x104.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.22.32\u202fAM.png 1172w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.23.28\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3482\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.23.28\u202fAM.png\" alt=\"\" width=\"480\" height=\"412\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.23.28\u202fAM.png 882w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.23.28\u202fAM-300x257.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.23.28\u202fAM-768x658.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.23.28\u202fAM-624x535.png 624w\" sizes=\"auto, (max-width: 480px) 100vw, 480px\" \/><\/a><\/p>\n<p>Then, we get the model predictions for this final workflow on the test data set.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.24.17\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3483\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.24.17\u202fAM.png\" alt=\"\" width=\"469\" height=\"120\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.24.17\u202fAM.png 756w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.24.17\u202fAM-300x77.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.24.17\u202fAM-624x160.png 624w\" sizes=\"auto, (max-width: 469px) 100vw, 469px\" \/><\/a><\/p>\n<p>We can plot the predicted contract value compared to the actual contract value and calculate the test set RMSE.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.01\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3484\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.01\u202fAM-1024x855.png\" alt=\"\" width=\"462\" height=\"386\" 
srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.01\u202fAM-1024x855.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.01\u202fAM-300x250.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.01\u202fAM-768x641.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.01\u202fAM-624x521.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.01\u202fAM.png 1078w\" sizes=\"auto, (max-width: 462px) 100vw, 462px\" \/><\/a> <a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.11\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3485\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.11\u202fAM.png\" alt=\"\" width=\"495\" height=\"172\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.11\u202fAM.png 894w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.11\u202fAM-300x104.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.11\u202fAM-768x266.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.26.11\u202fAM-624x216.png 624w\" sizes=\"auto, (max-width: 495px) 100vw, 495px\" \/><\/a><\/p>\n<p><strong>Fitting the Final Model Version 2: Re-specify the model on the training set, save it, and use it later<br \/>\n<\/strong><\/p>\n<p>In this second version of fitting the final model, we will take the best neighbors from the tuning phase, re-specify the model 
to the training set, reset the workflow, and then save these components so that we can use them later.<\/p>\n<p>First, re-specify the model and notice I set the <strong>neighbors<\/strong> argument to <strong>best_neighbors<\/strong>, which we saved above when we tuned the model.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.28.40\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3486\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.28.40\u202fAM-1024x110.png\" alt=\"\" width=\"625\" height=\"67\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.28.40\u202fAM-1024x110.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.28.40\u202fAM-300x32.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.28.40\u202fAM-768x82.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.28.40\u202fAM-624x67.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.28.40\u202fAM.png 1100w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p>We then use this finalized\/re-specified model to reset the workflow. 
Notice it is passed to the <strong>add_model() <\/strong>function; however, the recipe, with our preprocessing steps, does not change.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.28.54\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3487\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.28.54\u202fAM.png\" alt=\"\" width=\"522\" height=\"126\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.28.54\u202fAM.png 696w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.28.54\u202fAM-300x72.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.28.54\u202fAM-624x151.png 624w\" sizes=\"auto, (max-width: 522px) 100vw, 522px\" \/><\/a><\/p>\n<p>With the workflow set up we now need to do the final two steps:<\/p>\n<ol>\n<li>Fit the model to the entire data set and extract the recipe using <strong>extract_recipe()<\/strong> since we will need to save this to preprocess any new data before making predictions.<\/li>\n<li>Fit the model to the final data and extract the model itself, using <strong>extract_fit_parsnip()<\/strong> so we can use the model to make future predictions.<\/li>\n<\/ol>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.31.08\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3488\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.31.08\u202fAM-1024x114.png\" alt=\"\" width=\"625\" height=\"70\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.31.08\u202fAM-1024x114.png 1024w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.31.08\u202fAM-300x33.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.31.08\u202fAM-768x85.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.31.08\u202fAM-624x69.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.31.08\u202fAM.png 1440w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.33.09\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3489\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.33.09\u202fAM.png\" alt=\"\" width=\"482\" height=\"94\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.33.09\u202fAM.png 636w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.33.09\u202fAM-300x58.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.33.09\u202fAM-624x122.png 624w\" sizes=\"auto, (max-width: 482px) 100vw, 482px\" \/><\/a><\/p>\n<p>Now we save the recipe and model as <strong>.rda <\/strong>files. 
We can then load these two structures and use them on new data!<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.35.10\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3490\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.35.10\u202fAM.png\" alt=\"\" width=\"472\" height=\"111\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.35.10\u202fAM.png 836w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.35.10\u202fAM-300x70.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.35.10\u202fAM-768x180.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.35.10\u202fAM-624x146.png 624w\" sizes=\"auto, (max-width: 472px) 100vw, 472px\" \/><\/a><\/p>\n<p>We have 207 players with 0 contract value, of which 195 had no team position. 
So, that leaves us with 12 players that we can estimate contract value for with our saved model.<\/p>\n<p>First, we use the <strong>bake()<\/strong> function to apply the saved recipe to the new data so that we can preprocess everything appropriately.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.29\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3491\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.29\u202fAM.png\" alt=\"\" width=\"856\" height=\"116\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.29\u202fAM.png 856w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.29\u202fAM-300x41.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.29\u202fAM-768x104.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.29\u202fAM-624x85.png 624w\" sizes=\"auto, (max-width: 856px) 100vw, 856px\" \/><\/a><\/p>\n<p>With the new data preprocessed we can now predict the log of contract value with our saved model.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.56\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3492\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.56\u202fAM-1024x429.png\" alt=\"\" width=\"625\" height=\"262\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.56\u202fAM-1024x429.png 1024w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.56\u202fAM-300x126.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.56\u202fAM-768x322.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.56\u202fAM-624x261.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.39.56\u202fAM.png 1198w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><br \/>\nWe can add these predictions back to the new data set, exponentiate the predicted contract value to get it back to the normal scale and then create a visual of our estimates for the 12 players.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.12\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3493\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.12\u202fAM.png\" alt=\"\" width=\"500\" height=\"60\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.12\u202fAM.png 820w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.12\u202fAM-300x36.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.12\u202fAM-768x92.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.12\u202fAM-624x75.png 624w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><\/a> <a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.22\u202fAM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-3494\" 
src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.22\u202fAM-1024x867.png\" alt=\"\" width=\"482\" height=\"408\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.22\u202fAM-1024x867.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.22\u202fAM-300x254.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.22\u202fAM-768x651.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.22\u202fAM-624x529.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/09\/Screenshot-2024-09-30-at-8.42.22\u202fAM.png 1072w\" sizes=\"auto, (max-width: 482px) 100vw, 482px\" \/><\/a><\/p>\n<p><strong>Wrapping Up<\/strong><\/p>\n<p>That&#8217;s a quick walkthrough of how to set up a complete tidymodels workflow. The same approach works for any type of model you want to build, not just KNN! As always, the code and data are available on my <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/github.com\/pw2\/tidymodels_template\/blob\/main\/KNN%20tidymodels%20tutorial.R\">GitHub page<\/a><\/span><\/strong>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When working on data science or research teams, it often helps to have a workflow that makes it easy for teammates to review your work and add additional components to the data cleaning or model structure. Additionally, it ensures that the steps in the process are clear so that debugging is easy. 
In R, the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[47,43],"tags":[],"class_list":["post-3464","post","type-post","status-publish","format-standard","hentry","category-model-building-in-r","category-sports-analytics"],"_links":{"self":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/3464","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/comments?post=3464"}],"version-history":[{"count":1,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/3464\/revisions"}],"predecessor-version":[{"id":3495,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/3464\/revisions\/3495"}],"wp:attachment":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/media?parent=3464"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/categories?post=3464"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/tags?post=3464"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}