{"id":2592,"date":"2022-07-23T19:56:12","date_gmt":"2022-07-23T19:56:12","guid":{"rendered":"http:\/\/optimumsportsperformance.com\/blog\/?p=2592"},"modified":"2022-11-08T03:34:15","modified_gmt":"2022-11-08T03:34:15","slug":"optimization-algorithms-in-r-returning-model-fit-metrics","status":"publish","type":"post","link":"https:\/\/optimumsportsperformance.com\/blog\/optimization-algorithms-in-r-returning-model-fit-metrics\/","title":{"rendered":"Optimization Algorithms in R &#8211; returning model fit metrics"},"content":{"rendered":"<p><span style=\"text-decoration: underline;\"><strong>Introduction<\/strong><\/span><\/p>\n<p>A colleague had asked me if I knew of a way to obtain model fit metrics, such as AIC or r-squared, from the <strong>optim()<\/strong> function. First, <strong>optim()<\/strong> provides a general-purpose method of optimizing an algorithm to identify the best weights for either minimizing or maximizing whatever success metric you are comparing your model to (e.g., sum of squared error, maximum likelihood, etc.). From there, it continues until the model coefficients are optimal for the data.<\/p>\n<p>To make <strong>optim()<\/strong> work for us, we need to code the aspects of the model we are interested in optimizing (e.g., the regression coefficients) as well as code a function that calculates the output we are comparing the results to (e.g., sum of squared error).<\/p>\n<p>Before we get to model fit metrics, let&#8217;s walk through how <strong>optim()<\/strong> works by comparing our results to a simple linear regression. 
I&#8217;ll admit, <strong>optim()<\/strong> can be a little hard to wrap your head around (at least for me, it was), but building up a simple example can help us understand the power of this function and how we can use it later on down the road in more complicated analyses.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Data<\/strong><\/span><\/p>\n<p>We will use data from the <strong>Lahman<\/strong> baseball database. I&#8217;ll stick with all years from 2006 on and retain only players with a minimum of 250 at bats per season.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nlibrary(Lahman)\r\nlibrary(tidyverse)\r\n\r\ndata(Batting)\r\n\r\ndf &lt;- Batting %&gt;% \r\n  filter(yearID &gt;= 2006,\r\n         AB &gt;= 250)\r\n\r\nhead(df)\r\n<\/pre>\n<p>&nbsp;<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Linear Regression<\/strong><\/span><\/p>\n<p>First, let&#8217;s just write a linear regression to predict HR from Hits, so that we have something to compare our <strong>optim()<\/strong> function against.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nfit_lm &lt;- lm(HR ~ H, data = df)\r\nsummary(fit_lm)\r\n\r\nplot(df$H, df$HR, pch = 19, col = &quot;light grey&quot;)\r\nabline(fit_lm, col = &quot;red&quot;, lwd = 2)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.29.01-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2593\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.29.01-PM-959x1024.png\" alt=\"\" width=\"442\" height=\"472\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.29.01-PM-959x1024.png 959w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.29.01-PM-281x300.png 281w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.29.01-PM-768x820.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.29.01-PM-624x666.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.29.01-PM.png 1160w\" sizes=\"auto, (max-width: 442px) 100vw, 442px\" \/><\/a><\/p>\n<p>The above model is a simple linear regression model that fits the line minimizing the sum of squared errors between the predictions and the actual values.<\/p>\n<p>(<strong>NOTE:<\/strong> <em>We can see from the plot that this relationship is not really linear, but it will be okay for our purposes of discussion here.<\/em>)<\/p>\n<p>We can also use an optimizer to solve this.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Optimizer<\/strong><\/span><\/p>\n<p>To write an optimizer, we need two functions:<\/p>\n<ol>\n<li>\u00a0A function that allows us to model the relationship between HR and H and identify the optimal coefficients for the intercept and the slope that will help us minimize the residual between the actual and predicted values.<\/li>\n<li>\u00a0A function that helps us keep track of the error between actual and predicted values as <strong>optim()<\/strong> runs through various iterations, attempting a number of possible coefficients for the slope and intercept.<\/li>\n<\/ol>\n<p><em><strong>Function 1: A linear function to predict HR from hits<\/strong><\/em><\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nhr_from_h &lt;- function(H, a, b){\r\n  return(a + b*H)\r\n}\r\n<\/pre>\n<p>This simple function is just a linear equation and takes the values of our independent variable (H) and a value for the intercept (a) and slope (b). Although we can plug in numbers and use the function right now, the values of a and b have not been optimized yet. 
Thus, the model will return a weird prediction for HR. Still, we can plug in some values to see how it works. For example, what if the person has 30 hits?<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nhr_from_h(H = 30, a = 1, b = 1)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.33.24-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2594\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.33.24-PM.png\" alt=\"\" width=\"411\" height=\"60\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.33.24-PM.png 534w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.33.24-PM-300x44.png 300w\" sizes=\"auto, (max-width: 411px) 100vw, 411px\" \/><\/a><\/p>\n<p>Not a really good prediction of home runs!<\/p>\n<p><em><strong>Function 2: Sum of Squared Error Function<\/strong><\/em><\/p>\n<p>Optimizers will try and identify the key weights in the function by either maximizing or minimizing some value. 
Since we are using a linear model, it makes sense to try and minimize the sum of the squared error.<\/p>\n<p>The function will take 3 inputs:<\/p>\n<ol>\n<li>\u00a0A data set of our dependent and independent variables<\/li>\n<li>\u00a0A value for our intercept (a)<\/li>\n<li>\u00a0A value for the slope of our model (b)<\/li>\n<\/ol>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nsum_sq_error &lt;- function(df, a, b){\r\n  \r\n  # make predictions for HR\r\n  predictions &lt;- with(df, hr_from_h(H, a, b))\r\n  \r\n  # get model errors\r\n  errors &lt;- with(df, HR - predictions)\r\n  \r\n  # return the sum of squared errors\r\n  return(sum(errors^2))\r\n  \r\n}\r\n<\/pre>\n<p>&nbsp;<\/p>\n<p>Let&#8217;s make a fake data set of a few values of H and HR and then assign some weights for <em><strong>a<\/strong><\/em> and <em><strong>b<\/strong><\/em> to see if the <strong>sum_sq_error()<\/strong> produces a single sum of squared error value.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nfake_df &lt;- data.frame(H = c(35, 40, 55), \r\n                        HR = c(4, 10, 12))\r\n\r\nsum_sq_error(fake_df,\r\n             a = 3, \r\n             b = 1)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.36.00-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2595\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.36.00-PM.png\" alt=\"\" width=\"522\" height=\"192\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.36.00-PM.png 668w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.36.00-PM-300x110.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.36.00-PM-624x230.png 624w\" 
sizes=\"auto, (max-width: 522px) 100vw, 522px\" \/><\/a><\/p>\n<p>It worked! Now let&#8217;s write an optimization function to try and find the ideal weights for\u00a0<em><strong>a<\/strong><\/em> and <em><strong>b<\/strong><\/em> that minimize that sum of squared error. One way to do this is to create a large grid of values, write a <strong>for<\/strong> loop and let R plug along, trying each value, and then find the optimal values that minimize the sum of squared error. The issue with this is that the grid grows very quickly as you add independent variables, so the search will take a very long time. A more efficient way is to write an optimizer that can take care of this for us.<\/p>\n<p>We will use the <strong>optim()<\/strong> function from base R.<\/p>\n<p>At a minimum, the <strong>optim()<\/strong> function takes 2 inputs:<\/p>\n<ol>\n<li>\u00a0A numeric vector of starting points for the parameters you are trying to optimize. These can be any values.<\/li>\n<li>\u00a0A function that will receive the vector of starting points. This function will contain <em><strong>all<\/strong><\/em> of the parameters that we want to optimize. 
It wraps our <strong>sum_sq_error()<\/strong> function: <strong>optim()<\/strong> passes it candidate values for <strong>a<\/strong> and <strong>b<\/strong>, beginning with the starting points, and searches for the values that minimize the sum of squared error.<\/li>\n<\/ol>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\noptimizer_results &lt;- optim(par = c(0, 0),\r\n                           fn = function(x){\r\n                             sum_sq_error(df, x&#x5B;1], x&#x5B;2])\r\n                             }\r\n                           )\r\n<\/pre>\n<p>Let&#8217;s have a look at the results.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.39.56-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2596\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.39.56-PM.png\" alt=\"\" width=\"289\" height=\"388\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.39.56-PM.png 422w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.39.56-PM-224x300.png 224w\" sizes=\"auto, (max-width: 289px) 100vw, 289px\" \/><\/a><\/p>\n<p>In the output:<\/p>\n<ul>\n<li><strong>value<\/strong> tells us the sum of the squared error<\/li>\n<li><strong>par<\/strong> tells us the weighting for\u00a0<em><strong>a<\/strong><\/em> (intercept) and\u00a0<em><strong>b<\/strong><\/em> (slope)<\/li>\n<li><strong>counts<\/strong> tells us how many times the optimizer ran the function. 
The <strong>gradient<\/strong> is NA here because we didn&#8217;t supply a gradient function (the <strong>gr<\/strong> argument) to our optimizer<\/li>\n<li><strong>convergence<\/strong> tells us whether the optimizer finished successfully (a value of 0 means it converged)<\/li>\n<li><strong>message<\/strong> is any message that R needs to inform us about when running the optimizer<\/li>\n<\/ul>\n<p>Let&#8217;s focus on <strong>par<\/strong> and <strong>value<\/strong> since those are the two values we really want to know about.<\/p>\n<p>First, notice how the values for\u00a0<em><strong>a<\/strong><\/em> and\u00a0<em><strong>b<\/strong><\/em> are nearly the exact same values we got from our linear regression.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\na_optim_weight &lt;- optimizer_results$par&#x5B;1]\r\nb_optim_weight &lt;- optimizer_results$par&#x5B;2]\r\n\r\na_reg_coef &lt;- fit_lm$coef&#x5B;1]\r\nb_reg_coef &lt;- fit_lm$coef&#x5B;2]\r\n\r\na_optim_weight\r\na_reg_coef\r\n\r\nb_optim_weight\r\nb_reg_coef\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.43.01-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2597\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.43.01-PM.png\" alt=\"\" width=\"391\" height=\"347\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.43.01-PM.png 668w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.43.01-PM-300x266.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.43.01-PM-624x553.png 624w\" sizes=\"auto, (max-width: 391px) 100vw, 391px\" \/><\/a><\/p>\n<p>Next, we can see that the sum of squared error from the optimizer is the same as the sum of squared error from the 
linear regression.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nsse_optim &lt;- optimizer_results$value\r\nsse_reg &lt;- sum(fit_lm$residuals^2)\r\n\r\nsse_optim\r\nsse_reg\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.44.04-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2598\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.44.04-PM.png\" alt=\"\" width=\"373\" height=\"161\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.44.04-PM.png 570w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.44.04-PM-300x129.png 300w\" sizes=\"auto, (max-width: 373px) 100vw, 373px\" \/><\/a><\/p>\n<p>We can finish by plotting the two regression lines over the data and show that they produce the same fit.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nplot(df$H, \r\n     df$HR,\r\n     col = &quot;light grey&quot;,\r\n     pch = 19,\r\n     xlab = &quot;Hits&quot;,\r\n     ylab = &quot;Home Runs&quot;)\r\ntitle(main = &quot;Predicting Home Runs from Hits&quot;,\r\n     sub = &quot;Red Line = Regression | Black Dashed Line = Optimizer&quot;)\r\nabline(fit_lm, col = &quot;red&quot;, lwd = 5)\r\nabline(a = a_optim_weight,\r\n       b = b_optim_weight,\r\n       lty = 2,\r\n       lwd = 3,\r\n       col = &quot;black&quot;)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.45.06-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2599\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.45.06-PM-891x1024.png\" alt=\"\" width=\"507\" height=\"582\" 
srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.45.06-PM-891x1024.png 891w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.45.06-PM-261x300.png 261w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.45.06-PM-768x883.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.45.06-PM-624x717.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.45.06-PM.png 1180w\" sizes=\"auto, (max-width: 507px) 100vw, 507px\" \/><\/a><\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Model Fit Metrics<\/strong><\/span><\/p>\n<p>The <strong>value<\/strong> parameter from our optimizer output returned the sum of squared error. What if we wanted to return a model fit metric, such as AIC or r-squared, so that we can compare several models later on?<\/p>\n<p>Instead of using the sum of squared errors function, we can attempt to minimize AIC by writing our own AIC function.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\naic_func &lt;- function(df, a, b){\r\n  \r\n  # make predictions for HR\r\n  predictions &lt;- with(df, hr_from_h(H, a, b))\r\n  \r\n  # get model errors\r\n  errors &lt;- with(df, HR - predictions)\r\n  \r\n  # calculate AIC\r\n  aic &lt;- nrow(df)*(log(2*pi)+1+log((sum(errors^2)\/nrow(df)))) + ((length(c(a, b))+1)*2)\r\n  return(aic)\r\n  \r\n}\r\n<\/pre>\n<p>We can try out the new function on the fake data set we created above.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\naic_func(fake_df,\r\n             a = 3, \r\n             b = 1)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.47.33-PM.png\"><img loading=\"lazy\" decoding=\"async\" 
class=\"aligncenter wp-image-2600\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.47.33-PM.png\" alt=\"\" width=\"265\" height=\"117\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.47.33-PM.png 322w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.47.33-PM-300x132.png 300w\" sizes=\"auto, (max-width: 265px) 100vw, 265px\" \/><\/a><\/p>\n<p>Now, let&#8217;s run the optimizer with the AIC function instead of the sum of square error function.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\noptimizer_results &lt;- optim(par = c(0, 0),\r\n                           fn = function(x){\r\n                             aic_func(df, x&#x5B;1], x&#x5B;2])\r\n                             }\r\n                           )\r\n\r\noptimizer_results\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.48.26-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2601\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.48.26-PM.png\" alt=\"\" width=\"418\" height=\"389\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.48.26-PM.png 824w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.48.26-PM-300x279.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.48.26-PM-768x714.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.48.26-PM-624x580.png 624w\" sizes=\"auto, (max-width: 418px) 100vw, 418px\" \/><\/a><\/p>\n<p>We get the same coefficients with the difference being 
that the <strong>value<\/strong> parameter now returns AIC. We can check that this AIC matches the AIC of our original linear model.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nAIC(fit_lm)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.49.43-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2602\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.49.43-PM.png\" alt=\"\" width=\"175\" height=\"66\" \/><\/a><\/p>\n<p>If we instead wanted to obtain an r-squared, we can write an r-squared function. In the optimizer, since a higher r-squared is better, we need to indicate that we want to maximize this value (<strong>optim<\/strong> defaults to minimization). To do this we set the <strong>fnscale<\/strong> argument to -1. The only issue I have with this function is that it doesn&#8217;t return the coefficients properly in the <strong>par<\/strong> section of the results. The likely reason is that the squared correlation between the predictions and the actual values does not change when the predictions are shifted or rescaled, so every intercept paired with any nonzero slope produces the same r-squared and the optimizer has no unique best <em><strong>a<\/strong><\/em> and <em><strong>b<\/strong><\/em> to find. If anyone has other ideas, please reach out. 
I am able to produce the exact result of r-squared from the linear model, however.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n## r-squared function\r\nr_sq_func &lt;- function(df, a, b){\r\n  \r\n  # make predictions for HR\r\n  predictions &lt;- with(df, hr_from_h(H, a, b))\r\n  \r\n  # r-squared between predicted and actual HR values\r\n  r2 &lt;- cor(predictions, df$HR)^2\r\n  return(r2)\r\n  \r\n}\r\n\r\n## run optimizer\r\noptimizer_results &lt;- optim(par = c(0.1, 0.1),\r\n                           fn = function(x){\r\n                             r_sq_func(df, x&#x5B;1], x&#x5B;2])\r\n                             },\r\n      control = list(fnscale = -1)\r\n )\r\n\r\n## get results\r\noptimizer_results\r\n\r\n## Compare results to what was obtained from our linear regression\r\nsummary(fit_lm)$r.squared\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.51.42-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-2603\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.51.42-PM-996x1024.png\" alt=\"\" width=\"534\" height=\"549\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.51.42-PM-996x1024.png 996w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.51.42-PM-292x300.png 292w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.51.42-PM-768x789.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.51.42-PM-624x641.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/07\/Screen-Shot-2022-07-23-at-12.51.42-PM.png 1004w\" sizes=\"auto, (max-width: 534px) 100vw, 534px\" \/><\/a><\/p>\n<p><span 
style=\"text-decoration: underline;\"><strong>Wrapping up<\/strong><\/span><\/p>\n<p>That was a short tutorial on writing an optimizer in R. There is a lot going on with these types of functions and they can get pretty complicated very quickly. I find that starting with a simple example and building from there is always useful. We additionally looked at having the optimizer return us various model fit metrics.<\/p>\n<p>If you notice any errors in the code, please reach out!<\/p>\n<p>Access to the full code is available on my <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/github.com\/pw2\/R-Tips-Tricks\/blob\/master\/Optimizer%20in%20R%20-%20Returning%20model%20fit%20metrics.Rmd\">GitHub page<\/a><\/span><\/strong>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction A colleague had asked me if I knew of a way to obtain model fit metrics, such as AIC or r-squared, from the optim() function. First, optim() provides a general-purpose method of optimizing an algorithm to identify the best weights for either minimizing or maximizing whatever success metric you are comparing your model to 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[47,45,43],"tags":[],"class_list":["post-2592","post","type-post","status-publish","format-standard","hentry","category-model-building-in-r","category-r-tips-tricks","category-sports-analytics"],"_links":{"self":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/2592","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/comments?post=2592"}],"version-history":[{"count":1,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/2592\/revisions"}],"predecessor-version":[{"id":2604,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/2592\/revisions\/2604"}],"wp:attachment":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/media?parent=2592"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/categories?post=2592"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/tags?post=2592"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}