{"id":3382,"date":"2024-03-24T23:03:30","date_gmt":"2024-03-24T23:03:30","guid":{"rendered":"http:\/\/optimumsportsperformance.com\/blog\/?p=3382"},"modified":"2024-03-25T12:25:15","modified_gmt":"2024-03-25T12:25:15","slug":"frequentist-bayesian-approaches-to-regression-models-by-hand-compare-and-contrast","status":"publish","type":"post","link":"https:\/\/optimumsportsperformance.com\/blog\/frequentist-bayesian-approaches-to-regression-models-by-hand-compare-and-contrast\/","title":{"rendered":"Frequentist &#038; Bayesian Approaches to Regression Models by Hand &#8211; Compare and Contrast"},"content":{"rendered":"<p>One of the ways I try to learn things is to code them from first principles. It helps me see what is going on under the hood and also allows me to wrap my head around how things work. Building regression models in R is incredibly easy using the <strong>lm()<\/strong> function and Bayesian regression models can be conveniently built with the same syntax in packages like {<strong>rstanarm<\/strong>} and {<strong>brms<\/strong>}. However, today I&#8217;m going to do everything by hand to get a grasp of how the Bayesian regression model works, at a very basic level. 
I put together the code using the mathematical presentation of these concepts from William Bolstad&#8217;s book, <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/www.amazon.com\/Introduction-Bayesian-Statistics-William-Bolstad\/dp\/1118091566\">Introduction to Bayesian Statistics<\/a><\/span><\/strong>.<\/p>\n<p>The entire script is available on my <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/github.com\/pw2\/frequentist_bayes_regression_by_hand\">GITHUB page<\/a><\/span><\/strong>.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Loading Packages &amp; Getting Data<\/strong><\/span><\/p>\n<p>I&#8217;m going to use the data from the {<strong>palmerpenguins<\/strong>} package and concentrate on a simple linear regression which will estimate flipper length from bill length.<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\n### packages ------------------------------------------------------\r\nlibrary(tidyverse)\r\nlibrary(palmerpenguins)\r\nlibrary(patchwork)\r\n\r\ntheme_set(theme_classic())\r\n\r\n### data ----------------------------------------------------------\r\ndata(&quot;penguins&quot;)\r\ndat &lt;- penguins %&gt;%\r\n  select(bill_length = bill_length_mm, flipper_length = flipper_length_mm) %&gt;%\r\n  na.omit()\r\n\r\nhead(dat)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.36.29\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3383\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.36.29\u202fPM.png\" alt=\"\" width=\"365\" height=\"295\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.36.29\u202fPM.png 488w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.36.29\u202fPM-300x242.png 300w\" sizes=\"auto, (max-width: 365px) 100vw, 365px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>EDA<\/strong><\/span><\/p>\n<p>These two variables share a relatively large correlation with each other.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.37.08\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3384\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.37.08\u202fPM.png\" alt=\"\" width=\"356\" height=\"315\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.37.08\u202fPM.png 880w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.37.08\u202fPM-300x265.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.37.08\u202fPM-768x679.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.37.08\u202fPM-624x552.png 624w\" sizes=\"auto, (max-width: 356px) 100vw, 356px\" \/><\/a> <a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.37.14\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3385\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.37.14\u202fPM.png\" alt=\"\" width=\"440\" height=\"221\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.37.14\u202fPM.png 944w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.37.14\u202fPM-300x151.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.37.14\u202fPM-768x386.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.37.14\u202fPM-624x313.png 624w\" sizes=\"auto, (max-width: 440px) 100vw, 440px\" \/><\/a><\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Ordinary Least Squares Regression<\/strong><\/span><\/p>\n<p>First, I&#8217;ll start by building a regression model under the Frequentist paradigm, specifying no prior values on the parameters (even though in reality the OLS model is using a flat prior, where all values are equally plausible &#8212; a discussion for a different time), using the <strong>lm()<\/strong> function.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n### Ordinary Least Squares Regression ------------------------------\r\nfit_ols &lt;- lm(flipper_length ~ I(bill_length - mean(dat$bill_length)), data = dat)\r\nsummary(fit_ols)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.40.10\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3386\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.40.10\u202fPM-1024x605.png\" alt=\"\" width=\"425\" height=\"251\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.40.10\u202fPM-1024x605.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.40.10\u202fPM-300x177.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.40.10\u202fPM-768x454.png 768w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.40.10\u202fPM-624x369.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.40.10\u202fPM.png 1316w\" sizes=\"auto, (max-width: 425px) 100vw, 425px\" \/><\/a><\/p>\n<p><em><strong>Technical Note: <\/strong>Since we don&#8217;t observe a bill length of 0 mm in penguins, I chose to transform the bill length data by grand mean centering it, that is, subtracting the mean bill length from each observation. This won&#8217;t change the predictions from the model or the slope, but it does change how we interpret the intercept, which is now the expected flipper length at the average bill length (it wouldn&#8217;t be directly interpretable if we left the bill length data in its untransformed state).<\/em><\/p>\n<p><span style=\"text-decoration: underline;\"><strong>OLS Regression by Hand<\/strong><\/span><\/p>\n<p>Okay, that was easy. In a single line we constructed a model and we are up and running. 
But, let&#8217;s calculate this by hand so we can see where the coefficients and their corresponding standard errors came from.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n# Calculate the least squares regression line by hand now\r\ndat_stats &lt;- dat %&gt;%\r\n  summarize(x_bar = mean(bill_length),\r\n            x_sd = sd(bill_length),\r\n            y_bar = mean(flipper_length),\r\n            y_sd = sd(flipper_length),\r\n            r = cor(bill_length, flipper_length),\r\n            x2 = x_bar^2,\r\n            y2 = y_bar^2,\r\n            xy_bar = x_bar * y_bar,\r\n            .groups = &quot;drop&quot;)\r\n\r\ndat_stats\r\n<\/pre>\n<p>In the above code, I&#8217;m calculating summary statistics (mean and SD) for both the independent and dependent variables, extracting their correlation coefficient, and getting their squared means and the product of their means. Here is what the output looks like:<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.52.34\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3387\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.52.34\u202fPM.png\" alt=\"\" width=\"468\" height=\"111\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.52.34\u202fPM.png 842w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.52.34\u202fPM-300x71.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.52.34\u202fPM-768x182.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.52.34\u202fPM-624x148.png 624w\" sizes=\"auto, (max-width: 468px) 100vw, 468px\" 
\/><\/a><\/p>\n<p>Because bill length is centered on its mean, the model intercept is simply the mean of our outcome variable (flipper length).<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nintercept &lt;- dat_stats$y_bar\r\n<\/pre>\n<p>The coefficient for our independent variable is calculated as the correlation coefficient multiplied by the SD of the y variable divided by the SD of the x variable.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nbeta &lt;- with(dat_stats,\r\n             r * (y_sd \/ x_sd))\r\n\r\nbeta\r\n<\/pre>\n<p>Finally, I&#8217;ll store the grand mean of bill length so that we can use it when we need to center bill length and make predictions.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nx_bar &lt;- dat_stats$x_bar\r\nx_bar\r\n<\/pre>\n<p>Now that we have the intercept and beta coefficient for bill length calculated by hand we can construct the regression equation:<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.55.46\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3388\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.55.46\u202fPM.png\" alt=\"\" width=\"1000\" height=\"96\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.55.46\u202fPM.png 1000w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.55.46\u202fPM-300x29.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.55.46\u202fPM-768x74.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.55.46\u202fPM-624x60.png 624w\" sizes=\"auto, (max-width: 1000px) 100vw, 1000px\" \/><\/a><\/p>\n<p>Notice that these are the exact values we obtained from our model 
using the <strong>lm()<\/strong> function.<\/p>\n<p>We can use the two models to make predictions and show that they are exactly the same.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n## Make predictions with the two models\r\ndat %&gt;%\r\n  mutate(pred_model = predict(fit_ols),\r\n         pred_hand = intercept + beta * (bill_length - x_bar))\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.57.49\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3389\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.57.49\u202fPM.png\" alt=\"\" width=\"510\" height=\"300\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.57.49\u202fPM.png 842w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.57.49\u202fPM-300x177.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.57.49\u202fPM-768x452.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-2.57.49\u202fPM-624x368.png 624w\" sizes=\"auto, (max-width: 510px) 100vw, 510px\" \/><\/a><\/p>\n<p>In the model estimated from the <strong>lm()<\/strong> function we also get a sigma parameter (residual standard error, RSE), which tells us, on average, how far off the model predictions are. 
We can build this by hand by first calculating the squared difference between the observed values and our predictions, and then calculating the RSE as the square root of the sum of squared residuals divided by the residual degrees of freedom (the number of observations minus the two estimated parameters).<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n# Calculate the estimated variance around the line\r\nN_obs &lt;- nrow(dat)\r\n\r\ndat %&gt;%\r\n  mutate(pred = intercept + beta * (bill_length - x_bar),\r\n         resid = (flipper_length - pred),\r\n         resid2 = resid^2) %&gt;%\r\n  summarize(n_model_params = 2,\r\n            deg_freedom = N_obs - n_model_params,\r\n            model_var = sum(resid2) \/ deg_freedom,\r\n            model_sd = sqrt(model_var))\r\n<\/pre>\n<p>Again, our by-hand calculation exactly matches what we obtained from the <strong>lm()<\/strong> function, 10.6.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.01.16\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3390\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.01.16\u202fPM.png\" alt=\"\" width=\"486\" height=\"98\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.01.16\u202fPM.png 774w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.01.16\u202fPM-300x60.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.01.16\u202fPM-768x155.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.01.16\u202fPM-624x126.png 624w\" sizes=\"auto, (max-width: 486px) 100vw, 486px\" \/><\/a> <a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.01.25\u202fPM.png\"><img loading=\"lazy\" 
decoding=\"async\" class=\"aligncenter wp-image-3391\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.01.25\u202fPM.png\" alt=\"\" width=\"481\" height=\"67\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.01.25\u202fPM.png 896w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.01.25\u202fPM-300x42.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.01.25\u202fPM-768x106.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.01.25\u202fPM-624x86.png 624w\" sizes=\"auto, (max-width: 481px) 100vw, 481px\" \/><\/a><\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Bayesian Linear Regression by Hand<\/strong><\/span><\/p>\n<p>Now let&#8217;s turn our attention to building a Bayesian regression model.<\/p>\n<p>First we need to calculate the sum of squares for X (the sum of squared deviations of bill length from its mean), which we do with this equation:<\/p>\n<blockquote><p><strong>ss_x = N * (mean(x^2) - mean(x)^2)<\/strong><\/p><\/blockquote>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nN &lt;- nrow(dat)\r\nmean_x2 &lt;- mean(dat$bill_length^2)\r\nmu_x2 &lt;- mean(dat$bill_length)^2\r\n\r\nN\r\nmean_x2\r\nmu_x2\r\n\r\nss_x &lt;- N * (mean_x2 - mu_x2)\r\nss_x\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.08.32\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3393\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.08.32\u202fPM.png\" alt=\"\" width=\"368\" height=\"254\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.08.32\u202fPM.png 554w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.08.32\u202fPM-300x207.png 300w\" sizes=\"auto, (max-width: 368px) 100vw, 368px\" \/><\/a><\/p>\n<p>Next, we need to specify some priors on the model parameters. The priors can come from a number of sources. We could set them subjectively (usually people hate this idea). Or, we could look in the scientific literature and use prior research as a guide for plausible values of the relationship between flipper length and bill length. In this case, I&#8217;m just going to specify some priors that are within the range of the data but have enough variance around them to <em>&#8220;let the data speak&#8221;.<\/em> (<strong>NOTE:<\/strong> you could rerun the entire analysis with different priors and smaller or larger variances to see how the model changes. I hope to do a longer blog post about priors in the future).<\/p>\n<ul>\n<li>For the slope coefficient we decide to have a normal prior, N(1, 1^2)<\/li>\n<li>For the intercept coefficient we choose a normal prior, N(180, 10^2)<\/li>\n<li>We don&#8217;t know the true variance so we use the estimated variance from the least squares regression line<\/li>\n<\/ul>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nprior_model_var &lt;- sigma(fit_ols)^2\r\n\r\nprior_slope_mu &lt;- 1\r\nprior_slope_var &lt;- 1^2\r\n\r\nprior_intercept_mu &lt;- 180\r\nprior_intercept_var &lt;- 10^2\r\n<\/pre>\n<p>Next, we calculate the posterior precision (1\/variance) for the slope coefficient. 
We do it with this equation:<\/p>\n<blockquote><p><strong>1\/prior_slope_var + (ss_x \/ prior_model_var)<\/strong><\/p><\/blockquote>\n<p>And then convert it to a standard deviation (which is more useful to us and easier to interpret than precision, since it is on the scale of the data).<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n# 1\/prior_slope_var + (ss_x \/ prior_model_var)\r\nposterior_slope_precision &lt;- 1 \/ prior_slope_var + ss_x \/ prior_model_var\r\n\r\n# Convert to SD\r\nposterior_slope_sd &lt;- posterior_slope_precision^-(1\/2)\r\n<\/pre>\n<p>Once we have the precision for the posterior slope calculated we can calculate the posterior regression coefficient for the slope, which is a precision-weighted average of the prior mean and the OLS slope, using this equation:<\/p>\n<blockquote><p><strong>(1\/prior_slope_var) \/ posterior_slope_precision * prior_slope_mu + (ss_x \/ prior_model_var) \/ posterior_slope_precision * beta<\/strong><\/p><\/blockquote>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n# posterior slope\r\nposterior_slope_mu &lt;- (1\/prior_slope_var) \/ posterior_slope_precision * prior_slope_mu + (ss_x \/ prior_model_var) \/ posterior_slope_precision * beta\r\nposterior_slope_mu\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.18.44\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3394\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.18.44\u202fPM.png\" alt=\"\" width=\"271\" height=\"63\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.18.44\u202fPM.png 344w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.18.44\u202fPM-300x70.png 300w\" sizes=\"auto, (max-width: 271px) 100vw, 271px\" \/><\/a><\/p>\n<p>We can plot the prior and posterior slope values, by first simulating the posterior slope 
using its mu and SD values.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n## Plot prior and posterior for the slope\r\nset.seed(4)\r\nprior_sim &lt;- rnorm(n = 1e4, mean = prior_slope_mu, sd = sqrt(prior_slope_var))\r\nposterior_sim &lt;- rnorm(n = 1e4, mean = posterior_slope_mu, sd = posterior_slope_sd)\r\n\r\nplot(density(posterior_sim),\r\n     col = &quot;blue&quot;,\r\n     lwd = 4,\r\n     xlim = c(-2, 3),\r\n     main = &quot;Prior &amp; Posterior\\nfor\\nBayesian Regression Slope Coefficient&quot;)\r\nlines(density(prior_sim),\r\n      col = &quot;red&quot;,\r\n      lty = 2,\r\n      lwd = 4)\r\nlegend(&quot;topleft&quot;,\r\n       legend = c(&quot;Prior&quot;, &quot;Posterior&quot;),\r\n       col = c(&quot;red&quot;, &quot;blue&quot;),\r\n       lty = c(2, 1),\r\n       lwd = c(2,2))\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.19.33\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3395\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.19.33\u202fPM-1024x693.png\" alt=\"\" width=\"625\" height=\"423\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.19.33\u202fPM-1024x693.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.19.33\u202fPM-300x203.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.19.33\u202fPM-768x520.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.19.33\u202fPM-624x422.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.19.33\u202fPM.png 1144w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p>We 
notice that the prior for the slope, <em><strong>N(1, 1^2)<\/strong><\/em>, had a rather broad range of plausible values. However, after observing some data and combining that observed data with our prior, we find the posterior to have settled into a much narrower, reasonable range. Again, if you had selected other priors or perhaps a prior with a much narrower variance, these results would be different and potentially more influenced by your prior.<\/p>\n<p>Now that we have a slope parameter we need to calculate the intercept. The equations are commented out in the code below.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n## Posterior precision for the intercept\r\n# 1\/prior_intercept_var + N\/prior_model_var\r\n\r\nposterior_intercept_precision &lt;- 1\/prior_intercept_var + N\/prior_model_var\r\nposterior_intercept_precision\r\n\r\n# Convert to SD\r\nposterior_intercept_sd &lt;- posterior_intercept_precision^-(1\/2)\r\nposterior_intercept_sd\r\n\r\n## Posterior intercept mean\r\n# (1\/prior_intercept_var) \/ posterior_intercept_precision * prior_intercept_mu + (N\/prior_model_var) \/ posterior_intercept_precision * intercept\r\nposterior_intercept_mu &lt;- (1\/prior_intercept_var) \/ posterior_intercept_precision * prior_intercept_mu + (N\/prior_model_var) \/ posterior_intercept_precision * intercept\r\nposterior_intercept_mu\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.26.47\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3396\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.26.47\u202fPM.png\" alt=\"\" width=\"415\" height=\"364\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.26.47\u202fPM.png 842w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.26.47\u202fPM-300x263.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.26.47\u202fPM-768x673.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.26.47\u202fPM-624x547.png 624w\" sizes=\"auto, (max-width: 415px) 100vw, 415px\" \/><\/a><\/p>\n<p>Comparing our Bayesian regression coefficients and standard errors to the ones we obtained from the <strong>lm()<\/strong> function, we find that they are nearly identical.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.29.49\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3397\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.29.49\u202fPM.png\" alt=\"\" width=\"491\" height=\"427\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.29.49\u202fPM.png 898w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.29.49\u202fPM-300x261.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.29.49\u202fPM-768x667.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.29.49\u202fPM-624x542.png 624w\" sizes=\"auto, (max-width: 491px) 100vw, 491px\" \/><\/a><\/p>\n<p>Using the coefficients and their corresponding standard errors we can also compare the 95% Credible Interval from the Bayesian model to the 95% Confidence Interval from the OLS model (again, nearly identical).<\/p>\n<p><a 
href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.31.16\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3398\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.31.16\u202fPM-1024x422.png\" alt=\"\" width=\"489\" height=\"202\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.31.16\u202fPM-1024x422.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.31.16\u202fPM-300x124.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.31.16\u202fPM-768x316.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.31.16\u202fPM-624x257.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.31.16\u202fPM.png 1044w\" sizes=\"auto, (max-width: 489px) 100vw, 489px\" \/><\/a><\/p>\n<p>And writing out the Bayesian model we see that it is the same as the Frequentist model.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.32.34\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3399\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.32.34\u202fPM-1024x89.png\" alt=\"\" width=\"625\" height=\"54\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.32.34\u202fPM-1024x89.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.32.34\u202fPM-300x26.png 300w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.32.34\u202fPM-768x67.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.32.34\u202fPM-624x54.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.32.34\u202fPM.png 1606w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p>Using the posterior means and SDs for the slope and intercept we can simulate their distributions, then plot and summarize them using quantile intervals and credible intervals, to get a picture of our uncertainty.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n## Use simulation for the slope and intercept\r\nset.seed(413)\r\n# The intercept draws use the intercept's own posterior SD\r\nposterior_intercept_sim &lt;- rnorm(n = 1e4, mean = posterior_intercept_mu, sd = posterior_intercept_sd)\r\nposterior_slope_sim &lt;- rnorm(n = 1e4, mean = posterior_slope_mu, sd = posterior_slope_sd)\r\n\r\npar(mfrow = c(1, 2))\r\nhist(posterior_intercept_sim)\r\nhist(posterior_slope_sim)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.34.07\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3400\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.34.07\u202fPM-1024x538.png\" alt=\"\" width=\"625\" height=\"328\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.34.07\u202fPM-1024x538.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.34.07\u202fPM-300x158.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.34.07\u202fPM-768x404.png 768w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.34.07\u202fPM-624x328.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.34.07\u202fPM.png 1690w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a> <a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.35.01\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3401\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.35.01\u202fPM-1024x249.png\" alt=\"\" width=\"625\" height=\"152\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.35.01\u202fPM-1024x249.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.35.01\u202fPM-300x73.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.35.01\u202fPM-768x187.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.35.01\u202fPM-624x152.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.35.01\u202fPM.png 1596w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Making Predictions<\/strong><\/span><\/p>\n<p>Now that we have our models specified it&#8217;s time to make some predictions!<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n## Add all predictions to the original data set\r\ndat_final &lt;- dat %&gt;%\r\n  mutate(pred_ols = predict(fit_ols),\r\n         pred_ols_by_hand = intercept + beta * (bill_length - x_bar),\r\n         pred_bayes_by_hand = posterior_intercept_mu + posterior_slope_mu * 
(bill_length - x_bar))\r\n\r\nhead(dat_final, 10)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.36.30\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3402\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.36.30\u202fPM-1024x459.png\" alt=\"\" width=\"556\" height=\"249\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.36.30\u202fPM-1024x459.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.36.30\u202fPM-300x134.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.36.30\u202fPM-768x344.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.36.30\u202fPM-624x280.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.36.30\u202fPM.png 1196w\" sizes=\"auto, (max-width: 556px) 100vw, 556px\" \/><\/a><\/p>\n<p>Because the equations were pretty much identical we can see that all three models (OLS with <strong>lm()<\/strong>, OLS by hand, and Bayes by hand) produce the same predicted values for flipper length.<\/p>\n<p>We can also visualize these predictions along with the prediction intervals.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n## Predictions with uncertainty from the OLS model\r\nols_preds &lt;- dat_final %&gt;%\r\n  bind_cols(predict(fit_ols, newdata = ., interval = &quot;prediction&quot;)) %&gt;%\r\n  ggplot(aes(x = fit, y = flipper_length)) +\r\n  geom_point(size = 2) +\r\n  geom_ribbon(aes(ymin = lwr, ymax = upr),\r\n              alpha = 0.4) +\r\n  geom_smooth(method = &quot;lm&quot;,\r\n              se = FALSE) +\r\n  ggtitle(&quot;OLS Predictions 
with 95% Prediction Intervals&quot;)\r\n\r\n## Now predict with the Bayesian Model\r\n# The predictive variance is calculated as:\r\n# error_var = prior_model_var + posterior_intercept_sd^2 + posterior_slope_sd^2 * (x - x_bar)^2\r\nbayes_preds &lt;- dat %&gt;%\r\n  mutate(fit = posterior_intercept_mu + posterior_slope_mu * (bill_length-x_bar),\r\n         error_var = prior_model_var + posterior_intercept_sd^2 + posterior_slope_sd^2 * (bill_length - x_bar)^2,\r\n         rse = sqrt(error_var),\r\n         lwr = fit - qt(p = 0.975, df = N - 2) * rse,\r\n         upr = fit + qt(p = 0.975, df = N - 2) * rse) %&gt;%\r\n  ggplot(aes(x = fit, y = flipper_length)) +\r\n  geom_point(size = 2) +\r\n  geom_ribbon(aes(ymin = lwr, ymax = upr),\r\n              alpha = 0.4) +\r\n  geom_smooth(method = &quot;lm&quot;,\r\n              se = FALSE) +\r\n  ggtitle(&quot;Bayesian Predictions with 95% Prediction Intervals&quot;)\r\n\r\n\r\nols_preds | bayes_preds\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.38.37\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3404\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.38.37\u202fPM-1024x626.png\" alt=\"\" width=\"625\" height=\"382\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.38.37\u202fPM-1024x626.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.38.37\u202fPM-300x183.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.38.37\u202fPM-768x469.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.38.37\u202fPM-624x381.png 624w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p>Okay, I know what 
you are thinking. The equations were the same, the predictions are the same, and the prediction intervals are the same&#8230;.<strong>What&#8217;s the big deal?!<\/strong><\/p>\n<p>The big deal is that the Bayesian model gives us more flexibility. Not only can we specify priors, which lets us place more weight on plausible values (based on prior data, prior research, domain expertise, etc.), but we can also produce an entire posterior distribution for the predictions.<\/p>\n<p>Let&#8217;s take a single observation from the data and make a prediction on it, complete with 95% credible intervals, using our Bayesian model.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nsingle_obs &lt;- dat %&gt;% slice(36)\r\nsingle_obs\r\n\r\npred_bayes &lt;- single_obs %&gt;%\r\n  mutate(fit = posterior_intercept_mu + posterior_slope_mu * (bill_length-x_bar),\r\n         error_var = prior_model_var + posterior_intercept_sd^2 + posterior_slope_sd^2 * (bill_length - x_bar)^2,\r\n         rse = sqrt(error_var),\r\n         lwr = fit - qt(p = 0.975, df = N - 2) * rse,\r\n         upr = fit + qt(p = 0.975, df = N - 2) * rse) \r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.43.26\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3405\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.43.26\u202fPM.png\" alt=\"\" width=\"1006\" height=\"190\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.43.26\u202fPM.png 1006w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.43.26\u202fPM-300x57.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.43.26\u202fPM-768x145.png 768w, 
https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.43.26\u202fPM-624x118.png 624w\" sizes=\"auto, (max-width: 1006px) 100vw, 1006px\" \/><\/a><\/p>\n<p>Next, we simulate 10,000 observations using the mean prediction (fit) and the rse from the table above and summarize the results.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nset.seed(582)\r\npred_bayes_sim &lt;- rnorm(n = 1e4, mean = pred_bayes$fit, sd = pred_bayes$rse)\r\n\r\nmean(pred_bayes_sim)\r\nquantile(pred_bayes_sim, probs = c(0.5, 0.025, 0.975))\r\nmean(pred_bayes_sim) + qt(p = c(0.025, 0.975), df = N - 2) * sd(pred_bayes_sim)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.44.36\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3406\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.44.36\u202fPM-1024x303.png\" alt=\"\" width=\"625\" height=\"185\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.44.36\u202fPM-1024x303.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.44.36\u202fPM-300x89.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.44.36\u202fPM-768x227.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.44.36\u202fPM-624x184.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.44.36\u202fPM.png 1320w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p>Our mean value, the 95% quantile interval, and the 95% credible interval are similar to the table above (they should be, since we simulated with those 
parameters). We will also get similar values if we use our OLS model and make a prediction on this observation with prediction intervals.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.46.41\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-3407\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.46.41\u202fPM-1024x117.png\" alt=\"\" width=\"625\" height=\"71\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.46.41\u202fPM-1024x117.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.46.41\u202fPM-300x34.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.46.41\u202fPM-768x88.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.46.41\u202fPM-624x71.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.46.41\u202fPM.png 1048w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n<p>But, because we simulated a full distribution, from the Bayesian prediction we also get 10,000 plausible values for the estimate of flipper length based on the observed bill length.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.48.09\u202fPM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3408\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.48.09\u202fPM-1024x772.png\" alt=\"\" width=\"426\" height=\"321\" 
srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.48.09\u202fPM-1024x772.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.48.09\u202fPM-300x226.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.48.09\u202fPM-768x579.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.48.09\u202fPM-624x471.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2024\/03\/Screenshot-2024-03-24-at-3.48.09\u202fPM.png 1334w\" sizes=\"auto, (max-width: 426px) 100vw, 426px\" \/><\/a><\/p>\n<p>This gives us the flexibility to make more direct statements about the predictive distribution. For example, we might ask, <em>&#8220;What is the probability that the predicted flipper length is greater than 210 mm?&#8221;<\/em> and we could compute this directly from the simulated data! This gives us a number of options if we are using the model to make decisions and have threshold values or ranges of critical importance: we can compute how much of the density of the predicted distribution falls below or above a threshold, or outside a region of importance.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Wrapping Up<\/strong><\/span><\/p>\n<p>That&#8217;s a little bit on working with regression models by hand, from first principles. The Bayesian approach affords us a lot of control and flexibility over the model, via priors (more on that in a future blog post) and the predictions (via simulation). The Bayesian regression model calculated here was a simple approach and did not require Markov Chain Monte Carlo (MCMC), which would be run when using {<strong>rstanarm<\/strong>} and {<strong>brms<\/strong>}. 
If you are curious how MCMC works, I wrote a previous blog post on using a Gibbs sampler to estimate the posterior distributions for a regression model, <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/optimumsportsperformance.com\/blog\/bayesian-simple-linear-regression-by-hand-gibbs-sampler\/\">CLICK HERE<\/a><\/span><\/strong>. But even with this simple approach, we are able to specify priors on the model parameters of the deterministic\/fixed component of the model. We did not specify a prior on the stochastic\/error part of the model (we used the sigma value calculated from the OLS model, which we also calculated by hand). This was necessary in this simple conjugate approach because the normal distribution is a two-parameter distribution, so we needed to treat one of the parameters as fixed (in this case the SD). Had we used an MCMC approach, we could have further specified a prior distribution on the model error.<\/p>\n<p>If you notice any errors, feel free to drop me an email.<\/p>\n<p>The code is available on my <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/github.com\/pw2\/frequentist_bayes_regression_by_hand\">GITHUB page<\/a><\/span><\/strong>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the ways I try and learn things is to code them from first principles. It helps me see what is going on under the hood and also allows me to wrap my head around how things work. 
Building regression models in R is incredibly easy using the lm() function and Bayesian regression models can [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[49,47],"tags":[],"class_list":["post-3382","post","type-post","status-publish","format-standard","hentry","category-bayesian-model-building","category-model-building-in-r"],"_links":{"self":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/3382","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/comments?post=3382"}],"version-history":[{"count":2,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/3382\/revisions"}],"predecessor-version":[{"id":3410,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/3382\/revisions\/3410"}],"wp:attachment":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/media?parent=3382"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/categories?post=3382"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/tags?post=3382"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}