{"id":2440,"date":"2022-05-19T04:46:02","date_gmt":"2022-05-19T04:46:02","guid":{"rendered":"http:\/\/optimumsportsperformance.com\/blog\/?p=2440"},"modified":"2022-11-08T03:36:34","modified_gmt":"2022-11-08T03:36:34","slug":"neural-networks-its-regression-all-the-way-down","status":"publish","type":"post","link":"https:\/\/optimumsportsperformance.com\/blog\/neural-networks-its-regression-all-the-way-down\/","title":{"rendered":"Neural Networks&#8230;.It&#8217;s regression all the way down!"},"content":{"rendered":"<p><span style=\"color: #0000ff;\"><strong><a style=\"color: #0000ff;\" href=\"https:\/\/optimumsportsperformance.com\/blog\/t-test-anova-its-linear-regression-all-the-way-down\/\">Yesterday<\/a><\/strong><\/span>, I talked about how t-test and ANOVA are fundamentally just linear regression. But what about something more complex? What about something like a neural network?<\/p>\n<p>Whenever people bring up neural networks I always say, <em>&#8220;The most basic neural network is a sigmoid function. It&#8217;s just logistic regression!!&#8221;<\/em> Of course, neural networks can get very complex and there is a lot more that can be added to them to maximize their ability to do the job. But fundamentally, they look like regression models, and when you add several hidden layers (deep learning) you end up just stacking a bunch of regression models on top of each other (I know I&#8217;m oversimplifying this a little bit).<\/p>\n<p>Let&#8217;s see if we can build a simple neural network to prove it. 
As always, you can access the full code on my <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/github.com\/pw2\/R-Tips-Tricks\/blob\/master\/Neural%20Networks...It's%20regression%20all%20the%20way%20down!.R\">GITHUB page<\/a><\/span><\/strong>.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Loading packages, functions, and data<\/strong><\/span><\/p>\n<p>We will load {<strong>tidyverse<\/strong>} for data cleaning, {<strong>neuralnet<\/strong>} for building our neural network, and {<strong>mlbench<\/strong>} to access the Boston housing data.<\/p>\n<p>I create a z-score function that will be used to standardize the features for our model. We will keep this simple and attempt to predict Boston housing prices (medv) using three features (rm, dis, indus). To read more about these features, type <strong>?BostonHousing<\/strong> in your R console. Once we&#8217;ve selected those features out of the data set, we apply our z-score function to them.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n## Load packages\r\nlibrary(tidyverse)\r\nlibrary(neuralnet)\r\nlibrary(mlbench)\r\n\r\n## z-score function\r\nz_score &lt;- function(x){\r\n  z &lt;- (x - mean(x, na.rm = TRUE)) \/ sd(x, na.rm = TRUE)\r\n  return(z)\r\n}\r\n\r\n## get data\r\ndata(&quot;BostonHousing&quot;)\r\n\r\n## z-score features\r\nd &lt;- BostonHousing %&gt;%\r\n  select(medv, rm, dis, indus) %&gt;%\r\n  mutate(across(.cols = rm:indus,\r\n                ~z_score(.x),\r\n                .names = &quot;{.col}_z&quot;))\r\n  \r\nd %&gt;% \r\n  head()\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.20.26-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2441\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.20.26-PM.png\" alt=\"\" width=\"533\" height=\"208\" 
srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.20.26-PM.png 816w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.20.26-PM-300x117.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.20.26-PM-768x299.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.20.26-PM-624x243.png 624w\" sizes=\"auto, (max-width: 533px) 100vw, 533px\" \/><\/a><\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Train\/Test Split<\/strong><\/span><\/p>\n<p>There isn&#8217;t much of a train\/test split here because I&#8217;m not building a full model to be tested. I&#8217;m really just trying to show how a neural network works. Thus, I&#8217;ll select the first row of data as my &#8220;test&#8221; set and retain the rest of the data for training the model.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n## remove first observation for making a prediction on after training\r\ntrain &lt;- d&#x5B;-1, ]\r\ntest &lt;- d&#x5B;1, ]\r\n<\/pre>\n<p><span style=\"text-decoration: underline;\"><strong>Neural Network Model<\/strong><\/span><\/p>\n<p>We build a simple model with 1 hidden layer and then plot the output. In the plot we see various numbers. 
The numbers in black refer to our weights and the numbers in blue refer to the biases.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n## Simple neural network with one hidden layer\r\nset.seed(9164)\r\nfit_nn &lt;- neuralnet(medv ~ rm_z + dis_z + indus_z,\r\n                    data = train,\r\n                    hidden = 1,\r\n                    err.fct = &quot;sse&quot;,\r\n                    linear.output = TRUE)\r\n\r\n## plot the neural network\r\nplot(fit_nn)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.24.42-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2442\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.24.42-PM-1024x770.png\" alt=\"\" width=\"532\" height=\"400\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.24.42-PM-1024x770.png 1024w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.24.42-PM-300x225.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.24.42-PM-768x577.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.24.42-PM-624x469.png 624w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.24.42-PM.png 1634w\" sizes=\"auto, (max-width: 532px) 100vw, 532px\" \/><\/a><\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Making Predictions &#8212; It&#8217;s linear regression all the way down!<\/strong><\/span><\/p>\n<p>As stated above, we have weights (black numbers) and biases (blue numbers). 
If we are trying to frame up the neural network as being a bunch of stacked together linear regressions then we can think about the weights as functioning like regression coefficients and the biases functioning like the linear model intercept.<\/p>\n<p>Let&#8217;s take each value from the plot and store it in its own element so that we can apply them directly to our test observation and write out the equation by hand.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n## Predictions are formed using the weights (black) and biases (blue)\r\n# Store the weights and biases from the plot and put them into their own elements\r\nrm_weight &lt;- 1.09872\r\ndis_weight &lt;- -0.05993\r\nindus_weight &lt;- -0.49887\r\n\r\n# Weight connecting the hidden node to the output\r\nhidden_weight &lt;- 35.95032\r\n\r\n# Biases: bias_1 enters the hidden node, bias_2 enters the output\r\nbias_1 &lt;- -1.68717\r\nbias_2 &lt;- 14.85824\r\n<\/pre>\n<p>With everything stored, we are ready to make a prediction on the test observation.<\/p>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.29.27-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2443\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.29.27-PM.png\" alt=\"\" width=\"504\" height=\"96\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.29.27-PM.png 796w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.29.27-PM-300x57.png 300w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.29.27-PM-768x147.png 768w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.29.27-PM-624x119.png 624w\" sizes=\"auto, (max-width: 504px) 100vw, 504px\" \/><\/a><\/p>\n<p>We begin at the input layer by multiplying each z-scored 
value by the corresponding weight from the plot above. We sum those together and add in the first bias &#8212; just like we would with a linear regression.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n# Start by applying the weights to their z-scored values, sum them together and add\r\n# in the first bias\r\ninput &lt;- bias_1 + test$rm_z * rm_weight + test$dis_z * dis_weight + test$indus_z * indus_weight\r\ninput\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.31.42-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2444\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.31.42-PM.png\" alt=\"\" width=\"165\" height=\"55\" \/><\/a><\/p>\n<p>One <em>regression<\/em> down, one more to go! But before we can move to the next <em>regression<\/em>, we need to transform this input value. The neural network is using a sigmoid function to make this transformation as the input value moves through the hidden layer. So, we should apply the sigmoid function before moving on.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n# transform input -- the neural network is using a sigmoid function\r\ninput_sig &lt;- 1\/(1+exp(-input))\r\ninput_sig\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.33.57-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2445\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.33.57-PM.png\" alt=\"\" width=\"155\" height=\"60\" \/><\/a><br \/>\nWe take this transformed input and multiply it by the hidden weight and then add it to the second bias. 
This final <em>regression<\/em> equation produces the predicted value of the home.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nprediction &lt;- bias_2 + input_sig * hidden_weight\r\nprediction\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.36.19-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2446\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.36.19-PM.png\" alt=\"\" width=\"181\" height=\"66\" \/><\/a><br \/>\nThe prediction here is in thousands of dollars, based on census data from the 1970s.<\/p>\n<p>Let&#8217;s compare the prediction we got by hand to what we get when we run the <strong>predict()<\/strong> function.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n## Compare the output to what we would get if we used the predict() function\r\npredict(fit_nn, newdata = test)\r\n<\/pre>\n<p><a href=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.39.11-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2447\" src=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.39.11-PM.png\" alt=\"\" width=\"365\" height=\"81\" srcset=\"https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.39.11-PM.png 496w, https:\/\/optimumsportsperformance.com\/blog\/wp-content\/uploads\/2022\/05\/Screen-Shot-2022-05-18-at-9.39.11-PM-300x67.png 300w\" sizes=\"auto, (max-width: 365px) 100vw, 365px\" \/><\/a>Same result!<\/p>\n<p>Again, if you&#8217;d like the full code, you can access it on my <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" 
href=\"https:\/\/github.com\/pw2\/R-Tips-Tricks\/blob\/master\/Neural%20Networks...It's%20regression%20all%20the%20way%20down!.R\">GITHUB page<\/a><\/span><\/strong>.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Wrapping Up<\/strong><\/span><\/p>\n<p>Yesterday we talked about always thinking <strong>regression<\/strong> whenever you see a t-test or ANOVA. Today, we learn that we can think <strong>regression<\/strong> whenever we see a neural network, as well! By stacking two <em>regression-like<\/em> equations together we produced a neural network prediction. Imagine if we stacked 20 hidden layers together!<\/p>\n<p>The big take home, though, is that regression is super powerful. Fundamentally, it is the workhorse that helps to drive a number of other machine learning approaches. Imagine if you spent a few years really studying regression models? Imagine what you could learn about data analysis? If you&#8217;re up for it, one of my all time favorite books on the topic is <strong><span style=\"color: #0000ff;\"><a style=\"color: #0000ff;\" href=\"https:\/\/www.amazon.com\/Regression-Stories-Analytical-Methods-Research\/dp\/110702398X\">Gelman, Hill, and Vehtari&#8217;s Regression and Other Stories<\/a><\/span><\/strong>. I can&#8217;t recommend it enough!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Yesterday, I talked about how t-test and ANOVA are fundamentally just linear regression. But what about something more complex? What about something like a neural network? Whenever people bring up neural networks I always say, &#8220;The most basic neural network is a sigmoid function. 
It&#8217;s just logistic regression!!&#8221; Of course, neural networks can get very [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[47,45,43,42],"tags":[],"class_list":["post-2440","post","type-post","status-publish","format-standard","hentry","category-model-building-in-r","category-r-tips-tricks","category-sports-analytics","category-sports-science"],"_links":{"self":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/2440","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/comments?post=2440"}],"version-history":[{"count":4,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/2440\/revisions"}],"predecessor-version":[{"id":2451,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/posts\/2440\/revisions\/2451"}],"wp:attachment":[{"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/media?parent=2440"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/categories?post=2440"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/optimumsportsperformance.com\/blog\/wp-json\/wp\/v2\/tags?post=2440"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}