Intro

In Part 1 we laid the groundwork for coding in STAN and built a simple linear regression model. In this tutorial, we extend that model to include multiple independent variables and then cover some model checking strategies to ensure that the model is behaving as we intend it to.

We begin by loading the necessary packages and the palmerpenguins data set.

suppressPackageStartupMessages({
  suppressWarnings({
    suppressMessages({
      library(tidyverse)
      library(palmerpenguins)
      library(rstan)
      library(ggridges)
      library(bayesplot)
    })
  })
})


dat <- penguins %>%
  drop_na()

dat %>%
  head()
## # A tibble: 6 × 8
##   species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
##   <fct>   <fct>              <dbl>         <dbl>             <int>       <int>
## 1 Adelie  Torgersen           39.1          18.7               181        3750
## 2 Adelie  Torgersen           39.5          17.4               186        3800
## 3 Adelie  Torgersen           40.3          18                 195        3250
## 4 Adelie  Torgersen           36.7          19.3               193        3450
## 5 Adelie  Torgersen           39.3          20.6               190        3650
## 6 Adelie  Torgersen           38.9          17.8               181        3625
## # ℹ 2 more variables: sex <fct>, year <int>

Setting up our list of data

Recall that STAN requires the data to be passed in a list. For this tutorial, we are going to estimate bill_length from flipper_length and body_mass. (In a future tutorial we will discuss handling categorical variables in STAN.)

# Store data for the model in a list
data_list <- list(
  N = nrow(dat),
  bill = dat$bill_length_mm,
  flipper = dat$flipper_length_mm,
  body_mass = dat$body_mass_g
)

# Look at our data list
data_list
## $N
## [1] 333
## 
## $bill
##   [1] 39.1 39.5 40.3 36.7 39.3 38.9 39.2 41.1 38.6 34.6 36.6 38.7 42.5 34.4 46.0
##  [16] 37.8 37.7 35.9 38.2 38.8 35.3 40.6 40.5 37.9 40.5 39.5 37.2 39.5 40.9 36.4
##  [31] 39.2 38.8 42.2 37.6 39.8 36.5 40.8 36.0 44.1 37.0 39.6 41.1 36.0 42.3 39.6
##  [46] 40.1 35.0 42.0 34.5 41.4 39.0 40.6 36.5 37.6 35.7 41.3 37.6 41.1 36.4 41.6
##  [61] 35.5 41.1 35.9 41.8 33.5 39.7 39.6 45.8 35.5 42.8 40.9 37.2 36.2 42.1 34.6
##  [76] 42.9 36.7 35.1 37.3 41.3 36.3 36.9 38.3 38.9 35.7 41.1 34.0 39.6 36.2 40.8
##  [91] 38.1 40.3 33.1 43.2 35.0 41.0 37.7 37.8 37.9 39.7 38.6 38.2 38.1 43.2 38.1
## [106] 45.6 39.7 42.2 39.6 42.7 38.6 37.3 35.7 41.1 36.2 37.7 40.2 41.4 35.2 40.6
## [121] 38.8 41.5 39.0 44.1 38.5 43.1 36.8 37.5 38.1 41.1 35.6 40.2 37.0 39.7 40.2
## [136] 40.6 32.1 40.7 37.3 39.0 39.2 36.6 36.0 37.8 36.0 41.5 46.1 50.0 48.7 50.0
## [151] 47.6 46.5 45.4 46.7 43.3 46.8 40.9 49.0 45.5 48.4 45.8 49.3 42.0 49.2 46.2
## [166] 48.7 50.2 45.1 46.5 46.3 42.9 46.1 47.8 48.2 50.0 47.3 42.8 45.1 59.6 49.1
## [181] 48.4 42.6 44.4 44.0 48.7 42.7 49.6 45.3 49.6 50.5 43.6 45.5 50.5 44.9 45.2
## [196] 46.6 48.5 45.1 50.1 46.5 45.0 43.8 45.5 43.2 50.4 45.3 46.2 45.7 54.3 45.8
## [211] 49.8 49.5 43.5 50.7 47.7 46.4 48.2 46.5 46.4 48.6 47.5 51.1 45.2 45.2 49.1
## [226] 52.5 47.4 50.0 44.9 50.8 43.4 51.3 47.5 52.1 47.5 52.2 45.5 49.5 44.5 50.8
## [241] 49.4 46.9 48.4 51.1 48.5 55.9 47.2 49.1 46.8 41.7 53.4 43.3 48.1 50.5 49.8
## [256] 43.5 51.5 46.2 55.1 48.8 47.2 46.8 50.4 45.2 49.9 46.5 50.0 51.3 45.4 52.7
## [271] 45.2 46.1 51.3 46.0 51.3 46.6 51.7 47.0 52.0 45.9 50.5 50.3 58.0 46.4 49.2
## [286] 42.4 48.5 43.2 50.6 46.7 52.0 50.5 49.5 46.4 52.8 40.9 54.2 42.5 51.0 49.7
## [301] 47.5 47.6 52.0 46.9 53.5 49.0 46.2 50.9 45.5 50.9 50.8 50.1 49.0 51.5 49.8
## [316] 48.1 51.4 45.7 50.7 42.5 52.2 45.2 49.3 50.2 45.6 51.9 46.8 45.7 55.8 43.5
## [331] 49.6 50.8 50.2
## 
## $flipper
##   [1] 181 186 195 193 190 181 195 182 191 198 185 195 197 184 194 174 180 189
##  [19] 185 180 187 183 187 172 180 178 178 188 184 195 196 190 180 181 184 182
##  [37] 195 186 196 185 190 182 190 191 186 188 190 200 187 191 186 193 181 194
##  [55] 185 195 185 192 184 192 195 188 190 198 190 190 196 197 190 195 191 184
##  [73] 187 195 189 196 187 193 191 194 190 189 189 190 202 205 185 186 187 208
##  [91] 190 196 178 192 192 203 183 190 193 184 199 190 181 197 198 191 193 197
## [109] 191 196 188 199 189 189 187 198 176 202 186 199 191 195 191 210 190 197
## [127] 193 199 187 190 191 200 185 193 193 187 188 190 192 185 190 184 195 193
## [145] 187 201 211 230 210 218 215 210 211 219 209 215 214 216 214 213 210 217
## [163] 210 221 209 222 218 215 213 215 215 215 215 210 220 222 209 207 230 220
## [181] 220 213 219 208 208 208 225 210 216 222 217 210 225 213 215 210 220 210
## [199] 225 217 220 208 220 208 224 208 221 214 231 219 230 229 220 223 216 221
## [217] 221 217 216 230 209 220 215 223 212 221 212 224 212 228 218 218 212 230
## [235] 218 228 212 224 214 226 216 222 203 225 219 228 215 228 215 210 219 208
## [253] 209 216 229 213 230 217 230 222 214 215 222 212 213 192 196 193 188 197
## [271] 198 178 197 195 198 193 194 185 201 190 201 197 181 190 195 181 191 187
## [289] 193 195 197 200 200 191 205 187 201 187 203 195 199 195 210 192 205 210
## [307] 187 196 196 196 201 190 212 187 198 199 201 193 203 187 197 191 203 202
## [325] 194 206 189 195 207 202 193 210 198
## 
## $body_mass
##   [1] 3750 3800 3250 3450 3650 3625 4675 3200 3800 4400 3700 3450 4500 3325 4200
##  [16] 3400 3600 3800 3950 3800 3800 3550 3200 3150 3950 3250 3900 3300 3900 3325
##  [31] 4150 3950 3550 3300 4650 3150 3900 3100 4400 3000 4600 3425 3450 4150 3500
##  [46] 4300 3450 4050 2900 3700 3550 3800 2850 3750 3150 4400 3600 4050 2850 3950
##  [61] 3350 4100 3050 4450 3600 3900 3550 4150 3700 4250 3700 3900 3550 4000 3200
##  [76] 4700 3800 4200 3350 3550 3800 3500 3950 3600 3550 4300 3400 4450 3300 4300
##  [91] 3700 4350 2900 4100 3725 4725 3075 4250 2925 3550 3750 3900 3175 4775 3825
## [106] 4600 3200 4275 3900 4075 2900 3775 3350 3325 3150 3500 3450 3875 3050 4000
## [121] 3275 4300 3050 4000 3325 3500 3500 4475 3425 3900 3175 3975 3400 4250 3400
## [136] 3475 3050 3725 3000 3650 4250 3475 3450 3750 3700 4000 4500 5700 4450 5700
## [151] 5400 4550 4800 5200 4400 5150 4650 5550 4650 5850 4200 5850 4150 6300 4800
## [166] 5350 5700 5000 4400 5050 5000 5100 5650 4600 5550 5250 4700 5050 6050 5150
## [181] 5400 4950 5250 4350 5350 3950 5700 4300 4750 5550 4900 4200 5400 5100 5300
## [196] 4850 5300 4400 5000 4900 5050 4300 5000 4450 5550 4200 5300 4400 5650 4700
## [211] 5700 5800 4700 5550 4750 5000 5100 5200 4700 5800 4600 6000 4750 5950 4625
## [226] 5450 4725 5350 4750 5600 4600 5300 4875 5550 4950 5400 4750 5650 4850 5200
## [241] 4925 4875 4625 5250 4850 5600 4975 5500 5500 4700 5500 4575 5500 5000 5950
## [256] 4650 5500 4375 5850 6000 4925 4850 5750 5200 5400 3500 3900 3650 3525 3725
## [271] 3950 3250 3750 4150 3700 3800 3775 3700 4050 3575 4050 3300 3700 3450 4400
## [286] 3600 3400 2900 3800 3300 4150 3400 3800 3700 4550 3200 4300 3350 4100 3600
## [301] 3900 3850 4800 2700 4500 3950 3650 3550 3500 3675 4450 3400 4300 3250 3675
## [316] 3325 3950 3600 4050 3350 3450 3250 4050 3800 3525 3950 3650 3650 4000 3400
## [331] 3775 4100 3775
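One optional tweak worth knowing about: flipper length (roughly 170–230 mm) and body mass (roughly 2,700–6,300 g) live on very different scales, which can make the sampler work harder and leaves the body mass coefficient looking tiny. Standardizing the predictors before building the list often helps. This is just a sketch, not part of the pipeline we use below; scale_vec and data_list_scaled are names introduced here.

```r
# Helper: center and scale a numeric vector, dropping the matrix
# attributes that scale() attaches so Stan receives a plain vector
scale_vec <- function(x) as.numeric(scale(x))

scale_vec(c(180, 190, 200))  # -1, 0, 1

# Drop-in replacement for the list below, with standardized predictors:
# data_list_scaled <- list(
#   N = nrow(dat),
#   bill = dat$bill_length_mm,
#   flipper = scale_vec(dat$flipper_length_mm),
#   body_mass = scale_vec(dat$body_mass_g)
# )
```

With standardized predictors, each coefficient is interpreted as the change in bill length per one standard deviation of the predictor, which makes the two slopes directly comparable.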

Write the STAN model

Again, we will use rstan to code our model directly into the below code chunk using quotations. In future tutorials we will discuss how to write STAN code in its own .stan file and call it from R, and we will also use the {cmdstanr} package.

Recall that there are three non-negotiable blocks of STAN code that always need to be filled in: data{}, parameters{}, and model{}. In future blogs we will talk about use cases for the other STAN code blocks.

# Write the Stan Model
stan_code <- "
// Define the data
data{
  int<lower=0> N;        // Total number of observations
  vector[N] bill;    // Outcome Variable = Bill Length
  vector[N] flipper;  // Independent Variable = Flipper Length
  vector[N] body_mass;  // Independent Variable = Body Mass
}

// Specify the parameters and their data types (these will all be real numbers)
parameters {
  real alpha;              // Intercept
  real beta_flipper;       // coefficient for flipper length
  real beta_body_mass;     // coefficient for body mass
  real<lower=0> sigma;     // Standard deviation of the error term
}

// Specify the model and priors (NOTE: if no priors are specified, STAN places implicit flat (uniform) priors on the parameters)
model {

  // Priors
  alpha ~ normal(0, 10);   // Weakly informative prior for the intercept
  beta_flipper ~ normal(0, 1);    // Weakly informative prior for flipper length
  beta_body_mass ~ normal(0, 1);    // Weakly informative prior for body mass
  sigma ~ exponential(1);  // Weakly informative prior for the standard deviation

  // Likelihood
  bill ~ normal(alpha + beta_flipper * flipper + beta_body_mass * body_mass, sigma);
}"

Fit the STAN model

fit_stan <- stan(model_code = stan_code,
            data = data_list,
            iter = 2000,
            warmup = 500,
            chains = 4,
            seed = 1234)
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0.000122 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.22 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  501 / 2000 [ 25%]  (Sampling)
## Chain 1: Iteration:  700 / 2000 [ 35%]  (Sampling)
## Chain 1: Iteration:  900 / 2000 [ 45%]  (Sampling)
## Chain 1: Iteration: 1100 / 2000 [ 55%]  (Sampling)
## Chain 1: Iteration: 1300 / 2000 [ 65%]  (Sampling)
## Chain 1: Iteration: 1500 / 2000 [ 75%]  (Sampling)
## Chain 1: Iteration: 1700 / 2000 [ 85%]  (Sampling)
## Chain 1: Iteration: 1900 / 2000 [ 95%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 5.329 seconds (Warm-up)
## Chain 1:                6.68 seconds (Sampling)
## Chain 1:                12.009 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 2.6e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.26 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  501 / 2000 [ 25%]  (Sampling)
## Chain 2: Iteration:  700 / 2000 [ 35%]  (Sampling)
## Chain 2: Iteration:  900 / 2000 [ 45%]  (Sampling)
## Chain 2: Iteration: 1100 / 2000 [ 55%]  (Sampling)
## Chain 2: Iteration: 1300 / 2000 [ 65%]  (Sampling)
## Chain 2: Iteration: 1500 / 2000 [ 75%]  (Sampling)
## Chain 2: Iteration: 1700 / 2000 [ 85%]  (Sampling)
## Chain 2: Iteration: 1900 / 2000 [ 95%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 3.129 seconds (Warm-up)
## Chain 2:                19.961 seconds (Sampling)
## Chain 2:                23.09 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 2.4e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.24 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  501 / 2000 [ 25%]  (Sampling)
## Chain 3: Iteration:  700 / 2000 [ 35%]  (Sampling)
## Chain 3: Iteration:  900 / 2000 [ 45%]  (Sampling)
## Chain 3: Iteration: 1100 / 2000 [ 55%]  (Sampling)
## Chain 3: Iteration: 1300 / 2000 [ 65%]  (Sampling)
## Chain 3: Iteration: 1500 / 2000 [ 75%]  (Sampling)
## Chain 3: Iteration: 1700 / 2000 [ 85%]  (Sampling)
## Chain 3: Iteration: 1900 / 2000 [ 95%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 3.461 seconds (Warm-up)
## Chain 3:                14.954 seconds (Sampling)
## Chain 3:                18.415 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 2.7e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.27 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  501 / 2000 [ 25%]  (Sampling)
## Chain 4: Iteration:  700 / 2000 [ 35%]  (Sampling)
## Chain 4: Iteration:  900 / 2000 [ 45%]  (Sampling)
## Chain 4: Iteration: 1100 / 2000 [ 55%]  (Sampling)
## Chain 4: Iteration: 1300 / 2000 [ 65%]  (Sampling)
## Chain 4: Iteration: 1500 / 2000 [ 75%]  (Sampling)
## Chain 4: Iteration: 1700 / 2000 [ 85%]  (Sampling)
## Chain 4: Iteration: 1900 / 2000 [ 95%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 3.363 seconds (Warm-up)
## Chain 4:                19.351 seconds (Sampling)
## Chain 4:                22.714 seconds (Total)
## Chain 4:
## Warning: There were 1091 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
## https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
## Warning: Examine the pairs() plot to diagnose sampling problems
## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#bulk-ess

Investigate the model output

print(fit_stan, digits = 6)
## Inference for Stan model: anon_model.
## 4 chains, each with iter=2000; warmup=500; thin=1; 
## post-warmup draws per chain=1500, total post-warmup draws=6000.
## 
##                       mean  se_mean       sd        2.5%         25%
## alpha            -3.679465 0.211551 4.079075  -11.626258   -6.293551
## beta_flipper      0.225014 0.001448 0.028870    0.168011    0.206168
## beta_body_mass    0.000582 0.000019 0.000530   -0.000455    0.000235
## sigma             4.135915 0.002825 0.159831    3.838658    4.025622
## lp__           -643.515875 0.043541 1.471371 -647.301479 -644.198630
##                        50%         75%       97.5% n_eff     Rhat
## alpha            -3.783311   -1.073551    4.485820   372 1.007025
## beta_flipper      0.225060    0.243705    0.281568   398 1.006641
## beta_body_mass    0.000592    0.000940    0.001588   784 1.003339
## sigma             4.129933    4.239378    4.466283  3202 0.999764
## lp__           -643.168168 -642.443697 -641.772455  1142 1.003372
## 
## Samples were drawn using NUTS(diag_e) at Sat Mar 21 22:12:50 2026.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

Extract and plot posterior samples for the independent variables

post_coef <- extract(fit_stan,
        pars = c("beta_flipper", "beta_body_mass")) %>%
  as.data.frame()

par(mfrow = c(1, 2))
hist(post_coef$beta_flipper,
     main = "Posterior Flipper\nDistribution",
     xlab = "Flipper Length")
hist(post_coef$beta_body_mass,
     main = "Posterior Body Mass\nDistribution",
     xlab = "Body Mass")

quantile(post_coef$beta_flipper, probs = c(0.05, 0.25, 0.5, 0.75, 0.95))
##        5%       25%       50%       75%       95% 
## 0.1784509 0.2061678 0.2250599 0.2437050 0.2714402
quantile(post_coef$beta_body_mass, probs = c(0.05, 0.25, 0.5, 0.75, 0.95))
##            5%           25%           50%           75%           95% 
## -0.0002904450  0.0002351598  0.0005917957  0.0009399863  0.0014290653

Get the posterior draws for all parameters

posterior_draws <- as.data.frame(fit_stan)
head(posterior_draws)
##       alpha beta_flipper beta_body_mass    sigma      lp__
## 1  5.005438    0.1438093   2.434156e-03 4.224299 -647.3597
## 2  7.115984    0.1569045   1.338426e-03 4.140859 -645.2532
## 3  8.572489    0.1587078   9.008053e-04 4.127159 -647.6632
## 4  2.453212    0.1941319   6.167439e-04 3.866023 -644.8068
## 5  1.487107    0.1942291   8.083443e-04 3.911150 -643.3578
## 6 -3.969925    0.2354491   9.044432e-05 4.184511 -643.0434
colMeans(posterior_draws)
##          alpha   beta_flipper beta_body_mass          sigma           lp__ 
##  -3.679465e+00   2.250136e-01   5.819785e-04   4.135915e+00  -6.435159e+02
apply(X = posterior_draws, MARGIN = 2, FUN = sd)
##          alpha   beta_flipper beta_body_mass          sigma           lp__ 
##   4.0790753649   0.0288704944   0.0005300588   0.1598307057   1.4713711703
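Because each row of posterior_draws is one draw from the joint posterior, summaries beyond the mean and standard deviation are one-liners. For example, the posterior probability that a slope is positive is just the share of draws above zero. The helper below is a small sketch; prob_positive is a name introduced here.

```r
# Posterior probability that a coefficient is positive: the share of
# posterior draws that fall above zero
prob_positive <- function(draws) mean(draws > 0)

# Applied to our fitted model:
# prob_positive(posterior_draws$beta_flipper)
# prob_positive(posterior_draws$beta_body_mass)
```

Given the summaries above, we would expect this to be essentially 1 for beta_flipper and noticeably lower for beta_body_mass, whose posterior overlaps zero.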

Writing the model with a vector of several beta coefficients

Similar to R, STAN has multiple ways to do the same thing. For example, above we wrote our STAN code by explicitly declaring each variable’s coefficient in the parameters{} block. While this isn’t a big deal if we only have 2 independent variables, it becomes tedious with a larger number of independent variables. We can also save some space by declaring a single vector of betas and writing the model from there. The below example shows how to specify multiple beta coefficients in the parameters{} block.

# Write the Stan Model
stan_code2 <- "
// Define the data
data{
  int<lower=0> N;        // Total number of observations
  vector[N] bill;    // Outcome Variable = Bill Length
  vector[N] flipper;  // Independent Variable = Flipper Length
  vector[N] body_mass;  // Independent Variable = Body Mass
}

// Specify the parameters and their data types (these will all be real numbers)
parameters {
  vector[3] beta;  // A single vector holding all 3 coefficients (Intercept, Flipper Length, Body Mass)
  real<lower=0> sigma;     // Standard deviation of the error term
}

// Specify the model and priors (NOTE: if no priors are specified, STAN places implicit flat (uniform) priors on the parameters)
model {

  // Priors -- Apply these to the correct element of the beta vector
  beta[1] ~ normal(0, 10);   // Weakly informative prior for the intercept
  beta[2] ~ normal(0, 1);    // Weakly informative prior for flipper length
  beta[3] ~ normal(0, 1);    // Weakly informative prior for body mass
  sigma ~ exponential(1);  // Weakly informative prior for the standard deviation

  // Likelihood -- Index the appropriate element of the beta vector
  bill ~ normal(beta[1] + beta[2] * flipper + beta[3] * body_mass, sigma);
}"

Fit the model

fit_stan2 <- stan(model_code = stan_code2,
            data = data_list,
            iter = 2000,
            warmup = 500,
            chains = 4,
            seed = 1234)
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0.000164 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.64 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  501 / 2000 [ 25%]  (Sampling)
## Chain 1: Iteration:  700 / 2000 [ 35%]  (Sampling)
## Chain 1: Iteration:  900 / 2000 [ 45%]  (Sampling)
## Chain 1: Iteration: 1100 / 2000 [ 55%]  (Sampling)
## Chain 1: Iteration: 1300 / 2000 [ 65%]  (Sampling)
## Chain 1: Iteration: 1500 / 2000 [ 75%]  (Sampling)
## Chain 1: Iteration: 1700 / 2000 [ 85%]  (Sampling)
## Chain 1: Iteration: 1900 / 2000 [ 95%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 5.541 seconds (Warm-up)
## Chain 1:                6.879 seconds (Sampling)
## Chain 1:                12.42 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 2.9e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.29 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  501 / 2000 [ 25%]  (Sampling)
## Chain 2: Iteration:  700 / 2000 [ 35%]  (Sampling)
## Chain 2: Iteration:  900 / 2000 [ 45%]  (Sampling)
## Chain 2: Iteration: 1100 / 2000 [ 55%]  (Sampling)
## Chain 2: Iteration: 1300 / 2000 [ 65%]  (Sampling)
## Chain 2: Iteration: 1500 / 2000 [ 75%]  (Sampling)
## Chain 2: Iteration: 1700 / 2000 [ 85%]  (Sampling)
## Chain 2: Iteration: 1900 / 2000 [ 95%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 3.132 seconds (Warm-up)
## Chain 2:                21.664 seconds (Sampling)
## Chain 2:                24.796 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 2.6e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.26 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  501 / 2000 [ 25%]  (Sampling)
## Chain 3: Iteration:  700 / 2000 [ 35%]  (Sampling)
## Chain 3: Iteration:  900 / 2000 [ 45%]  (Sampling)
## Chain 3: Iteration: 1100 / 2000 [ 55%]  (Sampling)
## Chain 3: Iteration: 1300 / 2000 [ 65%]  (Sampling)
## Chain 3: Iteration: 1500 / 2000 [ 75%]  (Sampling)
## Chain 3: Iteration: 1700 / 2000 [ 85%]  (Sampling)
## Chain 3: Iteration: 1900 / 2000 [ 95%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 3.422 seconds (Warm-up)
## Chain 3:                16.794 seconds (Sampling)
## Chain 3:                20.216 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 2.4e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.24 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  501 / 2000 [ 25%]  (Sampling)
## Chain 4: Iteration:  700 / 2000 [ 35%]  (Sampling)
## Chain 4: Iteration:  900 / 2000 [ 45%]  (Sampling)
## Chain 4: Iteration: 1100 / 2000 [ 55%]  (Sampling)
## Chain 4: Iteration: 1300 / 2000 [ 65%]  (Sampling)
## Chain 4: Iteration: 1500 / 2000 [ 75%]  (Sampling)
## Chain 4: Iteration: 1700 / 2000 [ 85%]  (Sampling)
## Chain 4: Iteration: 1900 / 2000 [ 95%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 4.626 seconds (Warm-up)
## Chain 4:                25.065 seconds (Sampling)
## Chain 4:                29.691 seconds (Total)
## Chain 4:
## Warning: There were 1091 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
## https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
## Warning: Examine the pairs() plot to diagnose sampling problems
## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#bulk-ess

Model Output

print(fit_stan2, digits = 6)
## Inference for Stan model: anon_model.
## 4 chains, each with iter=2000; warmup=500; thin=1; 
## post-warmup draws per chain=1500, total post-warmup draws=6000.
## 
##                mean  se_mean       sd        2.5%         25%         50%
## beta[1]   -3.679465 0.211551 4.079075  -11.626258   -6.293551   -3.783311
## beta[2]    0.225014 0.001448 0.028870    0.168011    0.206168    0.225060
## beta[3]    0.000582 0.000019 0.000530   -0.000455    0.000235    0.000592
## sigma      4.135915 0.002825 0.159831    3.838658    4.025622    4.129933
## lp__    -643.515875 0.043541 1.471371 -647.301479 -644.198630 -643.168168
##                 75%       97.5% n_eff     Rhat
## beta[1]   -1.073551    4.485820   372 1.007025
## beta[2]    0.243705    0.281568   398 1.006641
## beta[3]    0.000940    0.001588   784 1.003339
## sigma      4.239378    4.466283  3202 0.999764
## lp__    -642.443697 -641.772455  1142 1.003372
## 
## Samples were drawn using NUTS(diag_e) at Sat Mar 21 22:15:39 2026.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).
  • You’ll find we obtain the same model with less code!
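We could go one step further and pass a design matrix from R, so the likelihood line never changes no matter how many predictors we add. The sketch below is illustrative rather than something we fit in this post: stan_code3, K, and X are names introduced here, and the data list would need X and K added on the R side, e.g. X = cbind(1, dat$flipper_length_mm, dat$body_mass_g) and K = 3.

```r
# Sketch: the same regression written with a design matrix
stan_code3 <- "
data{
  int<lower=0> N;       // Total number of observations
  int<lower=1> K;       // Number of columns in the design matrix
  matrix[N, K] X;       // Design matrix (first column all 1s for the intercept)
  vector[N] bill;       // Outcome Variable = Bill Length
}

parameters {
  vector[K] beta;       // One coefficient per column of X
  real<lower=0> sigma;  // Standard deviation of the error term
}

model {
  beta[1] ~ normal(0, 10);         // Weakly informative prior for the intercept
  beta[2:K] ~ normal(0, 1);        // Weakly informative priors for the slopes
  sigma ~ exponential(1);          // Weakly informative prior for the sd

  bill ~ normal(X * beta, sigma);  // Matrix-vector product covers all terms
}"
```

The appeal of this form is that adding a fourth or fifth predictor only changes the design matrix you build in R, not the STAN code.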

Understanding the Model Fit

Once we fit our model, it’s important to go through a series of checks to ensure that the fit of the model to the data is what we’d expect it to be, and that there are no errors in our model, or issues with the Markov chains, that would leave us with results we wouldn’t want to trust when making inferences.

In a previous section above we printed the model summary, extracted the posterior samples of the model parameters, plotted histograms of the posterior distributions, and calculated the mean and standard deviation of each (which match the coefficient means and standard deviations we obtained when we printed the model summary). In addition, we want to check model diagnostics and convergence.

In Part 1, we plotted Trace Plots and Density Plots, which are helpful visual diagnostics that tell us how well our Markov chains are mixing.

The Trace Plot should show randomness within, and agreement between, the independent chains we specified when we fit the model. Some have said that the Trace Plots should look like “fuzzy caterpillars” if the Markov chain is properly exploring the parameter space and not getting stuck in one area. The density plots of each chain should also look relatively similar. Finally, we add a plot of autocorrelation, which should drop off quickly if the chains are mixing well. If this plot drops off slowly, indicating high autocorrelation, it means that consecutive samples are strongly correlated rather than independent, leading to a smaller effective sample size and parameter estimates that are less reliable.

Trace Plots of the MCMC

mcmc_trace(as.array(fit_stan), pars = c("alpha", "beta_flipper", "beta_body_mass", "sigma"))

Density Plots of the Model Parameters

mcmc_dens_overlay(as.array(fit_stan), pars = c("alpha", "beta_flipper", "beta_body_mass", "sigma"))

Autocorrelation Plot

mcmc_acf(as.array(fit_stan), pars = c("alpha", "beta_flipper", "beta_body_mass", "sigma"))

Finally, we can use the model diagnostic values in the model output to help us understand how well the model is performing.

print(fit_stan, digits = 6)
## Inference for Stan model: anon_model.
## 4 chains, each with iter=2000; warmup=500; thin=1; 
## post-warmup draws per chain=1500, total post-warmup draws=6000.
## 
##                       mean  se_mean       sd        2.5%         25%
## alpha            -3.679465 0.211551 4.079075  -11.626258   -6.293551
## beta_flipper      0.225014 0.001448 0.028870    0.168011    0.206168
## beta_body_mass    0.000582 0.000019 0.000530   -0.000455    0.000235
## sigma             4.135915 0.002825 0.159831    3.838658    4.025622
## lp__           -643.515875 0.043541 1.471371 -647.301479 -644.198630
##                        50%         75%       97.5% n_eff     Rhat
## alpha            -3.783311   -1.073551    4.485820   372 1.007025
## beta_flipper      0.225060    0.243705    0.281568   398 1.006641
## beta_body_mass    0.000592    0.000940    0.001588   784 1.003339
## sigma             4.129933    4.239378    4.466283  3202 0.999764
## lp__           -643.168168 -642.443697 -641.772455  1142 1.003372
## 
## Samples were drawn using NUTS(diag_e) at Sat Mar 21 22:12:50 2026.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).
  • The Rhat value should be near 1 for each of the parameters, indicating that the simulation was stable.
  • The n_eff number is the effective sample size, indicating roughly how many independent samples our correlated Markov chain draws are worth. We can also extract the neff_ratio, shown for each model parameter below, and use it to calculate how many of the draws behave like effectively independent samples. For example, when we fit the model we specified 4 chains, each with 2000 iterations, and we discard the first 500 iterations of each chain as warm up. This leaves us with a sample of 4 * (2000 - 500) = 6000 post-warmup draws. Looking at the neff_ratio for beta_body_mass (about 0.131), our 6000 Markov chain draws are as effective as about 784 (0.131 * 6000 ≈ 784) independent samples. Of course, we could run this model for more iterations and a longer warm up to help improve our effective sample sizes. For example, notice that the effective sample sizes for alpha (the model intercept) and beta_flipper are rather small. Recall from the Autocorrelation plot that there were times where the drop off between lags in the chain was slow. Thus, running this model with more iterations may help us improve the issue with n_eff. In future tutorials we will take a look at some poor model convergence and discuss how to improve it.
## Rhat values
rhat(fit_stan)
##          alpha   beta_flipper beta_body_mass          sigma           lp__ 
##      1.0070253      1.0066412      1.0033386      0.9997644      1.0033717
## neff_ratio
neff_ratio(fit_stan)
##          alpha   beta_flipper beta_body_mass          sigma           lp__ 
##     0.06196409     0.06629138     0.13061585     0.53360784     0.19032685
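The relationship between neff_ratio and n_eff can be verified with a little arithmetic: multiplying each ratio by the total number of post-warmup draws recovers the n_eff column from the model summary. (The ratios below are copied, rounded, from the neff_ratio() output above.)

```r
# Post-warmup draws: 4 chains x (2000 iterations - 500 warmup) each
total_draws <- 4 * (2000 - 500)  # 6000

# neff_ratio() values (rounded) from the output above
ratios <- c(alpha = 0.0620, beta_flipper = 0.0663,
            beta_body_mass = 0.1306, sigma = 0.5336)

round(ratios * total_draws)  # 372  398  784 3202 -- the n_eff column
```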

Finally, we can use other functions from the bayesplot package to produce visuals of different model parameters along with their Credible Intervals.

mcmc_areas(fit_stan, pars = "beta_flipper")

Wrapping Up

In this tutorial we covered two different ways of specifying the same multiple regression model in STAN and talked through some ways to quickly check model convergence and fit, both visually and numerically.