STAN is a probabilistic programming language that is
written in the C++ language. If you don’t know C++, that’s okay! People
often will write their STAN models in a different language and then use
code to complile the model in C++ and run in STAN. For
example, we can run it from R using {rstan} or {cmdstanr} or we can
write the code in Python using {cmdstanpy}.
In prior blog articles, we built Bayesian models using packages such
as {rstanarm} and {brms}, which both run STAN code under
the hood, but provide us with an easy to use syntax that is familiar in
the R language. The analog to {rstanarm} and {brms} in Python is
{bambi}.
So, why bother learning to code STAN models when we have
these conveniet packages that already take the syntax we are used to and
that we know and love? Well, sometimes with these packages we end up
hitting a wall. The model we require is more complex than what is
available to be run in these packages. Additionally, writing out the
full model and specifying it, as we need to do when we code
STAN, helps us really learn what is there, how the
parameters are influenced by priors, and offers us a lot of flexibility
to adjust things as we see fit.
There are three non-negotiable blocks of code that are
required to run a STAN model. There are additional blocks
that we will add on later, but there are three blocks that always need
to be there:
data{}
;parameters{}
real for a real numbermodel{}
STAN will use weakly
informative priors by defaultLoad the palmerpenguins data
suppressPackageStartupMessages({
suppressWarnings({
suppressMessages({
library(tidyverse)
library(palmerpenguins)
library(rstan)
library(ggridges)
library(bayesplot)
})
})
})
dat <- penguins %>%
drop_na()
dat %>%
head()
## # A tibble: 6 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <fct> <fct> <dbl> <dbl> <int> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen 36.7 19.3 193 3450
## 5 Adelie Torgersen 39.3 20.6 190 3650
## 6 Adelie Torgersen 38.9 17.8 181 3625
## # ℹ 2 more variables: sex <fct>, year <int>
Store the data as a list for STAN
bill_length_mm, from Flipper Length,
flipper_length_mm# Store data for the model in a list
data_list <- list(
N = nrow(dat),
bill = dat$bill_length_mm,
flipper = dat$flipper_length_mm
)
# Look at our data list
data_list
## $N
## [1] 333
##
## $bill
## [1] 39.1 39.5 40.3 36.7 39.3 38.9 39.2 41.1 38.6 34.6 36.6 38.7 42.5 34.4 46.0
## [16] 37.8 37.7 35.9 38.2 38.8 35.3 40.6 40.5 37.9 40.5 39.5 37.2 39.5 40.9 36.4
## [31] 39.2 38.8 42.2 37.6 39.8 36.5 40.8 36.0 44.1 37.0 39.6 41.1 36.0 42.3 39.6
## [46] 40.1 35.0 42.0 34.5 41.4 39.0 40.6 36.5 37.6 35.7 41.3 37.6 41.1 36.4 41.6
## [61] 35.5 41.1 35.9 41.8 33.5 39.7 39.6 45.8 35.5 42.8 40.9 37.2 36.2 42.1 34.6
## [76] 42.9 36.7 35.1 37.3 41.3 36.3 36.9 38.3 38.9 35.7 41.1 34.0 39.6 36.2 40.8
## [91] 38.1 40.3 33.1 43.2 35.0 41.0 37.7 37.8 37.9 39.7 38.6 38.2 38.1 43.2 38.1
## [106] 45.6 39.7 42.2 39.6 42.7 38.6 37.3 35.7 41.1 36.2 37.7 40.2 41.4 35.2 40.6
## [121] 38.8 41.5 39.0 44.1 38.5 43.1 36.8 37.5 38.1 41.1 35.6 40.2 37.0 39.7 40.2
## [136] 40.6 32.1 40.7 37.3 39.0 39.2 36.6 36.0 37.8 36.0 41.5 46.1 50.0 48.7 50.0
## [151] 47.6 46.5 45.4 46.7 43.3 46.8 40.9 49.0 45.5 48.4 45.8 49.3 42.0 49.2 46.2
## [166] 48.7 50.2 45.1 46.5 46.3 42.9 46.1 47.8 48.2 50.0 47.3 42.8 45.1 59.6 49.1
## [181] 48.4 42.6 44.4 44.0 48.7 42.7 49.6 45.3 49.6 50.5 43.6 45.5 50.5 44.9 45.2
## [196] 46.6 48.5 45.1 50.1 46.5 45.0 43.8 45.5 43.2 50.4 45.3 46.2 45.7 54.3 45.8
## [211] 49.8 49.5 43.5 50.7 47.7 46.4 48.2 46.5 46.4 48.6 47.5 51.1 45.2 45.2 49.1
## [226] 52.5 47.4 50.0 44.9 50.8 43.4 51.3 47.5 52.1 47.5 52.2 45.5 49.5 44.5 50.8
## [241] 49.4 46.9 48.4 51.1 48.5 55.9 47.2 49.1 46.8 41.7 53.4 43.3 48.1 50.5 49.8
## [256] 43.5 51.5 46.2 55.1 48.8 47.2 46.8 50.4 45.2 49.9 46.5 50.0 51.3 45.4 52.7
## [271] 45.2 46.1 51.3 46.0 51.3 46.6 51.7 47.0 52.0 45.9 50.5 50.3 58.0 46.4 49.2
## [286] 42.4 48.5 43.2 50.6 46.7 52.0 50.5 49.5 46.4 52.8 40.9 54.2 42.5 51.0 49.7
## [301] 47.5 47.6 52.0 46.9 53.5 49.0 46.2 50.9 45.5 50.9 50.8 50.1 49.0 51.5 49.8
## [316] 48.1 51.4 45.7 50.7 42.5 52.2 45.2 49.3 50.2 45.6 51.9 46.8 45.7 55.8 43.5
## [331] 49.6 50.8 50.2
##
## $flipper
## [1] 181 186 195 193 190 181 195 182 191 198 185 195 197 184 194 174 180 189
## [19] 185 180 187 183 187 172 180 178 178 188 184 195 196 190 180 181 184 182
## [37] 195 186 196 185 190 182 190 191 186 188 190 200 187 191 186 193 181 194
## [55] 185 195 185 192 184 192 195 188 190 198 190 190 196 197 190 195 191 184
## [73] 187 195 189 196 187 193 191 194 190 189 189 190 202 205 185 186 187 208
## [91] 190 196 178 192 192 203 183 190 193 184 199 190 181 197 198 191 193 197
## [109] 191 196 188 199 189 189 187 198 176 202 186 199 191 195 191 210 190 197
## [127] 193 199 187 190 191 200 185 193 193 187 188 190 192 185 190 184 195 193
## [145] 187 201 211 230 210 218 215 210 211 219 209 215 214 216 214 213 210 217
## [163] 210 221 209 222 218 215 213 215 215 215 215 210 220 222 209 207 230 220
## [181] 220 213 219 208 208 208 225 210 216 222 217 210 225 213 215 210 220 210
## [199] 225 217 220 208 220 208 224 208 221 214 231 219 230 229 220 223 216 221
## [217] 221 217 216 230 209 220 215 223 212 221 212 224 212 228 218 218 212 230
## [235] 218 228 212 224 214 226 216 222 203 225 219 228 215 228 215 210 219 208
## [253] 209 216 229 213 230 217 230 222 214 215 222 212 213 192 196 193 188 197
## [271] 198 178 197 195 198 193 194 185 201 190 201 197 181 190 195 181 191 187
## [289] 193 195 197 200 200 191 205 187 201 187 203 195 199 195 210 192 205 210
## [307] 187 196 196 196 201 190 212 187 198 199 201 193 203 187 197 191 203 202
## [325] 194 206 189 195 207 202 193 210 198
Write the STAN model
Again, we need the mandatory data{},
parameters{}. and model{} blocks to make
STAN go. We will talk about the other blocks that can be
filled out in future blog articles.
Remember, each line needs to end with a semicolon,
;
In R we use # for comments. In
STAN we use two slashes, // to type our text
comments and notes
Instantiation of the STAN model can be done a few
ways:
R and place quotes around
itSTAN file that we then call
using {rstan}STAN file that we then call
using {rstan}STAN file that we then call
using {cmdstanr}STAN file that we then call
using {cmdstanpy}For the purposes of today, I’ll run everything directly in the code
chunk below so that we can see how things work. This is usually the way
I tend to work. However, there are times where storing a separate file
can be useful. In particular, the STAN model in its own
STAN File can be used by R or Python, making your model flexible for
different coding languages. So, in future blogs I will cover how to set
up a STAN File and then call it with {rstan}, {cmdstanr}, and
{cmdstanpy} (I’ll make a few Jupyer Notebooks to show how this works in
Python).
# Write the Stan Model
stan_code <- "
// Define the data
data{
int<lower=0> N; // Total number of observations
vector[N] bill; // Outcome Variable = Bill Length
vector[N] flipper; // Independent Variable = Flipper Length
}
// Specify the parameters and their data types (these will all be real numbers)
parameters {
real alpha; // Intercept
real beta; // Slope
real<lower=0> sigma; // Standard deviation of the error term
}
// Specify the model and priors (NOTE: if no priors are specified STAN will use weakly informative priors by default)
model {
// Priors
alpha ~ normal(0, 10); // Weakly informative prior for the intercept
beta ~ normal(0, 1); // Weakly informative prior for the slope
sigma ~ exponential(1); // Weakly informative prior for the standard deviation
// Likelihood
bill ~ normal(alpha + beta * flipper, sigma);
}"
mu is
the deterministic part of the linear model, \(y = a + b*x\), and sigma
represents the stochastic part of the model (the model error)There are several arguments in the stan() function. I’ll
talk about them more in future posts as we discuss what adjusting them
does. The arguments will look familiar to anyone that has fit
STAN models using {rstanarm} or {brms}
fit_stan <- stan(model_code = stan_code,
data = data_list,
iter = 2000,
warmup = 500,
chains = 4,
seed = 2026)
## Trying to compile a simple C file
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## using C compiler: ‘Apple clang version 16.0.0 (clang-1600.0.26.6)’
## using SDK: ‘MacOSX15.2.sdk’
## clang -arch x86_64 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/Rcpp/include/" -I"/Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/RcppEigen/include/" -I"/Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/RcppEigen/include/unsupported" -I"/Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/StanHeaders/include/src/" -I"/Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/StanHeaders/include/" -I"/Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/RcppParallel/include/" -I"/Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/rstan/include" -DEIGEN_NO_DEBUG -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -DSTAN_THREADS -DUSE_STANC3 -DSTRICT_R_HEADERS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -D_HAS_AUTO_PTR_ETC=0 -include '/Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp' -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1 -I/opt/R/x86_64/include -fPIC -falign-functions=64 -Wall -g -O2 -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/RcppEigen/include/Eigen/Core:19:
## /Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:679:10: fatal error: 'cmath' file not found
## 679 | #include <cmath>
## | ^~~~~~~
## 1 error generated.
## make: *** [foo.o] Error 1
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 8e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.8 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 501 / 2000 [ 25%] (Sampling)
## Chain 1: Iteration: 700 / 2000 [ 35%] (Sampling)
## Chain 1: Iteration: 900 / 2000 [ 45%] (Sampling)
## Chain 1: Iteration: 1100 / 2000 [ 55%] (Sampling)
## Chain 1: Iteration: 1300 / 2000 [ 65%] (Sampling)
## Chain 1: Iteration: 1500 / 2000 [ 75%] (Sampling)
## Chain 1: Iteration: 1700 / 2000 [ 85%] (Sampling)
## Chain 1: Iteration: 1900 / 2000 [ 95%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.411 seconds (Warm-up)
## Chain 1: 0.881 seconds (Sampling)
## Chain 1: 1.292 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 1.8e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 501 / 2000 [ 25%] (Sampling)
## Chain 2: Iteration: 700 / 2000 [ 35%] (Sampling)
## Chain 2: Iteration: 900 / 2000 [ 45%] (Sampling)
## Chain 2: Iteration: 1100 / 2000 [ 55%] (Sampling)
## Chain 2: Iteration: 1300 / 2000 [ 65%] (Sampling)
## Chain 2: Iteration: 1500 / 2000 [ 75%] (Sampling)
## Chain 2: Iteration: 1700 / 2000 [ 85%] (Sampling)
## Chain 2: Iteration: 1900 / 2000 [ 95%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.574 seconds (Warm-up)
## Chain 2: 0.894 seconds (Sampling)
## Chain 2: 1.468 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 1.6e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 3: Iteration: 501 / 2000 [ 25%] (Sampling)
## Chain 3: Iteration: 700 / 2000 [ 35%] (Sampling)
## Chain 3: Iteration: 900 / 2000 [ 45%] (Sampling)
## Chain 3: Iteration: 1100 / 2000 [ 55%] (Sampling)
## Chain 3: Iteration: 1300 / 2000 [ 65%] (Sampling)
## Chain 3: Iteration: 1500 / 2000 [ 75%] (Sampling)
## Chain 3: Iteration: 1700 / 2000 [ 85%] (Sampling)
## Chain 3: Iteration: 1900 / 2000 [ 95%] (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 0.526 seconds (Warm-up)
## Chain 3: 0.887 seconds (Sampling)
## Chain 3: 1.413 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 1.6e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 4: Iteration: 501 / 2000 [ 25%] (Sampling)
## Chain 4: Iteration: 700 / 2000 [ 35%] (Sampling)
## Chain 4: Iteration: 900 / 2000 [ 45%] (Sampling)
## Chain 4: Iteration: 1100 / 2000 [ 55%] (Sampling)
## Chain 4: Iteration: 1300 / 2000 [ 65%] (Sampling)
## Chain 4: Iteration: 1500 / 2000 [ 75%] (Sampling)
## Chain 4: Iteration: 1700 / 2000 [ 85%] (Sampling)
## Chain 4: Iteration: 1900 / 2000 [ 95%] (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.793 seconds (Warm-up)
## Chain 4: 0.964 seconds (Sampling)
## Chain 4: 1.757 seconds (Total)
## Chain 4:
Take a look at the model output
print(fit_stan)
## Inference for Stan model: anon_model.
## 4 chains, each with iter=2000; warmup=500; thin=1;
## post-warmup draws per chain=1500, total post-warmup draws=6000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
## alpha -6.61 0.08 3.13 -12.65 -8.76 -6.66 -4.50 -0.44 1482 1
## beta 0.25 0.00 0.02 0.22 0.24 0.25 0.26 0.28 1488 1
## sigma 4.13 0.00 0.16 3.84 4.02 4.13 4.24 4.47 2421 1
## lp__ -643.69 0.03 1.26 -646.89 -644.26 -643.38 -642.77 -642.28 1795 1
##
## Samples were drawn using NUTS(diag_e) at Sat Mar 7 13:11:41 2026.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
print(fit_stan, pars = c("alpha", "beta", "sigma"), probs = c(0.1, 0.5, 0.9), digits = 4)
## Inference for Stan model: anon_model.
## 4 chains, each with iter=2000; warmup=500; thin=1;
## post-warmup draws per chain=1500, total post-warmup draws=6000.
##
## mean se_mean sd 10% 50% 90% n_eff Rhat
## alpha -6.6120 0.0813 3.1288 -10.5958 -6.6642 -2.5838 1482 1.0032
## beta 0.2518 0.0004 0.0155 0.2319 0.2520 0.2715 1488 1.0032
## sigma 4.1342 0.0033 0.1620 3.9315 4.1309 4.3415 2421 1.0000
##
## Samples were drawn using NUTS(diag_e) at Sat Mar 7 13:11:41 2026.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
extract and plot posterior samples for slope coefficient
post_slope <- extract(fit_stan,
pars = "beta") %>%
as.data.frame() %>%
pull(beta)
post_slope %>%
hist()
quantile(post_slope, probs = c(0.05, 0.25, 0.5, 0.75, 0.95))
## 5% 25% 50% 75% 95%
## 0.2257220 0.2412302 0.2520162 0.2624716 0.2776138
Get the posterior draw for all parameters
posterior_draws <- as.data.frame(fit_stan)
head(posterior_draws)
## alpha beta sigma lp__
## 1 -4.946153 0.2436300 4.163618 -642.3401
## 2 -8.169144 0.2600534 4.248282 -642.7445
## 3 -6.109667 0.2497968 4.022110 -642.4536
## 4 -5.072701 0.2432961 4.112863 -642.6053
## 5 -3.081222 0.2350112 4.167247 -642.9985
## 6 -7.130912 0.2539639 4.150874 -642.2852
colMeans(posterior_draws)
## alpha beta sigma lp__
## -6.6120060 0.2518128 4.1342369 -643.6944364
apply(X = posterior_draws, MARGIN = 2, FUN = sd)
## alpha beta sigma lp__
## 3.12879034 0.01552891 0.16203310 1.25605985
Trace Plots of the MCMC
mcmc_trace(as.array(fit_stan), pars = c("alpha", "beta", "sigma"))
Density Plots of the Model Parameters
mcmc_dens_overlay(as.array(fit_stan), pars = c("alpha", "beta", "sigma"))
That’s a quick walk through setting up a STAN model and
getting it running. In future blog posts we will extend this model to a
more complex structure, talk about how to handle categorical predictors
within STAN, discuss model diagnostics, discuss making
predictions with a STAN model, and show how we can easily
transition between R and Python with our Bayesian models with limited
difficulty.