Stan is a new, open source, Bayesian inference tool. Stan is based on the the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. In order to compare Stan with JAGS, a gibbs sampler approach to Bayesian inference, I used the classic WinBUGS rats example. The rats example uses a hierarchical model to look at the growth curves of 30 rats.

In this example, I used R with RStan and R2jags packages. I used the code on the RStan Getting Started and the examples from the JAGS sourceforge page as starting points. I used a very large number of samples to test run times. Data and model files needed are below.

rats.csv

rats.bug

rats.txt

rats.stan

Both RStan and R2jags use the same list structure for the model data. In in R2jags, you need to specify which variables you wish to monitor, while RStan defaults to all of the variables. Both require an external model file and have very similar commands to fit the model. So, within R, both packages are very similar.

When you run the code, you will note that JAGS starts the sampling immediately, while Stan requires a lengthy compilation step. You will also see that I used two different calls to "stan".

```
stan.fit <- stan(file = 'rats.stan', data = rats_dat, verbose = FALSE,
warmup=1000,iter = 11000, n_chains = 3)
stan.fit2 <- stan(fit=stan.fit, data = rats_dat, verbose = FALSE,
warmup=1000,iter = 11000, n_chains = 3)
```

The first compiles the model and runs the sampler, while the second uses the model from the first command. This eliminates the need to recompile the model, saving a good chunk of time. Notice the first argument has changed from "file=" to "fit=". I also discovered that adding the "thin" option to either "stan" call creates a segfault, which I have reported the the Stan maintainers.

The answers that each program produced were identical (to the first decimal, at least). Both gave means, standard deviations, and key quantiles of the posterior distributions of the parameters, as well as R-hat, a measure of convergence you would like close to 1. The plots are slightly different, with RStan color coding the R-hat value vs. reporting it for R2jags. The output from RStan does not include a DIC value, at least by default.

**R2jags Output**

```
Inference for Bugs model at "rats.bug", fit using jags,
3 chains, each with 10000 iterations (first 1000 discarded), n.thin = 9
n.sims = 3000 iterations saved
mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat
alpha.c 242.390 3.712 237.001 240.654 242.431 244.237 247.839 1.129
alpha0 106.316 4.508 99.444 103.915 106.331 108.864 113.246 1.043
beta.c 6.185 0.109 5.968 6.116 6.185 6.258 6.394 1.002
sigma 6.136 1.182 5.266 5.776 6.065 6.395 7.088 1.001
tau.alpha 0.005 0.001 0.003 0.004 0.005 0.006 0.008 1.001
tau.beta 4.137 1.538 1.904 3.073 3.902 4.904 7.857 1.001
tau.c 0.027 0.004 0.020 0.024 0.027 0.030 0.036 1.001
deviance 967.610 23.168 941.121 957.316 966.043 976.319 998.467 1.001
n.eff
alpha.c 3000
alpha0 3000
beta.c 1400
sigma 3000
tau.alpha 3000
tau.beta 3000
tau.c 3000
deviance 3000
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = var(deviance)/2)
pD = 268.5 and DIC = 1236.1
DIC is an estimate of expected predictive error (lower deviance is better).
```

**RStan output**

```
Inference for Stan model: anon_model.
3 chains: each with iter=11000; warmup=1000; thin=1; 11000 iterations saved.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff
alpha[1] 239.9 0.0 2.7 234.6 238.1 239.9 241.7 245.1 15672
alpha[2] 247.8 0.0 2.7 242.6 246.0 247.8 249.6 253.1 16865
alpha[3] 252.4 0.0 2.7 247.1 250.6 252.4 254.2 257.7 20956
alpha[4] 232.6 0.0 2.7 227.3 230.8 232.5 234.3 237.8 21050
alpha[5] 231.6 0.0 2.7 226.4 229.8 231.6 233.4 236.8 20250
alpha[6] 249.7 0.0 2.7 244.5 247.9 249.7 251.5 255.1 18570
alpha[7] 228.7 0.0 2.7 223.4 226.9 228.7 230.5 234.0 21257
alpha[8] 248.4 0.0 2.7 243.0 246.6 248.4 250.2 253.7 16278
alpha[9] 283.3 0.0 2.7 278.0 281.5 283.3 285.1 288.6 19244
alpha[10] 219.2 0.0 2.7 213.9 217.4 219.2 221.0 224.6 17471
alpha[11] 258.2 0.0 2.7 252.9 256.4 258.2 260.1 263.6 21935
alpha[12] 228.2 0.0 2.7 222.9 226.4 228.2 230.0 233.5 19933
alpha[13] 242.4 0.0 2.7 237.1 240.6 242.4 244.2 247.7 14641
alpha[14] 268.3 0.0 2.7 262.9 266.4 268.3 270.1 273.7 20241
alpha[15] 242.8 0.0 2.7 237.6 241.0 242.8 244.6 248.0 14321
alpha[16] 245.3 0.0 2.7 239.9 243.5 245.3 247.1 250.5 15314
alpha[17] 232.2 0.0 2.7 227.0 230.4 232.2 234.0 237.4 20874
alpha[18] 240.5 0.0 2.7 235.2 238.6 240.5 242.3 245.6 16301
alpha[19] 253.8 0.0 2.7 248.5 252.0 253.8 255.6 259.1 22001
alpha[20] 241.6 0.0 2.7 236.2 239.8 241.6 243.4 246.9 14330
alpha[21] 248.5 0.0 2.7 243.3 246.8 248.5 250.3 253.8 19178
alpha[22] 225.2 0.0 2.7 220.0 223.4 225.2 227.1 230.5 19358
alpha[23] 228.5 0.0 2.7 223.2 226.7 228.5 230.3 233.8 19742
alpha[24] 245.1 0.0 2.7 239.9 243.3 245.1 246.9 250.2 14794
alpha[25] 234.5 0.0 2.7 229.2 232.7 234.5 236.2 239.7 18778
alpha[26] 254.0 0.0 2.7 248.7 252.2 254.0 255.8 259.2 19957
alpha[27] 254.4 0.0 2.7 249.1 252.6 254.4 256.2 259.7 21171
alpha[28] 243.0 0.0 2.7 237.7 241.2 243.0 244.8 248.3 14643
alpha[29] 217.9 0.0 2.7 212.6 216.1 217.9 219.7 223.3 18663
alpha[30] 241.4 0.0 2.6 236.3 239.6 241.4 243.2 246.6 16161
beta[1] 6.1 0.0 0.2 5.6 5.9 6.1 6.2 6.5 1717
beta[2] 7.1 0.0 0.3 6.6 6.9 7.1 7.2 7.5 1052
beta[3] 6.5 0.0 0.2 6.0 6.3 6.5 6.6 7.0 1218
beta[4] 5.3 0.0 0.3 4.8 5.2 5.3 5.5 5.8 1187
beta[5] 6.6 0.0 0.2 6.1 6.4 6.6 6.7 7.0 1936
beta[6] 6.2 0.0 0.2 5.7 6.0 6.2 6.3 6.7 1651
beta[7] 6.0 0.0 0.2 5.5 5.8 6.0 6.1 6.4 1702
beta[8] 6.4 0.0 0.2 5.9 6.3 6.4 6.6 6.9 1926
beta[9] 7.1 0.0 0.3 6.6 6.9 7.1 7.2 7.6 1444
beta[10] 5.8 0.0 0.2 5.4 5.7 5.8 6.0 6.3 1514
beta[11] 6.8 0.0 0.2 6.3 6.6 6.8 7.0 7.3 780
beta[12] 6.1 0.0 0.2 5.7 6.0 6.1 6.3 6.6 1924
beta[13] 6.2 0.0 0.2 5.7 6.0 6.2 6.3 6.6 1283
beta[14] 6.7 0.0 0.3 6.2 6.5 6.7 6.9 7.2 1523
beta[15] 5.4 0.0 0.3 4.9 5.2 5.4 5.6 5.9 1000
beta[16] 5.9 0.0 0.2 5.4 5.8 5.9 6.1 6.4 1920
beta[17] 6.3 0.0 0.2 5.8 6.1 6.3 6.4 6.8 2247
beta[18] 5.8 0.0 0.2 5.4 5.7 5.8 6.0 6.3 1694
beta[19] 6.4 0.0 0.2 5.9 6.3 6.4 6.6 6.9 1184
beta[20] 6.1 0.0 0.2 5.6 5.9 6.1 6.2 6.5 1666
beta[21] 6.4 0.0 0.2 5.9 6.2 6.4 6.6 6.9 1618
beta[22] 5.9 0.0 0.2 5.4 5.7 5.9 6.0 6.3 2248
beta[23] 5.8 0.0 0.2 5.3 5.6 5.8 5.9 6.2 1597
beta[24] 5.9 0.0 0.2 5.4 5.7 5.9 6.0 6.4 2203
beta[25] 6.9 0.0 0.2 6.4 6.7 6.9 7.1 7.4 1778
beta[26] 6.5 0.0 0.2 6.1 6.4 6.5 6.7 7.0 1912
beta[27] 5.9 0.0 0.2 5.4 5.7 5.9 6.1 6.4 1546
beta[28] 5.8 0.0 0.2 5.4 5.7 5.8 6.0 6.3 1776
beta[29] 5.7 0.0 0.2 5.2 5.5 5.7 5.8 6.2 1514
beta[30] 6.1 0.0 0.2 5.7 6.0 6.1 6.3 6.6 1630
mu_alpha 242.4 0.0 2.8 236.9 240.6 242.4 244.3 247.9 12061
mu_beta 6.2 0.0 0.1 6.0 6.1 6.2 6.3 6.4 2855
sigmasq_y 37.3 0.1 5.7 27.8 33.2 36.7 40.7 50.0 2879
sigmasq_alpha 219.1 1.2 64.6 125.7 173.1 208.1 252.0 374.3 2934
sigmasq_beta 0.3 0.0 0.1 0.1 0.2 0.3 0.3 0.5 408
sigma_y 6.1 0.0 0.5 5.3 5.8 6.1 6.4 7.1 2864
sigma_alpha 14.7 0.0 2.1 11.2 13.2 14.4 15.9 19.3 2966
sigma_beta 0.5 0.0 0.1 0.4 0.5 0.5 0.6 0.7 402
alpha0 106.3 0.1 3.7 99.0 103.9 106.3 108.8 113.6 5365
lp__ -438.4 0.1 6.9 -453.0 -442.7 -437.9 -433.5 -426.2 2614
Rhat
alpha[1] 1
alpha[2] 1
alpha[3] 1
alpha[4] 1
alpha[5] 1
alpha[6] 1
alpha[7] 1
alpha[8] 1
alpha[9] 1
alpha[10] 1
alpha[11] 1
alpha[12] 1
alpha[13] 1
alpha[14] 1
alpha[15] 1
alpha[16] 1
alpha[17] 1
alpha[18] 1
alpha[19] 1
alpha[20] 1
alpha[21] 1
alpha[22] 1
alpha[23] 1
alpha[24] 1
alpha[25] 1
alpha[26] 1
alpha[27] 1
alpha[28] 1
alpha[29] 1
alpha[30] 1
beta[1] 1
beta[2] 1
beta[3] 1
beta[4] 1
beta[5] 1
beta[6] 1
beta[7] 1
beta[8] 1
beta[9] 1
beta[10] 1
beta[11] 1
beta[12] 1
beta[13] 1
beta[14] 1
beta[15] 1
beta[16] 1
beta[17] 1
beta[18] 1
beta[19] 1
beta[20] 1
beta[21] 1
beta[22] 1
beta[23] 1
beta[24] 1
beta[25] 1
beta[26] 1
beta[27] 1
beta[28] 1
beta[29] 1
beta[30] 1
mu_alpha 1
mu_beta 1
sigmasq_y 1
sigmasq_alpha 1
sigmasq_beta 1
sigma_y 1
sigma_alpha 1
sigma_beta 1
alpha0 1
lp__ 1
Sample were drawn using NUTS2 at Wed Sep 5 21:55:07 2012.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
```

One interesting difference between the two outputs is the R-hat value for the alpha0 variable. Alpha0 is generated quantity, representing the value of the intercept when non-centered data is used. For R2jags, the value of R-hat is 1.228, while R-hat is 1 for RStan. A quick look at the output indicates that R2jags used a thin value of 9, while RStan defaults to 1 (and creates a segfault when changed).

There is a major speed difference between the two methods. Now, it should be noted that JAGS can be faster on some problems and Stan can be faster on others. On my machine, the JAGS sampler took 7.1 seconds, while the Stan sampler took 25 seconds, which was a bit shocking to me. And this was timed using the second "stan" call, when the Stan model was pre-compiled. This is something that needs to be explored further.