One commonly requested feature is to be able to run a post hoc Markov chain Monte Carlo analysis based on the results of a frequentist fit. This is often a reasonable shortcut for computing confidence intervals and p-values that allow for finite-sized samples rather than relying on asymptotic sampling distributions. This vignette shows an example of such an analysis. Some caveats:
Load packages:
library(glmmTMB)
library(coda) ## MCMC utilities
library(reshape2) ## for melt()
## graphics
library(lattice)
library(ggplot2); theme_set(theme_bw())
Fit basic model:
data("sleepstudy",package="lme4")
fm1 <- glmmTMB(Reaction ~ Days + (Days|Subject),
sleepstudy)
Set up for MCMC: define scaled log-posterior function (in this case the log-likelihood function); extract coefficients and variance-covariance matrices as starting points.
## FIXME: is there a better way for user to extract full coefs?
rawcoef <- with(fm1$obj$env,last.par[-random])
names(rawcoef) <- make.names(names(rawcoef),unique=TRUE)
## log-likelihood function
## (MCMCmetrop1R wants *positive* log-lik)
logpost_fun <- function(x) -fm1$obj$fn(x)
## check definitions
stopifnot(all.equal(c(logpost_fun(rawcoef)),
c(logLik(fm1)),
tolerance=1e-7))
V <- vcov(fm1,full=TRUE)
This is a basic block Metropolis sampler, based on Florian Hartig's code here.
##' @param start starting value
##' @param V variance-covariance matrix of MVN candidate distribution
##' @param iterations total iterations
##' @param nsamp number of samples to store
##' @param burnin number of initial samples to discard
##' @param thin thinning interval
##' @param tune tuning parameters; expand/contract V
##' @param seed random-number seed
run_MCMC <- function(start,
V,
logpost_fun,
iterations = 10000,
nsamp = 1000,
burnin = iterations/2,
thin = floor((iterations-burnin)/nsamp),
tune = NULL,
seed=NULL
) {
## initialize
if (!is.null(seed)) set.seed(seed)
if (!is.null(tune)) {
tunesq <- if (length(tune)==1) tune^2 else outer(tune,tune)
V <- V*tunesq
}
chain <- matrix(NA, nsamp+1, length(start))
chain[1,] <- cur_par <- start
postval <- logpost_fun(cur_par)
j <- 1
for (i in 1:iterations) {
proposal = MASS::mvrnorm(1,mu=cur_par, Sigma=V)
newpostval <- logpost_fun(proposal)
accept_prob <- exp(newpostval - postval)
if (runif(1) < accept_prob) {
cur_par <- proposal
postval <- newpostval
}
if ((i>burnin) && (i %% thin == 1)) {
chain[j+1,] <- cur_par
j <- j + 1
}
}
chain <- na.omit(chain)
colnames(chain) <- names(start)
chain <- coda::mcmc(chain)
return(chain)
}
Run the chain:
t1 <- system.time(m1 <- run_MCMC(start=rawcoef,
V=V, logpost_fun=logpost_fun,
seed=1001))
(running this chain takes 4.5 seconds)
Add more informative names and transform correlation parameter (see vignette on covariance structures and parameters):
colnames(m1) <- c(names(fixef(fm1)[[1]]),
"log(sigma)",
c("log(sd_Intercept)","log(sd_Days)","cor"))
m1[,"cor"] <- sapply(m1[,"cor"], get_cor)
xyplot(m1,layout=c(2,3),asp="fill")
The trace plots are poor, especially for the correlation; the effective sample size backs this up, as would any other diagnostics we did.
print(effectiveSize(m1),digits=3)
In a real analysis we would stop and fix the mixing/convergence problems before proceeding; for this simple sampler, some of our choices would be (1) simply run the chain for longer; (2) tune the candidate distribution (e.g. by using tune
to scale some parameters, or perhaps by switching to a multivariate Student t distribution [see the mvtnorm
package]); (3) add regularizing priors.
Ignoring the problems and proceeding, we can compute column-wise quantiles or highest posterior density intervals (coda::HPDinterval
) to get confidence intervals. Plotting posterior distributions, omitting the intercept because it's on a very different scale.
The tmbstan
package allows direct, simple access to a hybrid/Hamiltonian Monte Carlo algorithm for sampling from a TMB object; the $obj
component of a glmmTMB
fit is such an object. (To run this example you'll need to install the tmbstan
package and its dependencies.)
## install.packages("tmbstan")
library(tmbstan)
t2 <- system.time(m2 <- tmbstan(fm1$obj))
(running this command, which creates 4 chains, takes 336.1 seconds)
However, there are many indications (warning messages; trace plots) that the correlation parameter needs to be given a more informative prior. (In the plot below, the correlation parameter is shown on its unconstrained scale; the actual correlation would be \(\theta_3/\sqrt{1+\theta_3^2}\).)
Owls