06 May, 2022

Linear models

How do we maximize power?

  • increase replication
  • add covariates (account for conditions)
  • block (control conditions)

Hierarchical models

How do we increase power without more sites (replicates)?

Subreplicates - yet these are not independent

Nested design

Block treatments together - yet these are not independent

Randomized complete block design

Hierarchical models

Linear modelling assumptions

  • Normality
  • Homogeneity of Variance
  • Linearity
  • Independence

Non-independence

  • one response is triggered by another
  • temporal/spatial autocorrelation
  • nested (hierarchical) design structures

Hierarchical models

  • linear model with separate covariance structure per block
  • fixed and random factors (effects)

Example

data.rcb <- read.csv('../data/data.rcb.csv')
head(data.rcb)
##          y        x  block
## 1 281.1091 18.58561 Block1
## 2 295.6535 26.04867 Block1
## 3 328.3234 40.09974 Block1
## 4 360.1672 63.57455 Block1
## 5 276.7050 14.11774 Block1
## 6 348.9709 62.88728 Block1

Example

library(ggplot2)
ggplot(data.rcb, aes(y=y, x=x)) + geom_point() + geom_smooth(method='lm')

Example

library(ggplot2)
ggplot(data.rcb, aes(y=y, x=x, color=block)) + geom_point() +
    geom_smooth(method='lm')

Example

Simple linear regression - wrong

data.rcb.lm <- lm(y~x, data.rcb)
plot(residuals(data.rcb.lm) ~ data.rcb$block)

Example

Simple ANCOVA - wrong

data.rcb.lm1 <- lm(y~x+block, data=data.rcb)
plot(residuals(data.rcb.lm1) ~ data.rcb$block)

Example



  • Looks good, except for INDEPENDENCE
  • Can we deal with that via a correlation structure?

\[ \text{Variance-covariance per block: } \mathbf{V} = \begin{pmatrix} \sigma^2&\rho&\cdots&\rho\\ \rho&\sigma^2&\cdots&\rho\\ \vdots&\vdots&\ddots&\vdots\\ \rho&\rho&\cdots&\sigma^2\\ \end{pmatrix} \]
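
As a sketch of that idea, the compound symmetry structure above can be fitted directly as a marginal model with nlme::gls (for a random intercept per block, the two parameterizations are broadly equivalent):

library(nlme)
# model the within-block correlation directly as compound symmetry
data.rcb.gls <- gls(y ~ x, data = data.rcb,
                    correlation = corCompSymm(form = ~1|block),
                    method = 'REML')
summary(data.rcb.gls)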

Linear mixed effects model

library(nlme)
data.rcb.lme <- lme(y~x, random=~1|block, data=data.rcb, method='REML')
plot(data.rcb.lme)  # standardized residuals vs fitted

plot(residuals(data.rcb.lme, type='normalized') ~ fitted(data.rcb.lme))

plot(residuals(data.rcb.lme, type='normalized') ~ data.rcb$block)

Linear mixed effects model

summary(data.rcb.lme)
## Linear mixed-effects model fit by REML
##   Data: data.rcb 
##        AIC      BIC   logLik
##   458.9521 467.1938 -225.476
## 
## Random effects:
##  Formula: ~1 | block
##         (Intercept) Residual
## StdDev:    18.10888 8.905485
## 
## Fixed effects:  y ~ x 
##                Value Std.Error DF  t-value p-value
## (Intercept) 232.8193  7.823393 53 29.75937       0
## x             1.4591  0.063789 53 22.87392       0
##  Correlation: 
##   (Intr)
## x -0.292
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -2.09947262 -0.57994305 -0.04874031  0.56685096  2.49464217 
## 
## Number of Observations: 60
## Number of Groups: 6

Linear mixed effects model

anova(data.rcb.lme)
##             numDF denDF   F-value p-value
## (Intercept)     1    53 1452.2883  <.0001
## x               1    53  523.2164  <.0001
intervals(data.rcb.lme)
## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                  lower       est.      upper
## (Intercept) 217.127551 232.819291 248.511031
## x             1.331156   1.459101   1.587045
## 
##  Random Effects:
##   Level: block 
##                    lower     est.    upper
## sd((Intercept)) 9.597555 18.10888 34.16822
## 
##  Within-group standard error:
##     lower      est.     upper 
##  7.361788  8.905485 10.772879

Linear mixed effects model

# proportion of total variance attributable to block vs residual
vc <- as.numeric(as.matrix(VarCorr(data.rcb.lme))[,1])
vc/sum(vc)
## [1] 0.8052553 0.1947447

Linear mixed effects model

library(effects)
plot(allEffects(data.rcb.lme, partial.residuals=TRUE))

Linear mixed effects model

# population-level predictions (fixed effects only)
predict(data.rcb.lme, newdata=data.frame(x=30:40), level=0)
##  [1] 276.5923 278.0514 279.5105 280.9696 282.4287 283.8878 285.3469 286.8060
##  [9] 288.2651 289.7242 291.1833
## attr(,"label")
## [1] "Predicted values"

Linear mixed effects model

# block-level predictions (fixed effects plus the Block1 random intercept)
predict(data.rcb.lme, newdata=data.frame(x=30:40, block='Block1'), level=1)
##   Block1   Block1   Block1   Block1   Block1   Block1   Block1   Block1 
## 302.7422 304.2013 305.6604 307.1195 308.5786 310.0377 311.4968 312.9559 
##   Block1   Block1   Block1 
## 314.4150 315.8741 317.3332 
## attr(,"label")
## [1] "Predicted values"

Generalized Linear Mixed Effects Models

Parameter Estimation

lm --> LME
(integrate the likelihood across all unobserved levels of the random effects)

glm --> GLMM
Not so easy - need to approximate:

  • Penalized quasi-likelihood
  • Laplace approximation
  • Gauss-Hermite quadrature

Penalized quasi-likelihood (PQL)

Iterative (re)weighting

  • fit an LMM to estimate the variance-covariance (vcov) structure
  • estimate the fixed effects by fitting a GLM (incorporating the vcov)
  • refit the LMM to re-estimate the vcov
  • cycle until convergence

Penalized quasi-likelihood (PQL)

Advantages

  • relatively simple
  • leverage variance-covariance structures for heterogeneity and dependency structures

Disadvantages

  • biased when expected values are small (<5)
  • approximates the likelihood (no AIC or LRT)
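
A minimal PQL sketch with MASS::glmmPQL, assuming a hypothetical data frame dat with a Poisson count response y, a predictor x, and a grouping factor block:

library(MASS)
# PQL alternates between LMM fits (vcov) and GLM fits (fixed effects)
dat.pql <- glmmPQL(y ~ x, random = ~1|block, family = poisson, data = dat)
summary(dat.pql)  # Wald t-tests; no logLik/AIC (quasi-likelihood)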

Laplace approximation

Second-order Taylor series expansion - to approximate the likelihood at unobserved levels of the random effects

Advantages

  • more accurate

Disadvantages

  • slower
  • no way to incorporate vcov structures
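
A minimal sketch with lme4::glmer, reusing the hypothetical dat from the PQL example (the Laplace approximation is the glmer default, nAGQ = 1):

library(lme4)
# Laplace approximation (the default, nAGQ = 1)
dat.glmer <- glmer(y ~ x + (1|block), family = poisson, data = dat)
summary(dat.glmer)  # Wald z-tests; logLik/AIC available for LRTs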

Gauss-Hermite quadrature (GHQ)

  • approximates the value of the integrals at specific points (quadratures)
  • points (and weights) selected by an optimizer

Advantages

  • even more accurate

Disadvantages

  • even slower
  • no way to incorporate vcov structures
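
In lme4, raising nAGQ switches glmer from Laplace to adaptive Gauss-Hermite quadrature; a sketch with the same hypothetical dat (glmer only allows nAGQ > 1 for a single scalar random effect):

library(lme4)
# adaptive GHQ with 25 quadrature points
dat.ghq <- glmer(y ~ x + (1|block), family = poisson, data = dat, nAGQ = 25)
summary(dat.ghq)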

Markov Chain Monte Carlo (MCMC)

  • recreate the likelihood by sampling proportionally to the likelihood

Advantages

  • very accurate (not an approximation)
  • very robust

Disadvantages

  • very slow
  • currently complex
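
One of the numerous MCMC options is MCMCglmm; a minimal sketch with the same hypothetical dat, using the package's default priors and explicit chain settings:

library(MCMCglmm)
# Bayesian GLMM sampled via MCMC
dat.mcmc <- MCMCglmm(y ~ x, random = ~block, family = 'poisson',
                     data = dat, nitt = 13000, burnin = 3000, thin = 10)
summary(dat.mcmc)  # posterior means, credibility intervals, pMCMC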

Inference (hypothesis) testing

GLMM

Depends on:

  • Estimation engine (PQL, Laplace, GHQ)
  • Overdispersion
  • Fixed or random factors

Inference (hypothesis) testing

| Approximation | Characteristics | Associated inference | R function |
|---|---|---|---|
| Penalized Quasi-likelihood (PQL) | Fast and simple, accommodates heterogeneity and dependency structures, biased for small samples | Wald tests only | glmmPQL (MASS) |
| Laplace | More accurate (less biased), slower, does not accommodate heterogeneity and dependency structures | LRT | glmer (lme4), glmmTMB (glmmTMB) |
| Gauss-Hermite quadrature | Even more accurate (less biased), even slower, does not accommodate heterogeneity and dependency structures, cannot handle more than 1 random effect | LRT | glmer (lme4)?? *Does not seem to work* |
| Markov Chain Monte Carlo (MCMC) | Bayesian, very flexible and accurate, yet very slow and complex | Bayesian credibility intervals, Bayesian P-values | Numerous |

Inference (hypothesis) testing

| Feature | glmmPQL (MASS) | glmer (lme4) | glmmTMB (glmmTMB) | MCMC |
|---|---|---|---|---|
| Variance and covariance structures | Yes | - | not yet | Yes |
| Overdispersed (Quasi) families | Yes | - | - | Yes |
| Mixture families | limited | limited | some | Yes |
| Complex nesting | Yes | Yes | Yes | Yes |
| Zero-inflation | - | - | Yes | Yes |
| Resid degrees of freedom | Between-Within | - | - | - |
| Parameter tests | Wald $t$ | Wald $z$ | Wald $z$ | UI |
| Marginal tests (fixed effects) | Wald $F$, $\chi^2$ | Wald $F$, $\chi^2$ | Wald $F$, $\chi^2$ | UI |
| Marginal tests (random effects) | Wald $F$, $\chi^2$ | LRT | LRT | UI |
| Information criterion | - | AIC | AIC | AIC, WAIC |

Inference (hypothesis) testing

Additional assumptions



  • dispersion (see the sketch below)
  • (multi)collinearity
  • design balance and Type III (marginal) SS
  • heteroscedasticity
  • spatial/temporal autocorrelation
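
A common approximate dispersion check, assuming a glmer fit such as the hypothetical dat.glmer above: the ratio of the Pearson chi-square statistic to the residual degrees of freedom should be close to 1.

library(lme4)
# approximate dispersion statistic for a fitted GLMM
pearson.resid <- residuals(dat.glmer, type = 'pearson')
sum(pearson.resid^2) / df.residual(dat.glmer)  # >> 1 suggests overdispersion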