This chapter reproduces the example from the previous chapter. The reader should consult that chapter for an explanation of one-way analysis of variance with blocks. Here, the analysis is done with a mixed effects model, with the treatments treated as a fixed effect and the blocks treated as a random effect.
In analysis of variance, blocking variables are often treated as random variables. This is because the blocking variable represents a random selection of the possible levels of that variable. The analyst wants to take the effects of the blocking variable into account, but the identity of the specific levels of the blocks is not of interest.
In the example in this chapter, the instructors are focused on the effect of their different nutrition education programs, which are the treatments; they are not concerned about the effect of one specific town or another per se, but do want to take into account any differences due to the different towns.
For a more complete discussion of random effects, see the Using Random Effects in Models chapter.
Packages used in this chapter
The packages used in this chapter include:
• psych
• lme4
• lmerTest
• multcompView
• lsmeans
• nlme
• car
• rcompanion
The following commands will install these packages if they are not already installed:
if(!require(psych)){install.packages("psych")}
if(!require(lme4)){install.packages("lme4")}
if(!require(lmerTest)){install.packages("lmerTest")}
if(!require(multcompView)){install.packages("multcompView")}
if(!require(lsmeans)){install.packages("lsmeans")}
if(!require(nlme)){install.packages("nlme")}
if(!require(car)){install.packages("car")}
if(!require(rcompanion)){install.packages("rcompanion")}
One-way ANOVA with random blocks example
Input = ("
Instructor Town Sodium
'Brendon Small' Squiggleville 1200
'Brendon Small' Squiggleville 1400
'Brendon Small' Squiggleville 1350
'Brendon Small' Metalocalypse 950
'Brendon Small' Squiggleville 1400
'Brendon Small' Squiggleville 1150
'Brendon Small' Squiggleville 1300
'Brendon Small' Metalocalypse 1325
'Brendon Small' Metalocalypse 1425
'Brendon Small' Squiggleville 1500
'Brendon Small' Squiggleville 1250
'Brendon Small' Metalocalypse 1150
'Brendon Small' Metalocalypse 950
'Brendon Small' Squiggleville 1150
'Brendon Small' Metalocalypse 1600
'Brendon Small' Metalocalypse 1300
'Brendon Small' Metalocalypse 1050
'Brendon Small' Metalocalypse 1300
'Brendon Small' Squiggleville 1700
'Brendon Small' Squiggleville 1300
'Coach McGuirk' Squiggleville 1100
'Coach McGuirk' Squiggleville 1200
'Coach McGuirk' Squiggleville 1250
'Coach McGuirk' Metalocalypse 1050
'Coach McGuirk' Metalocalypse 1200
'Coach McGuirk' Metalocalypse 1250
'Coach McGuirk' Squiggleville 1350
'Coach McGuirk' Squiggleville 1350
'Coach McGuirk' Squiggleville 1325
'Coach McGuirk' Squiggleville 1525
'Coach McGuirk' Squiggleville 1225
'Coach McGuirk' Squiggleville 1125
'Coach McGuirk' Metalocalypse 1000
'Coach McGuirk' Metalocalypse 1125
'Coach McGuirk' Squiggleville 1400
'Coach McGuirk' Metalocalypse 1200
'Coach McGuirk' Squiggleville 1150
'Coach McGuirk' Squiggleville 1400
'Coach McGuirk' Squiggleville 1500
'Coach McGuirk' Squiggleville 1200
'Melissa Robins' Metalocalypse 900
'Melissa Robins' Metalocalypse 1100
'Melissa Robins' Metalocalypse 1150
'Melissa Robins' Metalocalypse 950
'Melissa Robins' Metalocalypse 1100
'Melissa Robins' Metalocalypse 1150
'Melissa Robins' Squiggleville 1250
'Melissa Robins' Squiggleville 1250
'Melissa Robins' Squiggleville 1225
'Melissa Robins' Squiggleville 1325
'Melissa Robins' Metalocalypse 1125
'Melissa Robins' Metalocalypse 1025
'Melissa Robins' Metalocalypse 950
'Melissa Robins' Metalocalypse 925
'Melissa Robins' Squiggleville 1200
'Melissa Robins' Metalocalypse 1100
'Melissa Robins' Metalocalypse 950
'Melissa Robins' Metalocalypse 1300
'Melissa Robins' Squiggleville 1400
'Melissa Robins' Metalocalypse 1100
")
Data = read.table(textConnection(Input),header=TRUE)
### Order factors by the order in data frame
### Otherwise, R will alphabetize them
Data$Instructor = factor(Data$Instructor,
levels=unique(Data$Instructor))
Data$Town = factor(Data$Town,
levels=unique(Data$Town))
### Check the data frame
library(psych)
headTail(Data)
str(Data)
summary(Data)
### Remove unnecessary objects
rm(Input)
Mixed-effects model with lmer
In this first example, the model will be specified with the lmer function in the package lme4. The term (1|Town) in the model formula indicates that Town should be treated as a random variable, with each level having its own intercept in the model. The anova function in the package lmerTest is used to produce p-values for the fixed effects. The rand function in the package lmerTest produces p-values for the random effects.
Technical note
lmerTest::anova by default returns p-values for Type III sums of squares with a Satterthwaite approximation for the degrees of freedom.
library(lme4)
library(lmerTest)
model = lmer(Sodium ~ Instructor + (1|Town),
data=Data,
REML=TRUE)
anova(model)
Analysis of Variance Table of type III with Satterthwaite
approximation for degrees of freedom
           Sum Sq Mean Sq NumDF DenDF F.value  Pr(>F)
Instructor 154766   77383     2 56.29  3.7413 0.02982 *
rand(model)
Analysis of Random effects Table:
Chi.sq Chi.DF p.value
Town 10.5 1 0.001 **
p-value and pseudo R-squared for model
To use the nagelkerke function with an lmer object, the null model with which to compare the lmer model has to be specified explicitly.
One approach is to define the null model as one with no fixed effects except for an intercept, indicated with a 1 on the right side of the ~, while retaining the random effects, in this case (1|Town).
model.null =
lmer(Sodium ~ 1 + (1|Town),
data = Data,
REML = TRUE)
anova(model,
model.null)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
..1 3 782.34 788.63 -388.17 776.34
object 5 778.82 789.29 -384.41 768.82 7.5255 2 0.02322 *
library(rcompanion)
nagelkerke(fit = model,
null = model.null)
$Pseudo.R.squared.for.model.vs.null
Pseudo.R.squared
McFadden 0.00971559
Cox and Snell (ML) 0.11818400
Nagelkerke (Cragg and Uhler) 0.11818400
Another approach to determining the p-value and pseudo R-squared for an lmer model is to compare the model to a null model with only an intercept and neither the fixed nor the random effects. To accomplish this, the null model has to be specified with lm rather than lmer. However, it may not be permissible to compare models specified with different functions in this way. The anova function in lmerTest does allow these comparisons, as does the nagelkerke function, although I do not know if the results are valid.
model.null.2 = lm(Sodium ~ 1,
                  data = Data)
anova(model,
model.null.2)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
..1 2 792.07 796.26 -394.04 788.07
object 5 778.82 789.29 -384.41 768.82 19.253 3 0.0002424 ***
library(rcompanion)
nagelkerke(fit = model,
null = model.null.2)
$Pseudo.R.squared.for.model.vs.null
Pseudo.R.squared
McFadden 0.0239772
Cox and Snell (ML) 0.2701590
Nagelkerke (Cragg and Uhler) 0.2701600
Post-hoc analysis
The lsmeans package is able to handle lmer objects. For further details, see ?lsmeans::models. For a review of mean separation tests and least square means, see the chapters What are Least Square Means? and Least Square Means for Multiple Comparisons, and the “Post-hoc analysis: mean separation tests” section in the One-way ANOVA chapter.
The lmerTest package has a function difflsmeans that also compares marginal means for treatments, but its output doesn’t include a compact letter display or adjusted p-values.
library(multcompView)
library(lsmeans)
marginal = lsmeans(model,
~ Instructor)
CLD = cld(marginal,
          alpha=0.05,
          Letters=letters,    ### Use lower-case letters for .group
          adjust="tukey")     ### Tukey-adjusted comparisons
CLD
Instructor lsmean SE df lower.CL upper.CL .group
Melissa Robins 1153.201 83.01880 1.24 476.7519 1829.650 a
Coach McGuirk 1216.799 83.01880 1.24 540.3498 1893.248 ab
Brendon Small 1280.137 82.60038 1.22 588.9293 1971.345 b
Confidence level used: 0.95
P value adjustment: tukey method for comparing a family of 3 estimates
significance level used: alpha = 0.05
pairs(marginal,
adjust="tukey")
$contrasts
 contrast                        estimate       SE    df t.ratio p.value
 Brendon Small - Coach McGuirk   63.33828 45.93368 56.11   1.379  0.3587
 Brendon Small - Melissa Robins 126.93620 46.73138 56.29   2.716  0.0234
 Coach McGuirk - Melissa Robins  63.59792 48.62097 56.62   1.308  0.3967

P value adjustment: tukey method for comparing a family of 3 estimates
A similar analysis can be conducted with the difflsmeans function in the lmerTest package.
library(lmerTest)
difflsmeans(model,
test.effs="Instructor")
Differences of LSMEANS:
                                           Estimate Standard Error   DF t-value Lower CI Upper CI p-value
Instructor Brendon Small - Coach McGuirk       63.3           45.8 56.1    1.38    -28.5      155   0.172
Instructor Brendon Small - Melissa Robins     126.9           46.5 56.3    2.73     33.9      220   0.008 **
Instructor Coach McGuirk - Melissa Robins      63.6           48.0 56.6    1.33    -32.5      160   0.190
The following code extracts a data frame from the difflsmeans output, and then adds a column of p-values adjusted by the fdr method. See ?p.adjust for other adjustment options.
comparison = difflsmeans(model,
test.effs="Instructor")[[1]]
p.value = comparison$"p-value"
comparison$p.adj = p.adjust(p.value,
method = "fdr")
comparison
                                           Estimate Standard Error   DF t-value Lower CI Upper CI p-value  p.adj
Instructor Brendon Small - Coach McGuirk    63.3383        45.8366 56.1    1.38 -28.4807 155.1573  0.1725 0.1902
Instructor Brendon Small - Melissa Robins  126.9362        46.4659 56.3    2.73  33.8631 220.0093  0.0084 0.0252
Instructor Coach McGuirk - Melissa Robins   63.5979        47.9651 56.6    1.33 -32.4661 159.6619  0.1902 0.1902
Mixed-effects model with nlme
Mixed models can also be specified with the nlme package. The formula notation for this package is slightly different from that for the lme4 package. The expression random=~1|Town indicates that Town should be treated as a random variable, with each level having its own intercept in the model.
The fixed effects in the model can be tested with the Anova function in the car package.
library(nlme)
model = lme(Sodium ~ Instructor, random=~1|Town,
data=Data,
method="REML")
library(car)
Anova(model)
Analysis of Deviance Table (Type II tests)
Chisq Df Pr(>Chisq)
Instructor 7.4827 2 0.02372 *
The random effects in the model can be tested by specifying a null model with only fixed effects and comparing it to the full model with anova. The nlme package has a function gls that creates model objects without random effects in a manner analogous to those specified with lme.
model.null = gls(Sodium ~ Instructor,
data=Data,
method="REML")
anova(model,
model.null)
Model df AIC BIC logLik Test L.Ratio p-value
model 1 5 749.9281 760.1434 -369.9641
model.null 2 4 758.4229 766.5951 -375.2114 1 vs 2 10.49476 0.0012
p-value and pseudo R-squared for model
By default, the nagelkerke function will compare an lme object to a null model with the fixed effects removed, but the random effects retained.
library(rcompanion)
nagelkerke(model)
$Models
Model: "lme.formula, Sodium ~ Instructor, Data, ~1 | Town, REML"
Null: "lme.formula, Sodium ~ 1, Data, ~1 | Town, REML"
$Pseudo.R.squared.for.model.vs.null
Pseudo.R.squared
McFadden 0.00971559
Cox and Snell (ML) 0.11818400
Nagelkerke (Cragg and Uhler) 0.11818400
$Likelihood.ratio.test
Df.diff LogLik.diff Chisq p.value
-2 -3.7732 7.5463 0.02298
$Messages
[1] "Note: For models fit with REML, these statistics are based on
refitting with ML"
Another approach to determining the p-value and pseudo R-squared for an lme model is to compare the model to a null model with only an intercept and neither the fixed nor the random effects. To accomplish this, the null model has to be specified with the gls function.
model.null.2 = gls(Sodium ~ 1,
data=Data,
method="REML")
library(rcompanion)
nagelkerke(fit = model,
null = model.null.2)
$Pseudo.R.squared.for.model.vs.null
Pseudo.R.squared
McFadden 0.0239772
Cox and Snell (ML) 0.2701590
Nagelkerke (Cragg and Uhler) 0.2701600
$Likelihood.ratio.test
Df.diff LogLik.diff Chisq p.value
-3 -9.4479 18.896 0.00028731
Post-hoc analysis
The lsmeans package is able to handle lme objects. For further details, see ?lsmeans::models. For a review of mean separation tests and least square means, see the chapters What are Least Square Means? and Least Square Means for Multiple Comparisons, and the “Post-hoc analysis: mean separation tests” section in the One-way ANOVA chapter.
library(multcompView)
library(lsmeans)
marginal = lsmeans(model,
~ Instructor)
CLD = cld(marginal,
          alpha = 0.05,
          Letters = letters,   ### Use lower-case letters for .group
          adjust = "tukey")    ### Tukey-adjusted comparisons
CLD
Instructor lsmean SE df lower.CL upper.CL .group
Melissa Robins 1153.201 82.92335 1 99.55992 2206.842 a
Coach McGuirk 1216.799 82.92335 1 163.15784 2270.440 ab
Brendon Small 1280.137 82.59437 1 230.67625 2329.598 b
Confidence level used: 0.95
P value adjustment: tukey method for comparing a family of 3 estimates
significance level used: alpha = 0.05
pairs(marginal,
adjust="tukey")
contrast estimate SE df t.ratio p.value
Brendon Small - Coach McGuirk 63.33828 45.83662 56 1.382 0.3572
Brendon Small - Melissa Robins 126.93620 46.46588 56 2.732 0.0225
Coach McGuirk - Melissa Robins 63.59792 47.96514 56 1.326 0.3869
P value adjustment: tukey method for comparing a family of 3 estimates
Comparison of results from one-way ANOVA with blocks
Just for interest, a comparison of results from this chapter and the previous chapter is presented here. There is some variation in the values obtained for these statistics.
                                 Fixed      lmer      lme
p-value for Instructor           0.034      0.030     0.024
p-value for Town                 0.00019    0.001     0.0012
Mean separation for Instructor   a          a         a
                                 ab         ab        ab
                                 b          b         b
Overall p-value                  < 0.0001   0.00024   0.00029
R-square or Pseudo R-square      0.348      0.270     0.270
  (Nagelkerke)