## Summary and Analysis of Extension Program Evaluation in R

Salvatore S. Mangiafico

# One-way ANOVA with Random Blocks

This chapter reproduces the example from the previous chapter.  The reader should consult that chapter for an explanation of one-way analysis of variance with blocks.  Here, the analysis is done with a mixed effects model, with the treatments treated as a fixed effect and the blocks treated as a random effect.

In analysis of variance, blocking variables are often treated as random variables.  This is because the blocking variable represents a random selection of levels of that variable.  The analyst wants to take the effects of the blocking variable into account, but the identity of the specific levels of the blocks are not of interest.

In the example in this chapter, the instructors are focused on the effect of their different nutrition education programs, which are the treatments; they are not concerned about the effect of one specific town or another per se, but do want to want to take into account any differences due to the different towns.

For a more complete discussion of random effects, see the Using Random Effects in Models chapter.

### Packages used in this chapter

The packages used in this chapter include:

•  psych

•  lme4

•  lmerTest

•  multcompView

•  lsmeans

•  nlme

•  car

•  rcompanion

The following commands will install these packages if they are not already installed:

if(!require(psych)){install.packages("psych")}
if(!require(lme4)){install.packages("lme4")}
if(!require(lmerTest)){install.packages("lmerTest")}
if(!require(multcompView)){install.packages("multcompView")}
if(!require(lsmeans)){install.packages("lsmeans")}
if(!require(nlme)){install.packages("nlme")}
if(!require(car)){install.packages("car")}
if(!require(rcompanion)){install.packages("rcompanion")}

### One-way ANOVA with random blocks example

Input = ("
Instructor        Town             Sodium
'Brendon Small'   Squiggleville    1200
'Brendon Small'   Squiggleville    1400
'Brendon Small'   Squiggleville    1350
'Brendon Small'   Metalocalypse     950
'Brendon Small'   Squiggleville    1400
'Brendon Small'   Squiggleville    1150
'Brendon Small'   Squiggleville    1300
'Brendon Small'   Metalocalypse    1325
'Brendon Small'   Metalocalypse    1425
'Brendon Small'   Squiggleville    1500
'Brendon Small'   Squiggleville    1250
'Brendon Small'   Metalocalypse    1150
'Brendon Small'   Metalocalypse     950
'Brendon Small'   Squiggleville    1150
'Brendon Small'   Metalocalypse    1600
'Brendon Small'   Metalocalypse    1300
'Brendon Small'   Metalocalypse    1050
'Brendon Small'   Metalocalypse    1300
'Brendon Small'   Squiggleville    1700
'Brendon Small'   Squiggleville    1300
'Coach McGuirk'   Squiggleville    1100
'Coach McGuirk'   Squiggleville    1200
'Coach McGuirk'   Squiggleville    1250
'Coach McGuirk'   Metalocalypse    1050
'Coach McGuirk'   Metalocalypse    1200
'Coach McGuirk'   Metalocalypse    1250
'Coach McGuirk'   Squiggleville    1350
'Coach McGuirk'   Squiggleville    1350
'Coach McGuirk'   Squiggleville    1325
'Coach McGuirk'   Squiggleville    1525
'Coach McGuirk'   Squiggleville    1225
'Coach McGuirk'   Squiggleville    1125
'Coach McGuirk'   Metalocalypse    1000
'Coach McGuirk'   Metalocalypse    1125
'Coach McGuirk'   Squiggleville    1400
'Coach McGuirk'   Metalocalypse    1200
'Coach McGuirk'   Squiggleville    1150
'Coach McGuirk'   Squiggleville    1400
'Coach McGuirk'   Squiggleville    1500
'Coach McGuirk'   Squiggleville    1200
'Melissa Robins'  Metalocalypse     900
'Melissa Robins'  Metalocalypse    1100
'Melissa Robins'  Metalocalypse    1150
'Melissa Robins'  Metalocalypse     950
'Melissa Robins'  Metalocalypse    1100
'Melissa Robins'  Metalocalypse    1150
'Melissa Robins'  Squiggleville    1250
'Melissa Robins'  Squiggleville    1250
'Melissa Robins'  Squiggleville    1225
'Melissa Robins'  Squiggleville    1325
'Melissa Robins'  Metalocalypse    1125
'Melissa Robins'  Metalocalypse    1025
'Melissa Robins'  Metalocalypse     950
'Melissa Robins'  Metalocalypse     925
'Melissa Robins'  Squiggleville    1200
'Melissa Robins'  Metalocalypse    1100
'Melissa Robins'  Metalocalypse     950
'Melissa Robins'  Metalocalypse    1300
'Melissa Robins'  Squiggleville    1400
'Melissa Robins'  Metalocalypse    1100
")

###  Order factors by the order in data frame
###  Otherwise, R will alphabetize them

Data\$Instructor = factor(Data\$Instructor,
levels=unique(Data\$Instructor))

Data\$Town       = factor(Data\$Town,
levels=unique(Data\$Town))

###  Check the data frame

library(psych)

str(Data)

summary(Data)

### Remove unnecessary objects

rm(Input)

#### Mixed-effects model with lmer

In this first example, the model will be specified with the lmer function in the package lme4.  The term (1|Town) in the model formula indicates that Town should be treated as a random variable, with each level having its own intercept in the model.  The anova function in the package lmerTest is used to produce p-values for the fixed effects.  The rand function in the package lmerTest produces p-values for the random effects.

##### Technical note

lmerTest::anova by default returns p-values for Type III sum of squares with a Satterthwaite approximation for the degrees of freedom.

library(lme4)

library(lmerTest)

model = lmer(Sodium ~ Instructor + (1|Town),
data=Data,
REML=TRUE)

anova(model)

Analysis of Variance Table of type III  with  Satterthwaite
approximation for degrees of freedom

Sum Sq Mean Sq NumDF DenDF F.value  Pr(>F)
Instructor 154766   77383     2 56.29  3.7413 0.02982 *

rand(model)

Analysis of Random effects Table:

Chi.sq Chi.DF p.value
Town   10.5      1   0.001 **

##### p-value and pseudo R-squared for model

To use the nagelkerke function with an lmer object, the null model with which to compare the lmer model has to be specified explicitly.

One approach is to define the null model as one with no fixed effects except for an intercept, indicated with a 1 on the right side of the ~.  And to also include the random effects, in this case (1|Town).

model.null = lmer(Sodium ~ 1 + (1|Town),
data = Data
,
REML = TRUE)

anova(model,
model.null)

Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
..1     3 782.34 788.63 -388.17   776.34
object  5 778.82 789.29 -384.41   768.82 7.5255      2    0.02322 *

library(rcompanion)

nagelkerke(fit  = model,
null = model.null
)

\$Pseudo.R.squared.for.model.vs.null
Pseudo.R.squared
Cox and Snell (ML)                 0.11818400
Nagelkerke (Cragg and Uhler)       0.11818400

Another approach to determining the p-value and pseudo R-squared for an lmer model is to compare the model to a null model with only an intercept and neither the fixed nor the random effects.  To accomplish this, the null model has to be specified with lm and not lmer.  However, it may not be permissible to compare models specified with different functions in this way.  The anova function in lmerTest does allow these comparisons.  The nagelkerke function will also allow these comparisons, although I do not know if the results are valid.

model.null.2 = lm(Sodium ~ 1,
data = Data)

anova(model,
model.null.2)

Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
..1     2 792.07 796.26 -394.04   788.07
object  5 778.82 789.29 -384.41   768.82 19.253      3  0.0002424 ***

library(rcompanion)

nagelkerke(fit  = model,
null = model.null.2
)

\$Pseudo.R.squared.for.model.vs.null
Pseudo.R.squared
Cox and Snell (ML)                  0.2701590
Nagelkerke (Cragg and Uhler)        0.2701600

##### Post-hoc analysis

The lsmeans package is able to handle lmer objects.  For further details, see ?lsmeans::models.  For a review of mean separation tests and least square means, see the chapters What are Least Square Means?; the chapter Least Square Means for Multiple Comparisons; and the “Post-hoc analysis:  mean separation tests” section in the One-way ANOVA chapter.

The lmerTest package has a function Difflsmeans which also compares marginal means for treatments, but doesn’t include output for a compact letter display or adjusted p-values.

library(multcompView)

library(lsmeans)

marginal = lsmeans(model,
~ Instructor)

CLD = cld(marginal,
alpha=0.05,
Letters=letters,        ### Use lower-case letters for .group

CLD

Instructor       lsmean       SE   df lower.CL upper.CL .group
Melissa Robins 1153.201 83.01880 1.24 476.7519 1829.650  a
Coach McGuirk  1216.799 83.01880 1.24 540.3498 1893.248  ab
Brendon Small  1280.137 82.60038 1.22 588.9293 1971.345   b

Confidence level used: 0.95
P value adjustment: tukey method for comparing a family of 3 estimates
significance level used: alpha = 0.05

pairs(marginal,

\$contrasts
contrast                        estimate       SE    df t.ratio p.value

Brendon Small - Coach McGuirk   63.33828 45.93368 56.11   1.379  0.3587
Brendon Small - Melissa Robins 126.93620 46.73138 56.29   2.716  0.0234
Coach McGuirk - Melissa Robins  63.59792 48.62097 56.62   1.308  0.3967

P value adjustment: tukey method for comparing a family of 3 estimate

A similar analysis can be conducted with the difflsmeans function in the lmerTest package.

library(lmerTest)

difflsmeans(model,
test.effs="Instructor")

Differences of LSMEANS:
Estimate Standard Error    DF t-value Lower CI Upper CI p-value
Instructor Brendon Small - Coach McGuirk      63.3           45.8  56.1    1.38    -28.5      155   0.172
Instructor Brendon Small - Melissa Robins    126.9           46.5  56.3    2.73     33.9      220   0.008 **
Instructor Coach McGuirk - Melissa Robins     63.6           48.0  56.6    1.33    -32.5      160   0.190

The following code extracts a data frame from the difflsmeans output, and then adds a column of p-values adjusted by the fdr method.  See ?p.adjust for other adjustment options.

comparison = difflsmeans(model,
test.effs="Instructor")[[1]]

p.value = comparison\$"p-value"

method = "fdr")

comparison

Estimate Standard Error   DF t-value Lower CI Upper CI p-value  p.adj
Instructor Brendon Small - Coach McGuirk   63.3383        45.8366 56.1    1.38 -28.4807 155.1573  0.1725 0.1902
Instructor Brendon Small - Melissa Robins 126.9362        46.4659 56.3    2.73  33.8631 220.0093  0.0084 0.0252
Instructor Coach McGuirk - Melissa Robins  63.5979        47.9651 56.6    1.33 -32.4661 159.6619  0.1902 0.1902

#### Mixed-effects model with nlme

Mixed models can also be specified with the nlme package.  The formula notation for this package is slightly different than for the lme4 package.  The expression random=~1|Town indicates that Town should be treated as a random variable, with each level having its own intercept in the model.

The fixed effects in the model can be tested with the Anova function in the car package.

library(nlme)

model = lme(Sodium ~ Instructor, random=~1|Town,
data=Data,
method="REML")

library(car)

Anova(model)

Analysis of Deviance Table (Type II tests)

Chisq Df Pr(>Chisq)
Instructor 7.4827  2    0.02372 *

The random effects in the model can be tested by specifying a null model with only fixed effects and comparing it to the full model with anova.  The nlme package has a function gls that creates model objects without random effects in a manner analogous to those specified with lme.

model.null = gls(Sodium ~ Instructor,
data=Data,
method="REML")

anova(model,
model.null)

Model df      AIC      BIC    logLik   Test  L.Ratio p-value
model          1  5 749.9281 760.1434 -369.9641
model.null     2  4 758.4229 766.5951 -375.2114 1 vs 2 10.49476  0.0012

##### p-value and pseudo R-squared for model

By default, the nagelkerke function will compare an lme object to a null model with the fixed effects removed, but the random effects retained.

library(rcompanion)

nagelkerke(model)

\$Models

Model: "lme.formula, Sodium ~ Instructor, Data, ~1 | Town, REML"
Null:  "lme.formula, Sodium ~ 1, Data, ~1 | Town, REML"

\$Pseudo.R.squared.for.model.vs.null
Pseudo.R.squared
Cox and Snell (ML)                 0.11818400
Nagelkerke (Cragg and Uhler)       0.11818400

\$Likelihood.ratio.test
Df.diff LogLik.diff  Chisq p.value
-2     -3.7732 7.5463 0.02298

\$Messages
[1] "Note: For models fit with REML, these statistics are based on refitting with ML"

Another approach to determining the p-value and pseudo R-squared for an lme model is to compare the model to a null model with only an intercept and neither the fixed nor the random effects.  To accomplish this, the null model has to be specified with the gls function.

model.null.2 = gls(Sodium ~ 1,
data=Data,
method="REML")

library(rcompanion)

nagelkerke(fit  = model,
null = model.null.2
)

\$Pseudo.R.squared.for.model.vs.null
Pseudo.R.squared
Cox and Snell (ML)                  0.2701590
Nagelkerke (Cragg and Uhler)        0.2701600

\$Likelihood.ratio.test
Df.diff LogLik.diff  Chisq    p.value
-3     -9.4479 18.896 0.00028731

##### Post-hoc analysis

The lsmeans package is able to handle lme objects.  For futher details, see ?lsmeans::models.  For a review of mean separation tests and least square means, see the chapters What are Least Square Means?; the chapter Least Square Means for Multiple Comparisons; and the “Post-hoc analysis:  mean separation tests” section in the One-way ANOVA chapter.

library(multcompView)

library(lsmeans)

marginal = lsmeans(model,
~ Instructor)

CLD = cld(marginal,
alpha   = 0.05,
Letters = letters,     ### Use lower-case letters for .group

CLD

Instructor       lsmean       SE df  lower.CL upper.CL .group
Melissa Robins 1153.201 82.92335  1  99.55992 2206.842  a
Coach McGuirk  1216.799 82.92335  1 163.15784 2270.440  ab
Brendon Small  1280.137 82.59437  1 230.67625 2329.598   b

Confidence level used: 0.95
P value adjustment: tukey method for comparing a family of 3 estimates
significance level used: alpha = 0.05

pairs(marginal,

contrast                        estimate       SE df t.ratio p.value
Brendon Small - Coach McGuirk   63.33828 45.83662 56   1.382  0.3572
Brendon Small - Melissa Robins 126.93620 46.46588 56   2.732  0.0225
Coach McGuirk - Melissa Robins  63.59792 47.96514 56   1.326  0.3869

P value adjustment: tukey method for comparing a family of 3 estimate

### Comparison of results from one-way ANOVA with blocks

Just for interest, a comparison of results from this chapter and the previous chapter are presented here.  There is some variation in the values obtained for these statistics.

Fixed        lmer        lme
p-value for Instructor   0.034        0.030       0.024

p-value for Town         0.00019      0.001       0.0012

Mean separation          a             a          a
for Instructor           ab            ab         ab
b             b          b

Overall p-value          < 0.0001     0.00024     0.00029

R-square or              0.348        0.270       0.270
Pseudo R-square
(Nagelkerke)