This book will not investigate the concept of random effects
in models in any substantial depth. The goal of this chapter is to empower the
reader to include random effects in models in cases of paired data or repeated
measures.

##### Random effects in models for paired and repeated measures

As an example, if we are measuring the left hand and right hand
of several individuals, the measurements are paired within each individual.
That is, we want to statistically match the left hand of *Individual A* to
the right hand of *Individual A*, since we suppose that someone with a
large left hand will have a large right hand. Therefore, the variable *Individual*
will be included in the model as a random variable. Each *Individual* could
be thought of as a block including a measurement for the left hand and a measurement
for the right hand.

As a second example, imagine we are measuring the scores of
instructors by raters multiple times, and we can match each rater’s score at
each time to the same rater’s score at the other times. So, we want to
statistically match the score of *Rater A* at *Time 1* to the score
of *Rater A* at *Time 2*, since we suppose that someone who scores an
instructor high at one time might score an instructor high at another time.
Therefore, the variable *Rater* will be included in the model as a random
variable. Each *Rater* could be thought of as a block including a
measurement for each of the times.

##### The concept of random variables

Getting a firm grasp of the concept of random variables is
not an easy task, so don't fret too much if the concept isn't completely
clear at this point. It is also helpful to keep in mind
that whether a variable should be considered a fixed or random variable will
depend on the interpretation of the person doing the analysis.

In the previous chapter, the variable *Speaker* was
used as a *fixed effect* in the model. Conceptually, the idea is that we
are interested in the effect of each of the specific levels of that variable.
That is, we care specifically about Pooh and Piglet and their scores.

In many cases of blocking variables, though, we don’t
necessarily care about the values for those specific blocks.

For example, if we conducted a study in two different
schools focusing on two different curricula, we may want to know about the
effect of the specific curricula, but chose the schools more or less at random.
We don’t really care if Springfield Elementary had a higher score than Shelbyville Elementary. The two schools represent any two
random schools, not those two specific schools. But we definitely want to
include the *School* effect in the model, because we suppose that one
school could do better with both curricula than would the other school.

Another example of this is when we have instructors rated by
student raters. If we are studying the performance of the instructors, we
don’t necessarily care about how Nelson or Milhouse or Ralph as individuals
rate instructors. They are representing students chosen at random. We just
need to include the effect of the variable *Rater* in the model to
statistically account for the fact that each rater might tend to rate all
instructors lower or higher than other raters.

In these examples, *School* and *Rater* could be
included in their respective models as random effects.

##### Mixed effects models

When a model includes both fixed effects and random effects,
it is called a *mixed effects* model. The term *hierarchical model*
is also sometimes used.

##### Optional technical note: Random effects in more complex models

For more complex models, specifying random effects can become difficult. Random effects can be crossed with one another or can be nested within one another. Also, correlation structures for the random effects can be specified. However, approaching this complexity is beyond the scope of this book.

The examples in this book treat random variables as simple
intercept-only variables, and as simple blocks that are not nested within other
variables.
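The intercept-only, non-nested structure used in this book can be contrasted with more complex structures using lme4-style formula notation. The sketch below uses hypothetical variable names (*Score*, *Curriculum*, *School*, *Classroom*, *Rater*) purely for illustration; these are ordinary R formula objects, and no model is fitted.

```r
# A sketch of lme4-style random-effects notation, with hypothetical
# variable names.  These are plain R formula objects; nothing is fitted.

# Simple random intercept (the only form used in this book):
f_simple  = Score ~ Curriculum + (1 | School)

# Crossed random effects: School and Rater vary independently of each other
f_crossed = Score ~ Curriculum + (1 | School) + (1 | Rater)

# Nested random effects: Classroom is nested within School
f_nested  = Score ~ Curriculum + (1 | School/Classroom)

all.vars(f_nested)   # the variable names appearing in the nested formula
```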

### Packages used in this chapter

The packages used in this chapter include:

• FSA

• psych

• lme4

• lmerTest

• nlme

• car

The following commands will install these packages if they are not already installed:

if(!require(FSA)){install.packages("FSA")}
if(!require(psych)){install.packages("psych")}
if(!require(lme4)){install.packages("lme4")}
if(!require(lmerTest)){install.packages("lmerTest")}
if(!require(nlme)){install.packages("nlme")}
if(!require(car)){install.packages("car")}

### An example of a mixed model

For this example, we will revisit the hand size example from
the *Independent and Paired Values* chapter. The hand length variable, *Length*,
will be treated as an interval/ratio variable.

Note that the model includes the blocking variable, *Individual*,
and so the data do not need to be in a certain order to match the paired
observations.

Also note that mixed models may make certain assumptions
about the distributions of the data. For simplicity, this example will not
check if those assumptions are met.

By default, an analysis of variance for a mixed model
doesn’t test the significance of the random effects in the model. However, the
effect of random terms can be tested by comparing the model to a model
including only the fixed effects and excluding the random effects, or with the *ranova*
function from the *lmerTest* package if the *lme4* package is used to
specify the model.

As a technical note, the *lmerTest* package has options
to use Satterthwaite or Kenward–Roger degrees of freedom, and options for
type-III or type-II tests in the analysis of variance, if the *lme4*
package is used to specify the model.

Data = read.table(header=TRUE, stringsAsFactors=TRUE, text="
Individual  Hand   Length
A           Left   17.5
B           Left   18.4
C           Left   16.2
D           Left   14.5
E           Left   13.5
F           Left   18.9
G           Left   19.5
H           Left   21.1
I           Left   17.8
J           Left   16.8
K           Left   18.4
L           Left   17.3
M           Left   18.9
N           Left   16.4
O           Left   17.5
P           Left   15.0
A           Right  17.6
B           Right  18.5
C           Right  15.9
D           Right  14.9
E           Right  13.7
F           Right  18.9
G           Right  19.5
H           Right  21.5
I           Right  18.5
J           Right  17.1
K           Right  18.9
L           Right  17.5
M           Right  19.5
N           Right  16.5
O           Right  17.4
P           Right  15.6
")

### Check the data frame

library(psych)

headTail(Data)

str(Data)

summary(Data)

#### Mixed model with lmer

One way to construct a mixed effects model for interval/ratio
data is with the *lmer* function in the *lme4* package. The *lmerTest*
package is used to produce an analysis of variance with *p*-values for
model effects.

Notice the grammar in the *lmer* function that defines
the model: the term *(1|Individual)* is added to the model to indicate
that *Individual* is the random term.

As a technical note, the *1* indicates that an
intercept is to be fitted for each level of the random variable.

As another technical note, *REML* stands for *restricted
maximum likelihood*. It is a method of fitting the model and is often
considered better than fitting with a conventional *ML* (*maximum
likelihood*) method.
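As an informal illustration of the difference, the familiar *n* − 1 denominator of the sample variance is the simplest example of a REML-type correction: dividing by *n*, the ML estimate, is biased downward because the mean is estimated from the same data. The sketch below uses the first eight left-hand lengths from the chapter's data.

```r
# Informal illustration of REML vs. ML, using variance estimation.
# The ML estimate divides by n and is biased downward, because the mean
# is estimated from the same data; the REML-type estimate divides by
# n - 1, accounting for the degree of freedom used by the mean.

x = c(17.5, 18.4, 16.2, 14.5, 13.5, 18.9, 19.5, 21.1)  # left-hand lengths
n = length(x)

var_ml   = sum((x - mean(x))^2) / n        # ML estimate (biased low)
var_reml = sum((x - mean(x))^2) / (n - 1)  # REML-type estimate

c(var_ml, var_reml, var(x))   # R's var() uses the n - 1 denominator
```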

##### Define model and conduct analysis of variance

library(lme4)
library(lmerTest)

model = lmer(Length ~ Hand + (1|Individual),
             data = Data,
             REML = TRUE)

anova(model)

Analysis of Variance Table of type III with Satterthwaite
approximation for degrees of freedom

      Sum Sq Mean Sq NumDF DenDF F.value   Pr(>F)
Hand 0.45125 0.45125     1    15  11.497 0.004034 **
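As a sanity check, for this simple two-level within-subject design the mixed-model *F* test is equivalent to a paired t-test on the same data, with *F* = *t*². The vectors below repeat the chapter's data in base R.

```r
# Sanity check: for a two-level within-subject design with a random
# intercept, the mixed-model F test is equivalent to a paired t-test,
# with F = t^2 and the same p-value.

left  = c(17.5, 18.4, 16.2, 14.5, 13.5, 18.9, 19.5, 21.1,
          17.8, 16.8, 18.4, 17.3, 18.9, 16.4, 17.5, 15.0)
right = c(17.6, 18.5, 15.9, 14.9, 13.7, 18.9, 19.5, 21.5,
          18.5, 17.1, 18.9, 17.5, 19.5, 16.5, 17.4, 15.6)

result = t.test(right, left, paired = TRUE)

result$statistic^2   # approximately 11.497, matching the F value
result$p.value       # approximately 0.004034
```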

##### Test the random effects in the model

The *ranova* function from the *lmerTest* package
will test the random effects in the model.

ranova(model)

ANOVA-like table for random-effects: Single term deletions

                 npar  logLik     AIC    LRT Df Pr(>Chisq)
<none>              4 -36.313  80.626
(1 | Individual)    3 -65.532 137.064 58.438  1  2.098e-14 ***
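The likelihood ratio statistic in this table can be reproduced by hand from the two log-likelihoods: twice their difference, compared to a chi-square distribution with one degree of freedom.

```r
# Reproducing the likelihood ratio test from the log-likelihoods above
logLik_full    = -36.313   # model with (1 | Individual)
logLik_reduced = -65.532   # model without the random effect

LRT = 2 * (logLik_full - logLik_reduced)
LRT                                         # 58.438, as in the table

p = pchisq(LRT, df = 1, lower.tail = FALSE)
p                                           # about 2.1e-14
```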

#### Mixed model with nlme

Another way to construct a mixed effects model for
interval/ratio data is with the *lme* function in the *nlme* package.

Notice the grammar in the *lme* function that defines
the model: the option *random=~1|Individual* is added to the model to
indicate that *Individual* is the random term.

As a technical note, the *1* indicates that an
intercept is to be fitted for each level of the random variable.

As another technical note, *REML* stands for *restricted
maximum likelihood*. It is a method of fitting the model, and is often
considered better than fitting with a conventional *ML* (*maximum
likelihood*) method.

##### Define model and conduct analysis of deviance

library(nlme)

model = lme(Length ~ Hand, random = ~1|Individual,
            data = Data,
            method = "REML")

library(car)

Anova(model)

Analysis of Deviance Table (Type II tests)

      Chisq Df Pr(>Chisq)
Hand 11.497  1  0.0006972 ***
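As a quick check, the p-value reported here can be reproduced from the chi-square statistic in base R. (With one degree of freedom, the chi-square statistic also matches the *F* value from the *lmer* analysis earlier.)

```r
# Reproducing the p-value from the chi-square statistic above
pchisq(11.497, df = 1, lower.tail = FALSE)   # about 0.0006972
```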

##### Test the random effects in the model

The random effects in the model can be tested by comparing
the model to a model fitted with just the fixed effects and excluding the
random effects. Because there are no random effects in this second model, the
*gls* function in the *nlme* package is used to fit this model.

We will use a similar method for cumulative link models.

model.fixed = gls(Length ~ Hand,
                  data = Data,
                  method = "REML")

anova(model,
      model.fixed)

            Model df       AIC       BIC    logLik   Test L.Ratio p-value
model           1  4  80.62586  86.23065 -36.31293
model.fixed     2  3 137.06356 141.26715 -65.53178 1 vs 2 58.4377  <.0001
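The AIC values in this comparison follow directly from the log-likelihoods and parameter counts, via AIC = −2 · logLik + 2 · df.

```r
# AIC = -2 * logLik + 2 * df, using the values from the table above
aic_mixed = -2 * (-36.31293) + 2 * 4   # 80.62586, the mixed model
aic_fixed = -2 * (-65.53178) + 2 * 3   # 137.06356, the fixed-only model

c(aic_mixed, aic_fixed)
```

The lower AIC for the mixed model agrees with the likelihood ratio test: including the *Individual* random effect substantially improves the model.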