This book will not investigate the concept of random effects
in models in any substantial depth. The goal of this chapter is to empower the
reader to include random effects in models in cases of paired data or repeated
measures.

##### Random effects in models for paired and repeated measures

As an example, if we are measuring the left hand and right hand
of several individuals, the measurements are paired within each individual.
That is, we want to statistically match the left hand of *Individual A* to
the right hand of *Individual A*, since we suppose that someone with a
large left hand will have a large right hand. Therefore, the variable *Individual*
will be included in the model as a random variable. Each *Individual* could
be thought of as a block including a measurement for the left hand and a
measurement for the right hand.

As a second example, imagine that raters score instructors at several
times, and we can match each rater’s score at each time to the same
rater’s score at the other times. So, we want to statistically match the score
of *Rater A* at *Time 1* to the score of *Rater A* at *Time 2*,
since we suppose that someone who scores an instructor high at one time might
score an instructor high at another time. Therefore, the variable *Rater*
will be included in the model as a random variable. Each *Rater* could be
thought of as a block including a measurement for each of the times.

##### The concept of random variables

Getting a firm grasp of the concept of random variables is
not an easy task, so it is okay to not fret too much about completely
understanding the concept at this point. It is also helpful to keep in mind
that whether a variable should be considered a fixed or random variable will
depend on the interpretation of the person doing the analysis.

In the previous chapter, the variable *Speaker* was
used as a *fixed effect* in the model. Conceptually, the idea is that we
are interested in the effect of each of the specific levels of that variable.
That is, we care specifically about Pooh and Piglet and their scores.

In many cases of blocking variables, though, we don’t
necessarily care about the values for those specific blocks.

For example, if we conducted a study in two different
schools focusing on two different curricula, we may want to know about the effect
of the curricula, but more or less chose the schools at random. We don’t
really care if Springfield Elementary had a higher score than Shelbyville
Elementary. The two schools represent any two random schools, not those two
specific schools. But we definitely want to include the *School* effect
in the model, because we suppose that one school could do better with both
curricula than would the other school.

Another example of this is when we have instructors rated by
student raters. If we are studying the performance of the instructors, we
don’t necessarily care about how Nelson or Milhouse or Ralph as individuals rate
instructors. They are representing students chosen at random. We just need to
include the effect of the variable *Rater* in the model to statistically account
for the fact that each rater might tend to rate all instructors lower or higher
than other raters.

In these examples, *School* and *Rater* could be
included in their respective models as random effects.

##### Mixed effects models

When a model includes both fixed effects and random effects,
it is called a *mixed effects* model.
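For the hand example, one common way to write a simple mixed effects model of this kind is the random-intercept form sketched below. The symbols are introduced here only for illustration and are not notation used elsewhere in this book: $y_{ij}$ is the length measured for hand $j$ of individual $i$, $x_{ij}$ is 0 for a left hand and 1 for a right hand, $\beta_0$ and $\beta_1$ are fixed effects, and $u_i$ is the random effect of the individual.

```latex
\[
  y_{ij} = \beta_0 + \beta_1 x_{ij} + u_i + \varepsilon_{ij},
  \qquad u_i \sim N(0, \sigma_u^2),
  \qquad \varepsilon_{ij} \sim N(0, \sigma^2)
\]
```

Here *Hand* enters as a fixed effect through $\beta_1$, while *Individual* enters as a random effect through $u_i$, which shifts both measurements for the same individual up or down together.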

##### Optional technical note: Random effects in more complex models

For more complex models, specifying random effects can become difficult. Random effects can be crossed with one another or can be nested within one another. Also, correlation structures for the random effects can be specified. However, approaching this complexity is beyond the scope of this book.

The examples in this book treat random variables as simple
intercept-only variables, as simple blocks that are not nested within other
variables.

### Packages used in this chapter

The packages used in this chapter include:

• FSA

• psych

• lme4

• lmerTest

• nlme

• car

The following commands will install these packages if they are not already installed:

if(!require(FSA)){install.packages("FSA")}

if(!require(psych)){install.packages("psych")}

if(!require(lme4)){install.packages("lme4")}

if(!require(lmerTest)){install.packages("lmerTest")}

if(!require(nlme)){install.packages("nlme")}

if(!require(car)){install.packages("car")}
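As an alternative to loading each package in order to test for it, the check can be sketched with a small helper. This is a minimal sketch only: *packages.needed* is a hypothetical function written for this illustration, not part of any package, and the installation line is left commented out.

```r
# Hypothetical helper, not part of any package: return the names of the
# requested packages that cannot be found in the current library paths.
# requireNamespace() loads a namespace if it is available but does not
# attach the package to the search path.
packages.needed = function(pkgs) {
  found = vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!found]
}

chapter.pkgs = c("FSA", "psych", "lme4", "lmerTest", "nlme", "car")
to.install   = packages.needed(chapter.pkgs)
print(to.install)

# Uncomment to install whatever is missing:
# if (length(to.install) > 0) install.packages(to.install)
```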

### An example of a mixed model

For this example we will revisit the hand size example from
the *Independent and Paired Values* chapter. The variable hand *Length*
will be treated as an interval/ratio variable. We will return to using cumulative
link models for Likert data in subsequent chapters.

Note that the model includes the blocking variable, *Individual*,
and so data do not need to be in a certain order to match the paired observations.
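The role of the blocking variable in pairing can be illustrated with a minimal base R sketch. It uses the first four individuals from the hand data, deliberately out of order, and matches left and right measurements by the *Individual* label rather than by row position. The object names *Wide* and *Difference* are chosen here for illustration only.

```r
# First four individuals from the hand data, with rows deliberately shuffled
Data = data.frame(
  Individual = c("C", "A", "D", "B", "A", "D", "B", "C"),
  Hand       = c("Right", "Left", "Left", "Right",
                 "Right", "Right", "Left", "Left"),
  Length     = c(15.9, 17.5, 14.5, 18.5, 17.6, 14.9, 18.4, 16.2)
)

# Pair the observations by the Individual label, not by row order
Wide = reshape(Data, idvar = "Individual", timevar = "Hand",
               direction = "wide")

# Right minus left for each individual
Wide$Difference = Wide$Length.Right - Wide$Length.Left
Wide = Wide[order(Wide$Individual), ]
print(Wide)
```

Because each row carries its own *Individual* label, the pairs are recovered correctly however the rows are sorted.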

Also note that mixed models may make certain assumptions
about the distributions of the data. For simplicity, this example will not
check if those assumptions are met.

By default, an analysis of variance for a mixed model
doesn’t test the significance of the random effects in the model. However, the
effect of random terms can be tested by comparing the model to a model
including only the fixed effects and excluding the random effects, or with the *rand*
function from the *lmerTest* package if the *lme4* package is used to
specify the model.

As a technical note, the *lmerTest* package has options
to use Satterthwaite or Kenward–Roger degrees of freedom, and options for
type-III or type-II tests in the analysis of variance, if the *lme4* package
is used to specify the model.

Input = ("

Individual Hand Length

A Left 17.5

B Left 18.4

C Left 16.2

D Left 14.5

E Left 13.5

F Left 18.9

G Left 19.5

H Left 21.1

I Left 17.8

J Left 16.8

K Left 18.4

L Left 17.3

M Left 18.9

N Left 16.4

O Left 17.5

P Left 15.0

A Right 17.6

B Right 18.5

C Right 15.9

D Right 14.9

E Right 13.7

F Right 18.9

G Right 19.5

H Right 21.5

I Right 18.5

J Right 17.1

K Right 18.9

L Right 17.5

M Right 19.5

N Right 16.5

O Right 17.4

P Right 15.6

")

Data = read.table(textConnection(Input),header=TRUE,stringsAsFactors=TRUE)

### Check the data frame

library(psych)

headTail(Data)

str(Data)

summary(Data)

### Remove unnecessary objects

rm(Input)

#### Mixed model with lmer

One way to construct a mixed effects model for
interval/ratio data is with the *lmer* function in the *lme4* package.
The *lmerTest* package is used to produce an analysis of variance with *p*-values
for model effects.

Notice the grammar in the *lmer* function that defines
the model: the term *(1|Individual)* is added to the model to indicate
that *Individual* is the random term.

As a technical note, the *1* indicates that an
intercept is to be fitted for each level of the random variable.

As another technical note, *REML* stands for *restricted
maximum likelihood*. It is a method of fitting the model, and is often
preferred over conventional *ML* (*maximum likelihood*) fitting because
*ML* estimates of the variance components tend to be biased downward.
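The two technical notes above can be illustrated with a minimal sketch, assuming the *lme4* package is installed. It fits the hand-length model on a compact copy of the data used in this chapter, extracts the intercept fitted for each individual, and compares the residual standard deviation under *REML* and *ML*. The object names *fit.reml* and *fit.ml* are illustrative only.

```r
library(lme4)

# Compact, self-contained copy of the hand data from this chapter
Left  = c(17.5, 18.4, 16.2, 14.5, 13.5, 18.9, 19.5, 21.1,
          17.8, 16.8, 18.4, 17.3, 18.9, 16.4, 17.5, 15.0)
Right = c(17.6, 18.5, 15.9, 14.9, 13.7, 18.9, 19.5, 21.5,
          18.5, 17.1, 18.9, 17.5, 19.5, 16.5, 17.4, 15.6)
Data  = data.frame(Individual = factor(rep(LETTERS[1:16], times = 2)),
                   Hand       = factor(rep(c("Left", "Right"), each = 16)),
                   Length     = c(Left, Right))

# Same model fitted by restricted and by conventional maximum likelihood
fit.reml = lmer(Length ~ Hand + (1|Individual), data = Data, REML = TRUE)
fit.ml   = lmer(Length ~ Hand + (1|Individual), data = Data, REML = FALSE)

# Fixed effects: overall intercept and the effect of Hand
print(fixef(fit.reml))

# Random effects: each individual's deviation from the overall intercept
print(head(ranef(fit.reml)$Individual))

# Per-individual coefficients: fixed intercept plus that deviation
print(head(coef(fit.reml)$Individual))

# Residual standard deviation under each fitting method
print(c(REML = sigma(fit.reml), ML = sigma(fit.ml)))
```

The *coef* output shows one intercept per individual but a single shared *Hand* effect, which is what an intercept-only random term implies.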

##### Define model and conduct analysis of variance

library(lme4)

library(lmerTest)

model = lmer(Length ~ Hand + (1|Individual),

data=Data,

REML=TRUE)

anova(model)

Analysis of Variance Table of type III with Satterthwaite

approximation for degrees of freedom

| | Sum Sq | Mean Sq | NumDF | DenDF | F.value | Pr(>F) |
|------|---------|---------|-------|-------|---------|--------|
| Hand | 0.45125 | 0.45125 | 1 | 15 | 11.497 | 0.004034 ** |
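As a cross-check, with only two levels of *Hand* and one measurement per hand for each individual, the *F* test for *Hand* in this model should agree with a paired *t*-test, with *F* equal to the square of *t*. A minimal base R sketch, using the same hand lengths entered as two vectors (the names *Left*, *Right*, and *F.from.t* are illustrative):

```r
# Hand lengths for individuals A through P, in the same order in both vectors
Left  = c(17.5, 18.4, 16.2, 14.5, 13.5, 18.9, 19.5, 21.1,
          17.8, 16.8, 18.4, 17.3, 18.9, 16.4, 17.5, 15.0)
Right = c(17.6, 18.5, 15.9, 14.9, 13.7, 18.9, 19.5, 21.5,
          18.5, 17.1, 18.9, 17.5, 19.5, 16.5, 17.4, 15.6)

# Paired t-test on the within-individual differences
paired = t.test(Right, Left, paired = TRUE)

# Squaring t gives the F statistic for a one-degree-of-freedom effect
F.from.t = unname(paired$statistic)^2

print(round(F.from.t, 3))        # 11.497
print(unname(paired$parameter))  # 15 denominator degrees of freedom
print(paired$p.value)
```

Unlike the mixed model, the paired *t*-test relies on the two vectors being in the same individual order.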

##### Test the random effects in the model

The *rand* function from the *lmerTest* package
will test the random effects in the model.

rand(model)

Analysis of Random effects Table

| | Chi.sq | Chi.DF | p.value |
|------------|--------|--------|---------|
| Individual | 58.4 | 1 | 2e-14 *** |
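In recent releases of *lmerTest*, the *rand* function is documented as an alias of *ranova*, and the printed table is laid out differently from the output shown above. The following is a minimal sketch, assuming *lme4* and a recent *lmerTest* are installed, that runs the same test on a compact copy of the hand data. The object name *random.test* is illustrative only.

```r
library(lme4)
library(lmerTest)

# Compact, self-contained copy of the hand data from this chapter
Left  = c(17.5, 18.4, 16.2, 14.5, 13.5, 18.9, 19.5, 21.1,
          17.8, 16.8, 18.4, 17.3, 18.9, 16.4, 17.5, 15.0)
Right = c(17.6, 18.5, 15.9, 14.9, 13.7, 18.9, 19.5, 21.5,
          18.5, 17.1, 18.9, 17.5, 19.5, 16.5, 17.4, 15.6)
Data  = data.frame(Individual = factor(rep(LETTERS[1:16], times = 2)),
                   Hand       = factor(rep(c("Left", "Right"), each = 16)),
                   Length     = c(Left, Right))

model = lmer(Length ~ Hand + (1|Individual), data = Data, REML = TRUE)

# Likelihood ratio test for dropping the random intercept
random.test = ranova(model)
print(random.test)
```

The likelihood ratio statistic reported for the *(1 | Individual)* term should be close to the *Chi.sq* value of 58.4 shown above.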

#### Mixed model with nlme

Another way to construct a mixed effects model for
interval/ratio data is with the *lme* function in the *nlme* package.

Notice the grammar in the *lme* function that defines
the model: the option *random=~1|Individual* is added to the model to
indicate that *Individual* is the random term.

As with *lmer*, the *1* indicates that an intercept is to be fitted
for each level of the random variable, and the model is fitted by *REML*
(*restricted maximum likelihood*) rather than by conventional *ML*
(*maximum likelihood*).

##### Define model and conduct analysis of deviance

library(nlme)

model = lme(Length ~ Hand, random=~1|Individual,

data=Data,

method="REML")

library(car)

Anova(model)

Analysis of Deviance Table (Type II tests)

| | Chisq | Df | Pr(>Chisq) |
|------|--------|----|------------|
| Hand | 11.497 | 1 | 0.0006972 *** |

##### Test the random effects in the model

The random effects in the model can be tested by comparing
the model to a model fitted with just the fixed effects and excluding the
random effects. Because there are no random effects in this second model, the
*gls* function in the *nlme* package is used to fit this model.

We will use a similar method for cumulative link models.

model.fixed = gls(Length ~ Hand,

data=Data,

method="REML")

anova(model,

model.fixed)

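| | Model | df | AIC | BIC | logLik | Test | L.Ratio | p-value |
|-------------|-------|----|-----------|-----------|-----------|--------|---------|---------|
| model | 1 | 4 | 80.62586 | 86.23065 | -36.31293 | | | |
| model.fixed | 2 | 3 | 137.06356 | 141.26715 | -65.53178 | 1 vs 2 | 58.4377 | <.0001 |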
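The likelihood ratio in this table can be reproduced by hand from the two log-likelihoods. The following is a minimal sketch using the values printed above. The object names are illustrative, and the *p*-value uses a chi-square reference distribution with degrees of freedom equal to the difference in the number of parameters.

```r
# Log-likelihoods and parameter counts taken from the table above
logLik.mixed = -36.31293   # model with the random intercept, df = 4
logLik.fixed = -65.53178   # gls model with fixed effects only, df = 3

# Likelihood ratio statistic: twice the difference in log-likelihood
L.Ratio = 2 * (logLik.mixed - logLik.fixed)

# One extra parameter in the mixed model: the variance of the random intercept
df.diff = 4 - 3

p.value = pchisq(L.Ratio, df = df.diff, lower.tail = FALSE)

print(L.Ratio)   # 58.4377
print(p.value)
```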