R Companion: Analysis of Covariance

When to use it

The cricket example is shown in the “How to do the test” section.

Null hypotheses

Assumptions

How the test works

Examples

Graphing the results

Similar tests

See the Handbook for information on these topics.

How to do the test

Analysis of covariance example with two categories and type II sum of squares

This example uses type II sum of squares, but otherwise follows the example in the Handbook. The parameter estimates are calculated differently in R, so the calculation of the intercepts of the lines is slightly different.

### --------------------------------------------------------------
### Analysis of covariance, cricket example
### pp. 228–229
### --------------------------------------------------------------

Input = ("
Species   Temp   Pulse
ex       20.8   67.9
ex       20.8   65.1
ex       24     77.3
ex       24     78.7
ex       24     79.4
ex       24     80.4
ex       26.2   85.8
ex       26.2   86.6
ex       26.2   87.5
ex       26.2   89.1
ex       28.4   98.6
ex       29    100.8
ex       30.4   99.3
ex       30.4 101.7
niv      17.2   44.3
niv      18.3   47.2
niv      18.3   47.6
niv      18.3   49.6
niv      18.9   50.3
niv      18.9   51.8
niv      20.4   60
niv      21     58.5
niv      21     58.9
niv      22.1   60.7
niv      23.5   69.8
niv      24.2   70.9
niv      25.9   76.2
niv      26.5   76.1
niv      26.5   77
niv      26.5   77.7
niv      28.6   84.7
")

Data = read.table(textConnection(Input),header=TRUE)

Simple plot

plot(x   = Data$Temp,
     y   = Data$Pulse,
     col = Data$Species,
     pch = 16,
     xlab = "Temperature",
     ylab = "Pulse")

legend('bottomright',
       legend = levels(Data$Species),
      col = 1:2,
       cex = 1,
       pch = 16)

Analysis of covariance

options(contrasts = c("contr.treatment", "contr.poly"))

   ### These are the default contrasts in R

model.1 = lm (Pulse ~ Temp + Species + Temp:Species,
              data = Data)

library(car)

Anova(model.1, type="II")

Anova Table (Type II tests)

Sum Sq Df F value Pr(>F)

Temp 4376.1 1 1388.839 < 2.2e-16 ***

Species 598.0 1 189.789 9.907e-14 ***

Temp:Species 4.3 1 1.357 0.2542

### Interaction is not significant, so the slope across groups

### is not different.

model.2 = lm (Pulse ~ Temp + Species,
data = Data)

library(car)

Anova(model.2, type="II")

Anova Table (Type II tests)

Sum Sq Df F value Pr(>F)

Temp 4376.1 1 1371.4 < 2.2e-16 ***

Species 598.0 1 187.4 6.272e-14 ***

### The category variable (Species) is significant,

### so the intercepts among groups are different

summary(model.2)

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) -7.21091 2.55094 -2.827 0.00858 **

Temp 3.60275 0.09729 37.032 < 2e-16 ***

Speciesniv -10.06529 0.73526 -13.689 6.27e-14 ***

### Note that these estimates are different than in the Handbook,

### but the calculated results will be identical.

### The slope estimate is the same.

### The intercept for species 1 (ex) is (intercept).

### The intercept for species 2 (niv) is (intercept) + Speciesniv.

### This is determined from the contrast coding of the Species

### variable shown below, and the fact that Speciesniv is shown in

### coefficient table above.

contrasts(Data$Species)

niv

ex 0

niv 1

Simple plot with fitted lines

I.nought = -7.21091
I1 = I.nought + 0
I2 = I.nought + -10.06529
B = 3.60275

plot(x   = Data$Temp,
     y   = Data$Pulse,
     col = Data$Species,
     pch = 16,
     xlab = "Temperature",
     ylab = "Pulse")

legend('bottomright',
       legend = levels(Data$Species),
       col = 1:2,
       cex = 1,
       pch = 16)

abline(I1, B,
       lty=1, lwd=2, col = 1)

abline(I2, B,
       lty=1, lwd=2, col = 2)

RPlot

p-value and R-squared of combined model

summary(model.2)

Multiple R-squared: 0.9896, Adjusted R-squared: 0.9888

F-statistic: 1331 on 2 and 28 DF, p-value: < 2.2e-16

Checking assumptions of the model

hist(residuals(model.2),
col="darkgray")

RPlot

A histogram of residuals from a linear model. The distribution of these residuals should be approximately normal.

plot(fitted(model.2),
residuals(model.2))

RPlot

A plot of residuals vs. predicted values. The residuals should be unbiased and homoscedastic. For an illustration of these properties, see this diagram by Steve Jost at DePaul University: condor.depaul.edu/sjost/it223/documents/resid-plots.gif.

### additional model checking plots with: plot(model.2)
### alternative: library(FSA); residPlot(model.2)

# # #

Analysis of covariance example with three categories and type II sum of squares

This example uses type II sum of squares, and considers a case with three groups.

### --------------------------------------------------------------
### Analysis of covariance, hypothetical data
### --------------------------------------------------------------

Input = ("
Species   Temp   Pulse
ex       20.8   67.9
ex       20.8   65.1
ex       24     77.3
ex       24     78.7
ex       24    79.4
ex       24     80.4
ex       26.2   85.8
ex       26.2   86.6
ex       26.2   87.5
ex       26.2   89.1
ex       28.4   98.6
ex       29    100.8
ex       30.4   99.3
ex       30.4 101.7
niv      17.2   44.3
niv      18.3   47.2
niv      18.3   47.6
niv      18.3   49.6
niv      18.9   50.3
niv      18.9   51.8
niv      20.4   60
niv      21     58.5
niv      21     58.9
niv      22.1   60.7
niv      23.5   69.8
niv      24.2   70.9
niv      25.9   76.2
niv      26.5   76.1
niv      26.5   77
niv      26.5   77.7
niv      28.6   84.7
fake     17.2   74.3
fake     18.3   77.2
fake     18.3   77.6
fake     18.3   79.6
fake     18.9   80.3
fake     18.9   81.8
fake     20.4   90
fake     21    88.5
fake     21     88.9
fake     22.1   90.7
fake     23.5   99.8
fake     24.2   100.9
fake     25.9   106.2
fake     26.5   106.1
fake     26.5   107
fake     26.5   107.7
fake     28.6   114.7
")

Data = read.table(textConnection(Input),header=TRUE)