Contrasts can be used to make specific comparisons of treatments within a linear model.
One common use is when a factorial design is used, but control or check treatments are used in addition to the factorial design. In the first example below, there are two treatments (D and C) each at two levels (1 and 2), and then there is a Control treatment. The approach used here is to analyze the experiment as a one-way analysis of variance, and then use contrasts to test various hypotheses.
Another common use is when there are several treatments that could be thought of as members of a group. In the second example below, there are measurements for six wines, some of which are red (Merlot, Cabernet, Syrah) and some of which are white (Chardonnay, Riesling, Gewürztraminer). We can compare the treatments within the red wine group by setting up contrasts and conducting an F-test. This is analogous to testing the main effect of Red Wine.
The packages lsmeans and multcomp allow for unlimited tests of single-degree contrasts, with a p-value correction for multiple tests. They also allow for an F-test for multi-line contrasts, for example when testing within groups. The aov function in the native stats package has more limited functionality.
See the chapters on One-way Anova and Two-way Anova for general considerations on conducting analysis of variance
Packages used in this chapter
The following commands will install these packages if they are not already installed:
if(!require(car)){install.packages("car")}
if(!require(lsmeans){install.packages("lsmeans")}
if(!require(multcomp)){install.packages("multcomp")}
Example for single degree-of-freedom contrasts
This hypothetical example could represent an experiment with a factorial design two treatments (D and C) each at two levels (1 and 2), and a control treatment. The 2-by-2 factorial plus control is treated as a one-way anova with five treatments.
Input = ("
Treatment Response
'D1:C1' 1.0
'D1:C1' 1.2
'D1:C1' 1.3
'D1:C2' 2.1
'D1:C2' 2.2
'D1:C2' 2.3
'D2:C1' 1.4
'D2:C1' 1.6
'D2:C1' 1.7
'D2:C2' 2.5
'D2:C2' 2.6
'D2:C2' 2.8
'Control' 1.0
'Control' 0.9
'Control' 0.8
")
Data = read.table(textConnection(Input),header=TRUE)
### Specify the order of factor levels. Otherwise R will alphabetize them.
Data$Treatment = factor(Data$Treatment,
levels=unique(Data$Treatment))
Data
boxplot(Response ~ Treatment,
data = Data,
ylab="Response",
xlab="Treatment")
### Define linear model
model = lm(Response ~ Treatment,
data = Data)
library(car)
Anova(model, type="II")
summary(model)
Example with lsmeans
### You need to look at order of factor levels to
determine the contrasts
levels(Data$Treatment)
[1] "D1:C1" "D1:C2" "D2:C1"
"D2:C2" "Control"
library(lsmeans)
leastsquare = lsmeans(model, "Treatment")
Contrasts = list(D1vsD2 = c(1, 1, -1, -1, 0),
C1vsC2 = c(1, -1, 1, -1, 0),
InteractionDC = c(1, -1, -1, 1, 0),
C1vsC2forD1only = c(1, -1, 0, 0, 0),
C1vsC2forD2only = c(0, 0, 1, -1, 0),
TreatsvsControl = c(1, 1, 1, 1, -4),
T1vsC = c(1, 0, 0, 0, -1),
T2vsC = c(0, 1, 0, 0, -1),
T3vsC = c(0, 0, 1, 0, -1),
T4vsC = c(0, 0, 0, 1, -1))
### The column names match the order of levels
of the treatment variable
### The coefficients of each row sum to 0
contrast(leastsquare, Contrasts, adjust="sidak")
contrast estimate SE df t.ratio p.value
D1vsD2 -0.83333333 0.1549193 10 -5.379 0.0031
C1vsC2 -2.10000000 0.1549193 10 -13.555 <.0001
InteractionDC 0.03333333 0.1549193 10 0.215 1.0000
C1vsC2forD1only -1.03333333 0.1095445 10 -9.433 <.0001
C1vsC2forD2only -1.06666667 0.1095445 10 -9.737 <.0001
TreatsvsControl 3.96666667 0.3464102 10 11.451 <.0001
T1vsC 0.26666667 0.1095445 10 2.434 0.3011
T2vsC 1.30000000 0.1095445 10 11.867 <.0001
T3vsC 0.66666667 0.1095445 10 6.086 0.0012
T4vsC 1.73333333 0.1095445 10 15.823 <.0001
### Note that p-values are slightly different than
those from multcomp
### due to different adjustment methods. If “none” is chosen as
### the adjustment method for both procedures,
### p-values and other statistics will be the same.
### With adjust="none", results will be the same as
### the aov method.
Example with multcomp
### You need to look at order of factor levels to determine the contrasts
levels(Data$Treatment)
[1] "D1:C1" "D1:C2" "D2:C1"
"D2:C2" "Control"
Input = ("
Contrast.Name D1C2 D1C2 D2C1 D2C2 Control
D1vsD2 1 1 -1 -1 0
C1vsC2 1 -1 1 -1 0
InteractionDC 1 -1 -1 1 0
C1vsC2forD1only 1 -1 0 0 0
C1vsC2forD2only 0 0 1 -1 0
TreatsvsControl 1 1 1 1 -4
T1vsC 1 0 0 0 -1
T2vsC 0 1 0 0 -1
T3vsC 0 0 1 0 -1
T4vsC 0 0 0 1 -1
")
### The column names match the order of levels
of the treatment variable
### The coefficients of each row sum to 0
Matriz = as.matrix(read.table(textConnection(Input),
header=TRUE,
row.names=1))
Matriz
library(multcomp)
G = glht(model,
linfct = mcp(Treatment = Matriz))
G$linfct
summary(G,
test=adjusted("single-step"))
### Adjustment options: "none",
"single-step", "Shaffer",
### "Westfall", "free",
"holm", "hochberg",
### "hommel", "bonferroni",
"BH", "BY", "fdr"
Estimate Std. Error t value Pr(>|t|)
D1vsD2 == 0 -0.83333 0.15492 -5.379 0.00218 **
C1vsC2 == 0 -2.10000 0.15492 -13.555 < 0.001 ***
InteractionDC == 0 0.03333 0.15492 0.215 0.99938
C1vsC2forD1only == 0 -1.03333 0.10954 -9.433 < 0.001 ***
C1vsC2forD2only == 0 -1.06667 0.10954 -9.737 < 0.001 ***
TreatsvsControl == 0 3.96667 0.34641 11.451 < 0.001 ***
T1vsC == 0 0.26667 0.10954 2.434 0.17428
T2vsC == 0 1.30000 0.10954 11.867 < 0.001 ***
T3vsC == 0 0.66667 0.10954 6.086 < 0.001 ***
T4vsC == 0 1.73333 0.10954 15.823 < 0.001 ***
### With test=adjusted("none"), results will be the same as aov
method below.
# # #
Example for global F-test within a group of treatments
This example has treatments consisting of three red wines and three white wines. We will want to know if there is an effect of the treatments in the red wine group on the response variable, while keeping the individual identities of the wines in the Treatment variable. This approach is advantageous because post-hoc comparisons could still be made within the red wines, for example comparing Merlot to Cabernet.
Input = ("
Treatment Response
Merlot 5
Merlot 6
Merlot 7
Cabernet 8
Cabernet 9
Cabernet 10
Syrah 11
Syrah 12
Syrah 13
Chardonnay 1
Chardonnay 2
Chardonnay 3
Riesling 1
Riesling 2
Riesling 2
Gewürtztraminer 1
Gewürtztraminer 2
Gewürtztraminer 4
")
Data = read.table(textConnection(Input),header=TRUE)
### Specify the order of factor levels. Otherwise R
will alphabetize them.
Data$Treatment = factor(Data$Treatment,
levels=unique(Data$Treatment))
Data
boxplot(Response ~ Treatment,
data = Data,
ylab="Response",
xlab="Treatment")
### You need to look at order of
factor levels to determine the contrasts
levels(Data$Treatment)
[1]
"Merlot" "Cabernet" "Syrah"
"Chardonnay" "Riesling" "Gewürtztraminer"
### Define linear model
model = lm(Response ~ Treatment,
data = Data)
library(car)
Anova(model, type="II")
summary(model)
Tests of contrasts with lsmeans
Question: Is there an effect within red wine ?
library(lsmeans)
leastsquare = lsmeans(model, "Treatment")
Contrasts = list(Red_line1 = c(1, -1, 0, 0, 0, 0),
Red_line2 = c(0, 1, -1, 0, 0, 0))
### The column names match the order of levels
of the treatment variable
### The coefficients of each row sum to 0
Test = contrast(leastsquare, Contrasts)
test(Test, joint=TRUE)
df1 df2 F p.value
2 12 24.3 0.0001
### Note that two lines of contrasts resulted in one
hypothesis test
### using 2 degrees of freedom. This investigated the effect within
### a group of 3 treatments.
### Results are essentially the same as those from multcomp
Question: Is there an effect within white wine ?
library(lsmeans)
leastsquare = lsmeans(model, "Treatment")
Contrasts = list(White_line1 = c(0, 0, 0, 1, -1, 0),
White_line2 = c(0, 0, 0, 0, 1, -1))
### The column names match the order of levels
of the treatment variable
### The coefficients of each row sum to 0
Test = contrast(leastsquare, Contrasts)
test(Test, joint=TRUE)
df1 df2 F p.value
2 12 0.3 0.7462
### Note that two lines of contrasts resulted in one
hypothesis test
### using 2 degrees of freedom. This investigated the effect within
### a group of 3 treatments
### Results are the same as those from multcomp
Question: Is there a difference between red and white wines? And, mean separation for red wine
library(lsmeans)
leastsquare = lsmeans(model, "Treatment")
Contrasts = list(Red_vs_white = c( 1, 1, 1, -1, -1, -1),
Merlot_vs_Cab = c( 1, -1, 0, 0, 0, 0),
Cab_vs_Syrah = c( 0, 1, -1, 0, 0, 0),
Syrah_vs_Merlot = c(-1, 0, 1, 0, 0, 0))
### The column names match the order of levels
of the treatment variable
### The coefficients of each row sum to 0
contrast(leastsquare, Contrasts, adjust="sidak")
contrast estimate SE df t.ratio p.value
Red_vs_white 21 1.490712 12 14.087 <.0001
Merlot_vs_Cab -3 0.860663 12 -3.486 0.0179
Cab_vs_Syrah -3 0.860663 12 -3.486 0.0179
Syrah_vs_Merlot 6 0.860663 12 6.971 0.0001
### Note that p-values are slightly different than
those from multcomp
### due to different adjustment methods. If “none” is chosen as
### the adjustment method for both procedures,
### p-values and other statistics will be the same.
Tests of contrasts with multcomp
Question: Is there an effect within red wine ?
Input = "
Contrast Merlot Cabernet Syrah Chardonnay Riesling Gewürtztraminer
Red_line1 1 -1 0 0 0 0
Red_line2 0 1 -1 0 0 0
"
### Note: there are two lines of contrasts for a group of three
treatments
### The column names
match the order of levels of the treatment variable
### The coefficients of each row sum to 0
Matriz = as.matrix(read.table(textConnection(Input),
header=TRUE,
row.names=1))
Matriz
library(multcomp)
G = glht(model, linfct = mcp(Treatment = Matriz))
G$linfct
summary(G,
test = Ftest())
Global Test:
F DF1 DF2 Pr(>F)
1 24.3 2 12 6.029e-05
### Note that two lines of contrasts resulted in one hypothesis test
### using 2 degrees of freedom. This investigated the effect within
### a group of 3 treatments.
Question: Is there an effect within white wine ?
Input = "
Contrast Merlot Cabernet Syrah Chardonnay Riesling Gewürtztraminer
White_line1 0 0 0 1 -1 0
White_line2 0 0 0 0 1 -1
"
### Note: there are two lines of contrasts for a group of three
treatments
### The column names
match the order of levels of the treatment variable
### The coefficients of each row sum to 0
Matriz = as.matrix(read.table(textConnection(Input),
header=TRUE,
row.names=1))
Matriz
library(multcomp)
G = glht(model, linfct = mcp(Treatment = Matriz))
G$linfct
summary(G,
test = Ftest())
Global Test:
F DF1 DF2 Pr(>F)
1 0.3 2 12 0.7462
### Note that two lines of contrasts resulted in one hypothesis test
### using 2 degrees of freedom. This investigated the effect within
### a group of 3 treatments.
# # #
Question: Is there a difference between red and white wines? And, mean separation for red wine
Input = "
Contrast Merlot Cabernet Syrah Chardonnay Riesling
Gewürtztraminer
Red_vs_white 1 1 1 -1 -1 -1
Merlot_vs_Cab 1 -1 0 0 0 0
Cab_vs_Syrah 0 1 -1 0 0 0
Syrah_vs_Merlot -1 0 1 0 0 0
"
names match the order of levels of the
treatment variable
### The coefficients of each row sum to 0
Matriz = as.matrix(read.table(textConnection(Input),
header=TRUE,
row.names=1))
Matriz
library(multcomp)
G = glht(model,
linfct = mcp(Treatment = Matriz))
G$linfct
summary(G,
test=adjusted("single-step"))
### Adjustment options: "none",
"single-step", "Shaffer",
### "Westfall", "free",
"holm", "hochberg",
### "hommel", "bonferroni",
"BH", "BY", "fdr"
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
Red_vs_white == 0 21.0000 1.4907 14.087 <0.001 ***
Merlot_vs_Cab == 0 -3.0000 0.8607 -3.486 0.0157 *
Cab_vs_Syrah == 0 -3.0000 0.8607 -3.486 0.0156 *
Syrah_vs_Merlot == 0 6.0000 0.8607 6.971 <0.001 ***
(Adjusted p values reported -- single-step method)
### With test=adjusted("none"), results
will be the same as aov method below.
Tests of contrasts within aov
Another method to use single-degree-of-freedom contrasts within an anova is to use the split option within the summary function for an aov analysis. The number of degrees of freedom that a factor can be split into for contrast tests is limited.
Input =("
Treatment Response
'D1:C1' 1.0
'D1:C1' 1.2
'D1:C1' 1.3
'D1:C2' 2.1
'D1:C2' 2.2
'D1:C2' 2.3
'D2:C1' 1.4
'D2:C1' 1.6
'D2:C1' 1.7
'D2:C2' 2.5
'D2:C2' 2.6
'D2:C2' 2.8
'Control' 1.0
'Control' 0.9
'Control' 0.8
")
Data = read.table(textConnection(Input),header=TRUE)
Data$Treatment = factor(Data$Treatment, levels=unique(Data$Treatment))
### Specify the order of factor
levels. Otherwise R will alphabetize them.
Data
boxplot(Response ~ Treatment,
data = Data,
ylab="Response",
xlab="Treatment")
levels(Data$Treatment)
### You need to look at order of factor
levels to determine the contrasts
[1] "D1:C1" "D1:C2" "D2:C1"
"D2:C2" "Control"
### Define contrasts
D1vsD2 = c(1, 1, -1, -1, 0)
C1vsC2 = c(1, -1, 1, -1, 0)
InteractionDC = c(1, -1, -1, 1, 0)
TreatsvsControl = c(1, 1, 1, 1, -4)
Matriz = cbind(D1vsD2, C1vsC2,
InteractionDC, TreatsvsControl)
contrasts(Data$Treatment) = Matriz
CList = list("D1vsD2" = 1,
"C1vsC2" = 2,
"InteractionDC" = 3,
"TreatsvsControl" = 4)
### Define model and display summary
model = aov(Response ~ Treatment, data = Data)
summary(model,
split=list(Treatment=CList))
Df Sum Sq Mean Sq F value
Pr(>F)
Treatment 4 6.189 1.547 85.963 1.06e-07 ***
Treatment: D1vsD2 1 0.521 0.521 28.935 0.00031 ***
Treatment: C1vsC2 1 3.307 3.307 183.750 9.21e-08 ***
Treatment: InteractionDC 1 0.001 0.001 0.046 0.83396
Treatment: TreatsvsControl 1 2.360 2.360 131.120 4.53e-07 ***
Residuals 10 0.180 0.018
# # #