[banner]

An R Companion for the Handbook of Biological Statistics

Salvatore S. Mangiafico

Contrasts in Linear Models

Contrasts within linear models

Contrasts can be used to make specific comparisons of treatments within a linear model.

 

One common use is when a factorial design is used, but control or check treatments are used in addition to the factorial design.  In the first example below, there are two treatments (D and C) each at two levels (1 and 2), and then there is a Control treatment. The approach used here is to analyze the experiment as a one-way analysis of variance, and then use contrasts to test various hypotheses.

 

Another common use is when there are several treatments that could be thought of as members of a group.  In the second example below,  there are measurements for six wines, some of which are red (Merlot, Cabernet, Syrah) and some of which are white (Chardonnay, Riesling, Gewürztraminer).  We can compare the means for treatments within the red wine group by setting up contrasts and conducting a global F-test.  This is analogous to testing the main effect of Red Wine.

 

Tests of contrasts with multcomp

The package multcomp allows for unlimited tests of single-degree contrasts, with a p-value correction for multiple tests.

 

It also allows for a global F-Test for multiple contrasts, for example when testing within groups.

 

See the chapters on One-way Anova and Two-way Anova for general considerations on conducting analysis of variance

 

Example for single degree-of-freedom contrasts

This hypothetical example could represent an experiment with a factorial design two treatments (D and C) each at two levels (1 and 2), and a control treatment.  The 2-by-2 factorial plus control is treated as a one-way anova with five treatments.

 

### --------------------------------------------------------------
### Tests of contrasts with multcomp, hypothetical example
### --------------------------------------------------------------


Input = ("
Treatment   Response
 'D1:C1'    1.0
 'D1:C1'    1.2
 'D1:C1'    1.3
 'D1:C2'    2.1
 'D1:C2'    2.2
 'D1:C2'    2.3
 'D2:C1'    1.4
 'D2:C1'    1.6
 'D2:C1'    1.7
 'D2:C2'    2.5
 'D2:C2'    2.6
 'D2:C2'    2.8
 'Control'  1.0
 'Control'  0.9
 'Control'  0.8
")

Data = read.table(textConnection(Input),header=TRUE)


### Specify the order of factor levels. Otherwise R will alphabetize them.


Data$Treatment = factor(Data$Treatment,
                        levels=unique(Data$Treatment))


Data


boxplot(Response ~ Treatment,
        data = Data,
        ylab="Response",
        xlab="Treatment")



###  You need to look at order of factor levels to determine the contrasts

levels(Data$Treatment)     

[1] "D1:C1"   "D1:C2"   "D2:C1"   "D2:C2"   "Control"

 

###  Define linear model

model = lm(Response ~ Treatment,
           data = Data)

library(car)

Anova(model, type="II")

summary(model)

###  Define contrasts and produce results

Input = ("
Contrast.Name     D1C2  D1C2 D2C1 D2C2  Control
 D1vsD2            1     1   -1   -1     0
 C1vsC2            1    -1    1   -1     0
 InteractionDC     1    -1   -1    1     0
 C1vsC2forD1only   1    -1    0    0     0
 C1vsC2forD2only   0     0    1   -1     0
 TreatsvsControl   1     1    1    1    -4
 T1vsC             1     0    0    0    -1
 T2vsC             0     1    0    0    -1
 T3vsC             0     0    1    0    -1
 T4vsC             0     0    0    1    -1
")

   ### The column names match the order of levels of the treatment variable

   ### The coefficients of each row sum to 0


Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))

Matriz


library(multcomp)

G = glht(model,
         linfct = mcp(Treatment = Matriz))

G$linfct

summary(G,
        test=adjusted("single-step"))

   ### Adjustment options: "none", "single-step", "Shaffer",
   ###                     "Westfall", "free", "holm", "hochberg",
   ###                     "hommel", "bonferroni", "BH", "BY", "fdr"


                     Estimate Std. Error t value Pr(>|t|)   

D1vsD2 == 0          -0.83333    0.15492  -5.379  0.00218 **

C1vsC2 == 0          -2.10000    0.15492 -13.555  < 0.001 ***

InteractionDC == 0    0.03333    0.15492   0.215  0.99938   

C1vsC2forD1only == 0 -1.03333    0.10954  -9.433  < 0.001 ***

C1vsC2forD2only == 0 -1.06667    0.10954  -9.737  < 0.001 ***

TreatsvsControl == 0  3.96667    0.34641  11.451  < 0.001 ***

T1vsC == 0            0.26667    0.10954   2.434  0.17428   

T2vsC == 0            1.30000    0.10954  11.867  < 0.001 ***

T3vsC == 0            0.66667    0.10954   6.086  < 0.001 ***

T4vsC == 0            1.73333    0.10954  15.823  < 0.001 ***


### With test=adjusted("none"), results will be the same as aov method below.

 

#     #     #

 

 

Example for global F-test within a group of treatments

This example has treatments consisting of three red wines and three white wines.  We will want to know if there is an effect of the treatments in the red wine group on the response variable, while keeping the individual identities of the wines in the Treatment variable.  This approach is advantageous because post-hoc comparisons could still be made within the red wines, for example comparing Merlot to Cabernet.

 

### --------------------------------------------------------------
### Tests of contrasts with multcomp, hypothetical example 2
### --------------------------------------------------------------


Input = ("
Treatment          Response
 Merlot             5
 Merlot             6
 Merlot             7
 Cabernet           8
 Cabernet           9
 Cabernet          10
 Syrah             11
 Syrah             12
 Syrah             13

 Chardonnay         1
 Chardonnay         2
 Chardonnay         3
 Riesling           1
 Riesling           2
 Riesling           2
 Gewürtztraminer    1
 Gewürtztraminer    2
 Gewürtztraminer    4
")

Data = read.table(textConnection(Input),header=TRUE)


### Specify the order of factor levels. Otherwise R will alphabetize them.

Data$Treatment = factor(Data$Treatment,
                        levels=unique(Data$Treatment))


Data


boxplot(Response ~ Treatment,
        data = Data,
        ylab="Response",
        xlab="Treatment")





###  You need to look at order of factor levels to determine the contrasts

levels(Data$Treatment)

[1] "Merlot"  "Cabernet"  "Syrah"  "Chardonnay"  "Riesling"  "Gewürtztraminer"


###  Define linear model

model = lm(Response ~ Treatment,
           data = Data)

library(car)

Anova(model, type="II")

summary(model)


Question: Is there an effect within red wine ?


Input = "
Contrast    Merlot  Cabernet  Syrah  Chardonnay  Riesling  Gewürtztraminer
 Red_line1  1       -1         0     0           0         0
 Red_line2  0        1        -1     0           0         0
"

  
### Note: there are two lines of contrasts for a group of three treatments

   ### The column names match the order of levels of the treatment variable
   ### The coefficients of each row sum to 0


Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))


Matriz

library(multcomp)

G = glht(model, linfct = Matriz)

G$linfct


summary(G,
        test = Ftest())


Global Test:
      F  DF1  DF2    Pr(>F)
1  6.75    2   12   0.01086

### Note that two lines of contrasts resulted in one hypothesis test
###   using 2 degrees of freedom.  This investigated the effect within

###   a group of 3 treatments.


Question: Is there an effect within white wine ?


Input = "
Contrast    Merlot  Cabernet  Syrah  Chardonnay  Riesling  Gewürtztraminer
 Red_line1  0       0         0      1           -1          0
 Red_line2  0       0         0      0            1         -1
"

  
### Note: there are two lines of contrasts for a group of three treatments

   ### The column names match the order of levels of the treatment variable
   ### The coefficients of each row sum to 0



Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))


Matriz

library(multcomp)

G = glht(model, linfct = Matriz)

G$linfct


summary(G,
        test = Ftest())


Global Test:
    F DF1 DF2 Pr(>F)
1 0.3   2  12 0.7462

### Note that two lines of contrasts resulted in one hypothesis test
###   using 2 degrees of freedom.  This investigated the effect within

###   a group of 3 treatments.

 

#     #     #

 

 

Question: Is there a difference between red and white wines?  And, mean separation for red wine


Input = "
Contrast          Merlot  Cabernet  Syrah  Chardonnay  Riesling  Gewürtztraminer
 Red_vs_white      1        1         1     -1          -1        -1
 Merlot_vs_Cab     1       -1         0      0           0         0
 Cab_vs_Syrah      0        1        -1      0           0         0      
 Syrah_vs_Merlot  -1        0         1      0           0         0         
"


   ### The column names match the order of levels of the treatment variable
   ### The coefficients of each row sum to 0



Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))

Matriz
library(multcomp)

G = glht(model,
         linfct = mcp(Treatment = Matriz))

G$linfct

summary(G,
        test=adjusted("single-step"))

   ### Adjustment options: "none", "single-step", "Shaffer",
   ###                     "Westfall", "free", "holm", "hochberg",
   ###                     "hommel", "bonferroni", "BH", "BY", "fdr"


Linear Hypotheses:
                     Estimate Std. Error t value Pr(>|t|)   
Red_vs_white == 0     21.0000     1.4907  14.087   <0.001 ***

Merlot_vs_Cab == 0    -3.0000     0.8607  -3.486   0.0157 * 
Cab_vs_Syrah == 0     -3.0000     0.8607  -3.486   0.0156 * 
Syrah_vs_Merlot == 0   6.0000     0.8607   6.971   <0.001 ***

(Adjusted p values reported -- single-step method)


#     #     #

Tests of contrasts within aov

Another method to use single-degree-of-freedom contrasts within an anova is to use the split option within the summary function for an aov analysis.  The number of degrees of freedom that a factor can be split into for contrast tests is limited.

 

### --------------------------------------------------------------
### Tests of contrasts within aov, hypothetical example
### --------------------------------------------------------------


Input =("
Treatment   Response
 'D1:C1'    1.0
 'D1:C1'    1.2
 'D1:C1'    1.3
 'D1:C2'    2.1
 'D1:C2'    2.2
 'D1:C2'    2.3
 'D2:C1'    1.4
 'D2:C1'    1.6
 'D2:C1'    1.7
 'D2:C2'    2.5
 'D2:C2'    2.6
 'D2:C2'    2.8
 'Control'  1.0
 'Control'  0.9
 'Control'  0.8
")

Data = read.table(textConnection(Input),header=TRUE)

Data$Treatment = factor(Data$Treatment, levels=unique(Data$Treatment))

 

   ### Specify the order of factor levels. Otherwise R will alphabetize them.

Data

boxplot(Response ~ Treatment,
        data = Data,
        ylab="Response",
        xlab="Treatment")






levels(Data$Treatment)

     ###  You need to look at order of factor levels to determine the contrasts

[1] "D1:C1"   "D1:C2"   "D2:C1"   "D2:C2"   "Control"

 

###  Define contrasts

D1vsD2 =          c(1,  1, -1, -1,  0)
C1vsC2 =          c(1, -1,  1, -1,  0)
InteractionDC =   c(1, -1, -1,  1,  0)
TreatsvsControl = c(1,  1,  1,  1, -4)

Matriz = cbind(D1vsD2, C1vsC2,
               InteractionDC, TreatsvsControl)

contrasts(Data$Treatment) = Matriz

CList = list("D1vsD2" = 1,
             "C1vsC2" = 2,
             "InteractionDC" = 3,
             "TreatsvsControl" = 4)


###  Define model and display summary

model = aov(Response ~ Treatment, data = Data)

summary(model,
        split=list(Treatment=CList))

                             Df Sum Sq Mean Sq F value   Pr(>F)   

Treatment                     4  6.189   1.547  85.963 1.06e-07 ***

  Treatment: D1vsD2           1  0.521   0.521  28.935  0.00031 ***

  Treatment: C1vsC2           1  3.307   3.307 183.750 9.21e-08 ***

  Treatment: InteractionDC    1  0.001   0.001   0.046  0.83396   

  Treatment: TreatsvsControl  1  2.360   2.360 131.120 4.53e-07 ***

Residuals                    10  0.180   0.018                    

 

#     #     #