Comparisons of estimated marginal means can be used to assess custom contrasts. These might include comparisons among groups of treatments or a linear trend across treatments.
For a couple of simple examples, we’ll revisit the midichlorians data from the Estimated Marginal Means for Multiple Comparisons chapter. And use a simple model of Midichlorians vs. Location. There is a relevant plot at the end of the chapter.
Data = read.table(header=TRUE, stringsAsFactors=TRUE, text="
Tribe Location Midichlorians
Jedi Olympia 10
Sith Olympia 4
Jedi Olympia 12
Sith Olympia 5
Jedi Olympia 15
Sith Olympia 4
Jedi Ventura 15
Sith Ventura 9
Jedi Ventura 15
Sith Ventura 11
Jedi Ventura 18
Sith Ventura 12
Jedi Northampton 8
Sith Northampton 13
Jedi Northampton 8
Sith Northampton 15
Jedi Northampton 10
Sith Northampton 17
Jedi Burlington 22
Sith Burlington 22
Jedi Burlington 20
Sith Burlington 22
Jedi Burlington 20
Sith Burlington 25
")
Custom contrasts to compare groups of treatments
Imagine we want to compare the midichlorian counts for locations on the east coast of the U.S. and the west coast of the U.S. With the data we have, that would entail comparing (Burlington and Northampton) vs. (Olympia and Ventura).
Let’s also compare locations in states beginning with letters A–M, and locations in states beginning with letters N–Z. So, (Burlington and Olympia) vs. (Northampton and Ventura).
For more complete and varied examples for this kind of analysis, see rcompanion.org/rcompanion/h_01.html or the Post-hoc Contrasts in Models chapter in the An R Companion for the Handbook of Biological Statistics book.
First, we’ll need to look at the order of factors in our data. This will tell us the order of coefficients for the contrasts.
levels(Data$Location)
"Burlington" "Northampton" "Olympia"
"Ventura"
If we wanted to use a different order, we could re-order the factors.
Data$Location = factor(Data$Location,
levels=c("Burlington",
"Northampton", "Olympia", "Ventura"))
levels(Data$Location)
"Burlington" "Northampton" "Olympia"
"Ventura"
Create model and test the contrasts
model = lm(Midichlorians ~ Location, data = Data)
library(emmeans)
marginal = emmeans(model, ~ Location)
Contrasts = list(EastCoastWestCoast = c( 1, 1, -1, -1),
StatesNZStatesAM = c( 1, -1, 1, -1))
Test = contrast(marginal, Contrasts)
test(Test)
contrast estimate SE df t.ratio p.value
EastCoastWestCoast 12 2.89 20 4.154 0.0005
StatesNZStatesAM 5 2.89 20 1.731 0.0989
Here, the results suggest that there is a significant difference between east coast locations (Burlington and Northampton) and west coast locations (Olympia and Ventura). This makes sense given the plot below.
However, the other contrast we tested — (Burlington and Olympia) vs. (Northampton and Ventura) — was not significant.
Polynomial contrasts as a test of trend
As a different flavor of this analysis, imagine we want to see if there an increasing or decreasing trend across the locations in alphabetical order. This is a useful technique when the independent variable is a categorical variable that can be ordered. This variable is treated as categorical, but we also examine trends treating it as an ordinal variable.
For more complete examples for this kind of analysis, see rcompanion.org/rcompanion/h_03.html or the Polynomial Contrasts in Models chapter in the An R Companion for the Handbook of Biological Statistics book.
First, we’ll need to look at the order of factors in our data. This will tell us the order of coefficients for the contrasts.
levels(Data$Location)
"Burlington" "Northampton" "Olympia"
"Ventura"
If we wanted to use a different order, we could re-order the factors.
Data$Location = factor(Data$Location,
levels=c("Burlington",
"Northampton", "Olympia", "Ventura"))
levels(Data$Location)
"Burlington" "Northampton" "Olympia"
"Ventura"
Create model and test the contrasts
model = lm(Midichlorians ~ Location, data = Data)
library(emmeans)
marginal = emmeans(model, ~ Location)
Contrasts = list(Linear = c( -3, -1, 1, 3),
Quadratic = c( 1, -1, -1, 1),
Cubic = c( -1, 3, -3, 1))
Test = contrast(marginal, Contrasts)
test(Test)
contrast estimate SE df t.ratio p.value
Linear -29 6.46 20 -4.490 0.0002
Quadratic 15 2.89 20 5.193 <.0001
Cubic 2 6.46 20 0.310 0.76
Here, the results suggest that there is a significant linear trend across Locations in alphabetical order. Looking at the plot below, it’s clear that there is a decrease in Midichlorians as alphabetical Location increases.
Also, there is a significant quadratic trend. In this case, there is the suggestion that the mean for Ventura is greater than that for Olympia, suggesting it begins an upward trend for Locations later in alphabetical order.
Plot
Out = as.data.frame(emmeans(model, ~ Location))
library(ggplot2)
png(filename = "MidichlorianPlot.png",
width = 5,
height = 3.75,
units = "in",
res = 300)
ggplot(Out,
aes(x = Location,
y = emmean)) +
geom_point(shape = 15,
size = 4) +
geom_errorbar(aes(ymin = lower.CL,
ymax = upper.CL),
width = 0.1,
size = 0.7) +
theme_bw() +
theme(axis.title = element_text(face = "bold"),
axis.text = element_text(face = "bold"),
plot.caption = element_text(hjust = 0)) +
xlab("\nLocation") +
ylab("Estimated marginal mean\nmidichlorian count\n")
dev.off()
