The problem with multiple comparisons
See the Handbook for information on this topic. Also see sections of this book with the terms “multiple comparisons”, “Tukey”, “pairwise”, “post-hoc”, “p.adj”, “p.adjust”, “p.method”, or “adjust”.
Controlling the familywise error rate: Bonferroni correction
An example is shown below in the “How to do the tests” section.
Controlling the false discovery rate: Benjamini–Hochberg procedure
An example is shown below in the “How to do the tests” section.
Assumption
When not to correct for multiple comparisons
See the Handbook for information on these topics.
How to do the tests
R has built-in methods to adjust a series of p-values, either to control the family-wise error rate or to control the false discovery rate.
The methods Holm, Hochberg, Hommel, and Bonferroni control the family-wise error rate. These methods attempt to limit the probability of even one false discovery (a type I error, incorrectly rejecting the null hypothesis when there is no real effect), and so are all relatively strong (conservative).
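To make the mechanics of two of these familywise adjustments concrete, they can be sketched in a few lines of Python. This is an illustrative re-implementation of the standard Bonferroni and Holm definitions, not the code R uses internally, although `p.adjust` produces the same values for these inputs:

```python
# Illustrative re-implementations of two familywise error rate
# adjustments (the same definitions p.adjust uses, not R's code).

def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capping at 1."""
    n = len(pvals)
    return [min(1.0, p * n) for p in pvals]

def holm(pvals):
    """Step-down Holm adjustment: take the p-values in ascending
    order, multiply the p-value at rank i (0-based) by (n - i),
    and enforce monotonicity with a running maximum."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adj = [0.0] * n
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (n - rank) * pvals[i]))
        adj[i] = running_max
    return adj

raw = [0.001, 0.01, 0.025, 0.05, 0.1]
print([round(p, 4) for p in bonferroni(raw)])  # [0.005, 0.05, 0.125, 0.25, 0.5]
print([round(p, 4) for p in holm(raw)])        # [0.005, 0.04, 0.075, 0.1, 0.1]
```

Note that Holm is never more conservative than Bonferroni, which is why it is often preferred when strict familywise control is wanted.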
The methods BH (Benjamini–Hochberg, which is the same as FDR in R) and BY control the false discovery rate. These methods attempt to control the expected proportion of false discoveries.
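The step-up BH procedure can likewise be sketched in a few lines of Python. Again, this is an illustrative re-implementation of the standard definition rather than R's internal code; the BY adjustment is identical except that each value is additionally multiplied by the constant 1 + 1/2 + … + 1/n:

```python
# Illustrative re-implementation of the Benjamini-Hochberg
# false discovery rate adjustment (same definition as p.adjust
# with method = "BH", not R's code).

def bh(pvals):
    """Step-up BH adjustment: take the p-values in ascending order,
    multiply the p-value at rank i (1-based) by n/i, and enforce
    monotonicity with a running minimum, working from the largest
    p-value downward."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adj = [0.0] * n
    running_min = 1.0
    for rank in range(n - 1, -1, -1):
        i = order[rank]
        running_min = min(running_min, pvals[i] * n / (rank + 1))
        adj[i] = running_min
    return adj

raw = [0.001, 0.01, 0.025, 0.05, 0.1]
print([round(p, 5) for p in bh(raw)])  # [0.005, 0.025, 0.04167, 0.0625, 0.1]
```

These values match the BH column in the five-p-value example later in this chapter, and they illustrate why FDR methods keep more results significant than Bonferroni does for the same raw p-values.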
For more information on these methods, see ?p.adjust or other resources.
Note that these methods require only the p-values to adjust and the number of p-values being compared. This is different from methods such as Tukey or Dunnett, which also require the variability of the underlying data. Tukey and Dunnett are considered familywise error rate methods.
To get some sense of how conservative these different adjustments are, see the two plots below in this chapter.
There is no definitive advice on which p-value adjustment method to use. In general, choose a method that will be familiar to your audience or common in your field of study. There may also be some logic to how you balance the probability of making a type I error relative to a type II error. For example, in a preliminary study, you might want to retain as many significant values as possible so that potentially important factors are not excluded from future studies. On the other hand, in a medical study where people’s lives are at stake and very expensive treatments are being considered, you would want a very high level of certainty before concluding that one treatment is better than another.
Multiple comparisons example with 25 p-values
### --------------------------------------------------------------
### Multiple comparisons example, p. 262–263
### --------------------------------------------------------------
Input = ("
Food Raw.p
Blue_fish .34
Bread .594
Butter .212
Carbohydrates .384
Cereals_and_pasta .074
Dairy_products .94
Eggs .275
Fats .696
Fruit .269
Legumes .341
Nuts .06
Olive_oil .008
Potatoes .569
Processed_meat .986
Proteins .042
Red_meat .251
Semi-skimmed_milk .942
Skimmed_milk .222
Sweets .762
Total_calories .001
Total_meat .975
Vegetables .216
White_fish .205
White_meat .041
Whole_milk .039
")
Data = read.table(textConnection(Input),header=TRUE)
### Order data by p-value
Data = Data[order(Data$Raw.p),]
### Check if data is ordered the way we intended
library(FSA)
headtail(Data)
Food Raw.p
20 Total_calories 0.001
12 Olive_oil 0.008
25 Whole_milk 0.039
17 Semi-skimmed_milk 0.942
21 Total_meat 0.975
14 Processed_meat 0.986
### Perform p-value adjustments and add to data frame
Data$Bonferroni =
p.adjust(Data$Raw.p,
method = "bonferroni")
Data$BH =
p.adjust(Data$Raw.p,
method = "BH")
Data$Holm =
p.adjust(Data$Raw.p,
method = "holm")
Data$Hochberg =
p.adjust(Data$Raw.p,
method = "hochberg")
Data$Hommel =
p.adjust(Data$Raw.p,
method = "hommel")
Data$BY =
p.adjust(Data$Raw.p,
method = "BY")
Data
Food Raw.p Bonferroni BH Holm Hochberg Hommel BY
20 Total_calories 0.001 0.025 0.0250000 0.025 0.025 0.025 0.09539895
12 Olive_oil 0.008 0.200 0.1000000 0.192 0.192 0.192 0.38159582
25 Whole_milk 0.039 0.975 0.2100000 0.897 0.882 0.682 0.80135122
24 White_meat 0.041 1.000 0.2100000 0.902 0.882 0.697 0.80135122
15 Proteins 0.042 1.000 0.2100000 0.902 0.882 0.714 0.80135122
11 Nuts 0.060 1.000 0.2500000 1.000 0.986 0.840 0.95398954
5 Cereals_and_pasta 0.074 1.000 0.2642857 1.000 0.986 0.962 1.00000000
23 White_fish 0.205 1.000 0.4910714 1.000 0.986 0.986 1.00000000
3 Butter 0.212 1.000 0.4910714 1.000 0.986 0.986 1.00000000
22 Vegetables 0.216 1.000 0.4910714 1.000 0.986 0.986 1.00000000
18 Skimmed_milk 0.222 1.000 0.4910714 1.000 0.986 0.986 1.00000000
16 Red_meat 0.251 1.000 0.4910714 1.000 0.986 0.986 1.00000000
9 Fruit 0.269 1.000 0.4910714 1.000 0.986 0.986 1.00000000
7 Eggs 0.275 1.000 0.4910714 1.000 0.986 0.986 1.00000000
1 Blue_fish 0.340 1.000 0.5328125 1.000 0.986 0.986 1.00000000
10 Legumes 0.341 1.000 0.5328125 1.000 0.986 0.986 1.00000000
4 Carbohydrates 0.384 1.000 0.5647059 1.000 0.986 0.986 1.00000000
13 Potatoes 0.569 1.000 0.7815789 1.000 0.986 0.986 1.00000000
2 Bread 0.594 1.000 0.7815789 1.000 0.986 0.986 1.00000000
8 Fats 0.696 1.000 0.8700000 1.000 0.986 0.986 1.00000000
19 Sweets 0.762 1.000 0.9071429 1.000 0.986 0.986 1.00000000
6 Dairy_products 0.940 1.000 0.9860000 1.000 0.986 0.986 1.00000000
17 Semi-skimmed_milk 0.942 1.000 0.9860000 1.000 0.986 0.986 1.00000000
21 Total_meat 0.975 1.000 0.9860000 1.000 0.986 0.986 1.00000000
14 Processed_meat 0.986 1.000 0.9860000 1.000 0.986 0.986 1.00000000
Plot
X = Data$Raw.p
Y = cbind(Data$Bonferroni,
Data$BH,
Data$Holm,
Data$Hochberg,
Data$Hommel,
Data$BY)
matplot(X, Y,
xlab="Raw p-value",
ylab="Adjusted p-value",
type="l",
asp=1,
col=1:6,
lty=1,
lwd=2)
legend('bottomright',
legend = c("Bonferroni", "BH", "Holm",
"Hochberg", "Hommel", "BY"),
col = 1:6,
cex = 1,
pch = 16)
abline(0, 1,
col=1,
lty=2,
lwd=1)
Plot of adjusted p-values vs. raw p-values for a series of 25 p-values. The dashed line represents a one-to-one line.
# # #
Multiple comparisons example with five p-values
### --------------------------------------------------------------
### Multiple comparisons example, hypothetical example
### --------------------------------------------------------------
Input = ("
Factor Raw.p
A .001
B .01
C .025
D .05
E .1
")
Data = read.table(textConnection(Input),header=TRUE)
### Perform p-value adjustments and add to data frame
Data$Bonferroni =
p.adjust(Data$Raw.p,
method = "bonferroni")
Data$BH =
signif(p.adjust(Data$Raw.p,
method = "BH"),
4)
Data$Holm =
p.adjust(Data$Raw.p,
method = "holm")
Data$Hochberg =
p.adjust(Data$Raw.p,
method = "hochberg")
Data$Hommel =
p.adjust(Data$Raw.p,
method = "hommel")
Data$BY =
signif(p.adjust(Data$Raw.p,
method = "BY"),
4)
Data
Factor Raw.p Bonferroni BH Holm Hochberg Hommel BY
1 A 0.001 0.005 0.00500 0.005 0.005 0.005 0.01142
2 B 0.010 0.050 0.02500 0.040 0.040 0.040 0.05708
3 C 0.025 0.125 0.04167 0.075 0.075 0.075 0.09514
4 D 0.050 0.250 0.06250 0.100 0.100 0.100 0.14270
5 E 0.100 0.500 0.10000 0.100 0.100 0.100 0.22830
Plot
X = Data$Raw.p
Y = cbind(Data$Bonferroni,
Data$BH,
Data$Holm,
Data$Hochberg,
Data$Hommel,
Data$BY)
matplot(X, Y,
xlab="Raw p-value",
ylab="Adjusted p-value",
type="l",
asp=1,
col=1:6,
lty=1,
lwd=2)
legend('bottomright',
legend = c("Bonferroni", "BH", "Holm",
"Hochberg", "Hommel", "BY"),
col = 1:6,
cex = 1,
pch = 16)
abline(0, 1,
col=1,
lty=2,
lwd=1)
Plot of adjusted p-values vs. raw p-values for a series of five p-values between 0 and 0.1. Note that Holm and Hochberg have the same values as Hommel, and so are hidden by Hommel. The dashed line represents a one-to-one line.
# # #