The Cochran–Mantel–Haenszel test is an extension of the chi-square test of association. It tests for an association between two nominal variables across several chi-square tables, where each table comes from a different group or time. The data are stratified so that each chi-square table is within one group or time.
The following data investigate whether there is a link between listening to podcasts and using public transportation to get to work, collected across three cities. This can be thought of as a 2 x 2 contingency table in each of the three cities.
Data can be arranged in a table of counts or can be arranged in long-format with or without counts. If a table is used for input, it should follow R’s ftable format, as shown.
Table format
                     City Bikini.Bottom Frostbite.Falls New.New.York
Listen     Transport
Podcast    Drive                     13              17            5
           Public                    27              25           27
No.podcast Drive                     23              22           17
           Public                    44              31           22
Long-format with counts
City Listen Transport Count
Bikini.Bottom Podcast Drive 13
Bikini.Bottom Podcast Public 27
Bikini.Bottom No.podcast Drive 23
Bikini.Bottom No.podcast Public 44
Frostbite.Falls Podcast Drive 17
Frostbite.Falls Podcast Public 25
Frostbite.Falls No.podcast Drive 22
Frostbite.Falls No.podcast Public 31
New.New.York Podcast Drive 5
New.New.York Podcast Public 27
New.New.York No.podcast Drive 17
New.New.York No.podcast Public 22
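As a minimal sketch of how the two layouts relate, the long-format counts above can be cross-tabulated with xtabs and displayed with ftable, and as.data.frame reverses the step. The object names Podcast, PodcastTable, and PodcastLong are chosen here for illustration only.

```r
### Podcast data from above, long format with counts.
### Object names (Podcast, PodcastTable, PodcastLong) are illustrative only.

Podcast = read.table(header=TRUE, text="
City             Listen      Transport  Count
Bikini.Bottom    Podcast     Drive      13
Bikini.Bottom    Podcast     Public     27
Bikini.Bottom    No.podcast  Drive      23
Bikini.Bottom    No.podcast  Public     44
Frostbite.Falls  Podcast     Drive      17
Frostbite.Falls  Podcast     Public     25
Frostbite.Falls  No.podcast  Drive      22
Frostbite.Falls  No.podcast  Public     31
New.New.York     Podcast     Drive       5
New.New.York     Podcast     Public     27
New.New.York     No.podcast  Drive      17
New.New.York     No.podcast  Public     22
")

### Keep factor levels in the order entered rather than alphabetical

Podcast$City      = factor(Podcast$City,      levels=unique(Podcast$City))
Podcast$Listen    = factor(Podcast$Listen,    levels=unique(Podcast$Listen))
Podcast$Transport = factor(Podcast$Transport, levels=unique(Podcast$Transport))

### Long format to table; the grouping variable goes last

PodcastTable = xtabs(Count ~ Listen + Transport + City,
                     data=Podcast)

ftable(PodcastTable)

### Table back to long format; the counts are returned in a column named Freq

PodcastLong = as.data.frame(PodcastTable)

PodcastLong
```

Either object can then be carried forward: the table for the test itself, the long format for plotting or summaries.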
The test can be conducted with the mantelhaen.test function in the native stats package.
One assumption of the test is that there are no three-way interactions in the data. This assumption is supported by a non-significant result from a test such as the Woolf test or Breslow–Day test.
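The idea behind this assumption can be illustrated by computing the odds ratio within each stratum: roughly similar odds ratios across strata are consistent with no three-way interaction. The following is a sketch in base R using the podcast data above. The object names are illustrative, and a formal check would still rely on a test such as the Woolf test.

```r
### Podcast counts entered as a 2 x 2 x 3 array, with City as the strata.
### Object names (PodcastTable, OddsRatio) are illustrative only.

PodcastTable = array(c(13, 23, 27, 44,
                       17, 22, 25, 31,
                        5, 17, 27, 22),
                     dim = c(2, 2, 3),
                     dimnames = list(
                        Listen    = c("Podcast", "No.podcast"),
                        Transport = c("Drive", "Public"),
                        City      = c("Bikini.Bottom", "Frostbite.Falls",
                                      "New.New.York")))

### Odds ratio within each city: (a * d) / (b * c)

OddsRatio = apply(PodcastTable, 3,
                  function(m) (m[1,1] * m[2,2]) / (m[1,2] * m[2,1]))

OddsRatio

### C-M-H test across the three cities

mantelhaen.test(PodcastTable)
```

If one stratum shows an odds ratio very different from the others, that is a hint to run a homogeneity test before relying on the pooled result.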
Post-hoc analysis can include looking at the individual chi-square, Fisher exact, or G-test for association for each time or group.
The component n x n tables can be 2 x 2 or larger.
Appropriate data
• Three nominal variables with two or more levels each.
• Data can be stratified as n x n tables, with the third variable being a time or grouping variable.
Hypotheses
• Null hypothesis: There is no association between the two inner variables.
• Alternative hypothesis (two-sided): There is an association between the two inner variables.
Interpretation
Significant results can be reported as “There was a significant association between variable A and variable B [across groups].”
Post-hoc analysis
Post-hoc analysis can include looking at the individual chi-square, Fisher exact, or G-test for association for each time or group.
Packages used in this chapter
The packages used in this chapter include:
• psych
• vcd
• DescTools
• rcompanion
The following commands will install these packages if they are not already installed:
if(!require(psych)){install.packages("psych")}
if(!require(vcd)){install.packages("vcd")}
if(!require(DescTools)){install.packages("DescTools")}
if(!require(rcompanion)){install.packages("rcompanion")}
C–M–H test example: long-format with counts
Alexander Anderson is concerned that there is a bias in his teaching methods for his pesticide applicator’s course. He wants to know if there is an association between students’ sex and passing the course across the four counties in which he teaches. The following are his data.
Input = ("
County Sex Result Count
Bloom Female Pass 9
Bloom Female Fail 5
Bloom Male Pass 7
Bloom Male Fail 17
Cobblestone Female Pass 11
Cobblestone Female Fail 4
Cobblestone Male Pass 9
Cobblestone Male Fail 21
Dougal Female Pass 9
Dougal Female Fail 7
Dougal Male Pass 19
Dougal Male Fail 9
Heimlich Female Pass 15
Heimlich Female Fail 8
Heimlich Male Pass 14
Heimlich Male Fail 17
")
Data = read.table(textConnection(Input),header=TRUE)
### Order factors otherwise R will alphabetize them
Data$County = factor(Data$County,
                     levels=unique(Data$County))
Data$Sex    = factor(Data$Sex,
                     levels=unique(Data$Sex))
Data$Result = factor(Data$Result,
                     levels=unique(Data$Result))
### Check the data frame
library(psych)
headTail(Data)
str(Data)
summary(Data)
### Remove unnecessary objects
rm(Input)
Convert data to a table
Table = xtabs(Count ~ Sex + Result + County,
              data=Data)
### Note that the grouping variable is last in the xtabs function
ftable(Table)                     # Display a flattened table
              County Bloom Cobblestone Dougal Heimlich
Sex    Result
Female Pass              9          11      9       15
       Fail              5           4      7        8
Male   Pass              7           9     19       14
       Fail             17          21      9       17
Cochran–Mantel–Haenszel test
mantelhaen.test(Table)
Mantel-Haenszel chi-squared test with continuity correction
Mantel-Haenszel X-squared = 6.7314, df = 1, p-value = 0.009473
alternative hypothesis: true common odds ratio is not equal to 1
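To make the reported statistic less of a black box, the following sketch recomputes it by hand for the 2 x 2 x K case: the observed count in one cell is compared with its expected value in each stratum, the differences are summed, a continuity correction of 0.5 is subtracted, and the squared result is divided by the summed variances. The table is re-entered with array so that the block runs on its own. The intermediate object names are illustrative.

```r
### Same counts as Table above, entered directly as a 2 x 2 x 4 array

Table = array(c( 9,  7,  5, 17,
                11,  9,  4, 21,
                 9, 19,  7,  9,
                15, 14,  8, 17),
              dim = c(2, 2, 4),
              dimnames = list(
                 Sex    = c("Female", "Male"),
                 Result = c("Pass", "Fail"),
                 County = c("Bloom", "Cobblestone", "Dougal", "Heimlich")))

### Per-county pieces (object names are illustrative)

Observed = Table[1, 1, ]            # Female and Pass count in each county
N        = apply(Table, 3, sum)     # Total count in each county
Row1     = colSums(Table[1, , ])    # Female total in each county
Row2     = colSums(Table[2, , ])    # Male total in each county
Col1     = colSums(Table[, 1, ])    # Pass total in each county
Col2     = colSums(Table[, 2, ])    # Fail total in each county

Expected = Row1 * Col1 / N
Variance = Row1 * Row2 * Col1 * Col2 / (N^2 * (N - 1))

### Statistic with continuity correction, and its p-value on 1 df

Statistic = (abs(sum(Observed - Expected)) - 0.5)^2 / sum(Variance)
P.value   = pchisq(Statistic, df=1, lower.tail=FALSE)

Statistic
P.value

### Compare with the built-in function

mantelhaen.test(Table)$statistic
```

The hand calculation applies only to 2 x 2 strata; for larger component tables, mantelhaen.test uses a generalized form of the statistic.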
Woolf test
library(vcd)
woolf_test(Table)
### Woolf test for homogeneity of odds ratios across strata.
### If significant, C-M-H test is not appropriate
Woolf-test on Homogeneity of Odds Ratios (no 3-Way assoc.)
X-squared = 7.1376, df = 3, p-value = 0.06764
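The Woolf statistic can also be reproduced by hand, which shows what "homogeneity of odds ratios" means in practice: each county's log odds ratio is compared with a weighted average of all of them, with weights equal to the inverse of the estimated variance of the log odds ratio. This is a sketch in base R under the assumption that no cell is zero; some implementations add a small constant to the counts when zeros are present, and that step is omitted here. Object names are illustrative.

```r
### Same counts as Table above, entered directly as a 2 x 2 x 4 array

Table = array(c( 9,  7,  5, 17,
                11,  9,  4, 21,
                 9, 19,  7,  9,
                15, 14,  8, 17),
              dim = c(2, 2, 4),
              dimnames = list(
                 Sex    = c("Female", "Male"),
                 Result = c("Pass", "Fail"),
                 County = c("Bloom", "Cobblestone", "Dougal", "Heimlich")))

### Log odds ratio in each county

LogOR = apply(Table, 3,
              function(m) log((m[1,1] * m[2,2]) / (m[1,2] * m[2,1])))

### Weight = 1 / variance of the log odds ratio,
###   where variance = 1/a + 1/b + 1/c + 1/d

Weight = apply(Table, 3, function(m) 1 / sum(1 / m))

### Weighted average log odds ratio, and weighted squared deviations from it

Pooled = sum(Weight * LogOR) / sum(Weight)
X2     = sum(Weight * (LogOR - Pooled)^2)
DF     = dim(Table)[3] - 1
P      = pchisq(X2, df=DF, lower.tail=FALSE)

exp(LogOR)      # Odds ratio in each county
X2
P
```

Looking at exp(LogOR) alongside the test result shows which counties contribute most to any heterogeneity.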
Post-hoc analysis
The groupwiseCMH function will conduct analysis of the component n x n tables with Fisher exact, G-test, or chi-square tests of association. It accepts only a 3-dimensional table. The group option indicates which dimension should be considered the grouping variable (1, 2, or 3). It will conduct only one type of test at a time; that is, if more than one of the options fisher, gtest, or chisq is set to TRUE, it will conduct only one of them. As usual, method is the p-value adjustment method (see ?p.adjust for options), and digits indicates the number of digits in the output. The correct option is used by the chi-square test function.
library(rcompanion)
groupwiseCMH(Table,
             group   = 3,
             fisher  = TRUE,
             gtest   = FALSE,
             chisq   = FALSE,
             method  = "fdr",
             correct = "none",
             digits  = 3)
        Group   Test p.value  adj.p
1       Bloom Fisher  0.0468 0.0936
2 Cobblestone Fisher  0.0102 0.0408
3      Dougal Fisher  0.5230 0.5230
4    Heimlich Fisher  0.1750 0.2330
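For readers who want to see the steps behind this kind of post-hoc table, the following is a rough base-R equivalent. It assumes the analysis amounts to one fisher.test per stratum followed by p.adjust; it is not the source code of groupwiseCMH, and the object names are illustrative.

```r
### Same counts as Table above, entered directly as a 2 x 2 x 4 array

Table = array(c( 9,  7,  5, 17,
                11,  9,  4, 21,
                 9, 19,  7,  9,
                15, 14,  8, 17),
              dim = c(2, 2, 4),
              dimnames = list(
                 Sex    = c("Female", "Male"),
                 Result = c("Pass", "Fail"),
                 County = c("Bloom", "Cobblestone", "Dougal", "Heimlich")))

### Fisher exact test on each county's 2 x 2 table

P.value = apply(Table, 3, function(m) fisher.test(m)$p.value)

### Adjust the p-values for multiple tests (see ?p.adjust for options)

Adj.p = p.adjust(P.value, method="fdr")

### Collect the results in a data frame

Posthoc = data.frame(Group   = names(P.value),
                     Test    = "Fisher",
                     p.value = signif(P.value, 3),
                     adj.p   = signif(Adj.p, 3),
                     row.names = NULL)

Posthoc
```

Swapping fisher.test for chisq.test inside the apply call would give the chi-square version of the same analysis.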
C–M–H test example: table format
The read.ftable function can be very fussy about the formatting of the input. 1) It seems not to like a blank first line, so the opening double quote in the input should be on the same line as the column names. 2) It doesn’t like leading spaces on the input lines. These may appear when you paste the code into the RStudio Console or R Script area. One solution is to delete these spaces manually. R Script files that are saved without these leading spaces should open and run without further modification.
The Cochran–Mantel–Haenszel test, Woolf test, and post-hoc analysis would be the same as those conducted on Table above.
Input =(
"              County Bloom Cobblestone Dougal Heimlich
Sex    Result
Female Pass              9          11      9       15
       Fail              5           4      7        8
Male   Pass              7           9     19       14
       Fail             17          21      9       17
")
Table = as.table(read.ftable(textConnection(Input)))
ftable(Table)
### Display a flattened table
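One way to see the layout that read.ftable expects is to write an existing table to a file with write.ftable and read it back. The sketch below does this round trip with a temporary file; the object names File and Table2 are illustrative, and the table is re-entered with array so that the block runs on its own.

```r
### Same counts as Table above, entered directly as a 2 x 2 x 4 array

Table = as.table(array(c( 9,  7,  5, 17,
                         11,  9,  4, 21,
                          9, 19,  7,  9,
                         15, 14,  8, 17),
                       dim = c(2, 2, 4),
                       dimnames = list(
                          Sex    = c("Female", "Male"),
                          Result = c("Pass", "Fail"),
                          County = c("Bloom", "Cobblestone", "Dougal",
                                     "Heimlich"))))

### Write the flattened table to a temporary file, without quotes

File = tempfile(fileext=".txt")

write.ftable(ftable(Table), file=File, quote=FALSE)

### Show the text layout that was written

writeLines(readLines(File))

### Read the file back and convert it to a 3-dimensional table

Table2 = as.table(read.ftable(File))

ftable(Table2)

### Remove the temporary file

unlink(File)
```

The text printed by writeLines can serve as a template when typing a new table by hand.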