These tests for nominal variables are used to determine if two nominal variables are associated. Sometimes the term “independent” is used to mean that there is no association.
Practically speaking, these tests make no assumptions about the distribution of the data.
For these tests of association there shouldn’t be paired values. For example, if experimental units—the things you are counting—are “students before” and “students after”, or “left hands” and “right hands”, the tests in the chapter Tests for Paired Nominal Data may be more appropriate.
Also note that these tests will not be accurate if there are “structural zeros” in the contingency table. If you were counting pregnant and non-pregnant individuals across categories like male and female, the male–pregnant cell may contain a structural zero if you assume your population cannot have pregnant males.
Low cell counts
The results of chi-square tests and G-tests can be inaccurate if statistically expected cell counts are low. A rule of thumb for 2 x 2 tables is that all statistically expected counts should be 5 or greater for chi-square tests and G-tests. For tables larger than 2 x 2, a rule of thumb is that no more than 20% of the statistically expected counts should be less than 5, and that all statistically expected counts should be at least 1.
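These rules of thumb can be checked directly from the statistically expected counts reported by the chisq.test function. The following is a minimal sketch, assuming the counts are in a hypothetical matrix named Counts.

### Minimal sketch: checking expected counts against the rules of thumb,
###  assuming a hypothetical matrix of counts named Counts.
###  suppressWarnings quiets the low-count warning from chisq.test.

Expected = suppressWarnings(chisq.test(Counts)$expected)

all(Expected >= 1)            ### all expected counts at least 1 ?

mean(Expected < 5) <= 0.20    ### no more than 20% of expected counts below 5 ?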
One approach when there are low statistically expected counts is to use exact tests, such as Fisher’s exact test, which are not bothered by low cell counts.
It is also possible to use Monte Carlo simulation.
Continuity correction
When counts in a contingency table are relatively small, it is helpful to apply a continuity correction to adjust the chi-square value and the corresponding p-value for the fact that the data are discrete and the chi-square distribution is continuous.
The chisq.test function applies Yates' continuity correction to 2 x 2 tables by default (correct=TRUE). The GTest function in the DescTools package has options for the Yates or Williams corrections.
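As a brief illustration, the following sketch shows how these corrections can be requested, using a hypothetical 2 x 2 matrix of counts.

library(DescTools)

M = matrix(c( 8, 12,
             15,  5),
           byrow=TRUE, nrow=2)     ### hypothetical counts

chisq.test(M)                      ### Yates' correction applied by default for 2 x 2

chisq.test(M, correct=FALSE)       ### no continuity correction

GTest(M, correct="williams")       ### Williams correction

GTest(M, correct="yates")          ### Yates correction (2 x 2 tables only)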
Fisher’s exact test
Technically, Fisher’s exact test operates as if the table had fixed margins. (The test statistic is conditioned on the table margins). For example, if the table has counts of males and females along with some other variable, the test takes the total number of males tested and the total number of females tested as given, as if they were determined ahead of time. However, it appears to me that this consideration is commonly ignored for Fisher’s exact test.
For 2 x 2 tables, Barnard’s test and Boschloo’s test are not conditioned on the margin totals and tend to be somewhat more powerful than Fisher’s exact test.
Choosing among tests
If expected counts are not low, either the G-test or the chi-square test is fine. An advantage of the chi-square test is that your audience may be more familiar with it. That being said, some authors recommend using Fisher’s exact test routinely unless the number of observations is so great that the analysis takes too long.
Appropriate data
• Two nominal variables with two or more levels each, often arranged into a contingency table.
• Experimental units aren’t paired.
• There are no structural zeros in the contingency table.
• G-test and chi-square test may not be appropriate if there are cells with low expected counts.
Hypotheses
• Null hypothesis: There is no association between the two variables.
• Alternative hypothesis (two-sided): There is an association between the two variables.
Interpretation
Significant results can be reported as “There was a significant association between variable A and variable B.”
Post-hoc analysis
Post hoc analysis for tests on a contingency table larger than 2 x 2 can be conducted by examining standardized residuals or by conducting tests for the component 2 x n tables. In the latter case, a correction for multiple tests could be applied.
Other notes and alternative tests
• For paired data, see the tests in the chapter Tests for Paired Nominal Data.
• For tables with 3 dimensions, the Cochran–Mantel–Haenszel test can be used, as sketched below.
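The following is a minimal sketch of the Cochran–Mantel–Haenszel test with the mantelhaen.test function in base R. The 2 x 2 x 2 array and its counts are hypothetical, shown only to illustrate the form of the input.

### Hypothetical pass/fail counts for two instructors at two sites

Array = array(c(10,  4,  5, 11,     ### Site 1 counts, filled by column
                 9,  6,  7, 10),    ### Site 2 counts
              dim = c(2, 2, 2),
              dimnames = list(Instructor = c("A", "B"),
                              Result     = c("Pass", "Fail"),
                              Site       = c("Site 1", "Site 2")))

mantelhaen.test(Array)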
Packages used in this chapter
The packages used in this chapter include:
• DescTools
• multcompView
• rcompanion
• coin
The following commands will install these packages if they are not already installed:
if(!require(DescTools)){install.packages("DescTools")}
if(!require(multcompView)){install.packages("multcompView")}
if(!require(rcompanion)){install.packages("rcompanion")}
if(!require(coin)){install.packages("coin")}
Association tests for nominal variables example
Alexander Anderson runs the pesticide safety training course in four counties. Students must pass in order to obtain their pesticide applicator’s license. He wishes to see if there is an association between the county in which the course was held and the rate of passing the test. The following are his data.
County Pass Fail
Bloom County 21 5
Cobblestone County 6 11
Dougal County 7 8
Heimlich County 27 5
Reading the data as a matrix
Matrix = as.matrix(read.table(header=TRUE, row.names=1, text="
County Pass Fail
Bloom 21 5
Cobblestone 6 11
Dougal 7 8
Heimlich 27 5
"))
Matrix
Expected cell counts
The chisq.test function can be used to identify the statistically expected counts for a contingency table.
Note in the results here that one cell has an expected count below 5, but that all expected counts are at least 1, and that cells with expected counts below 5 are less than 20% of cells. (1 / 8 cells = 13%).
Test = chisq.test(Matrix)
Test$expected
Pass Fail
Bloom 17.62222 8.377778
Cobblestone 11.52222 5.477778
Dougal 10.16667 4.833333
Heimlich 21.68889 10.311111
Statistically expected counts
Effect size
See the chapter Measures of Association for Nominal Variables for a discussion of effect size statistics and their interpretation for contingency tables of nominal variables.
library(rcompanion)
cramerV(Matrix, digits=3)
Cramer V
0.439
### Note: k = 2 for this table,
### as the minimum number of categories in either dimension is 2.
cramerV(Matrix, ci=TRUE, digits=3)
Cramer.V lower.ci upper.ci
1 0.439 0.271 0.644
### Confidence intervals by bootstrap may vary
Chi-square test of association
chisq.test(Matrix)
Pearson's Chi-squared test
X-squared = 17.32, df = 3, p-value = 0.0006072
Monte Carlo simulation
chisq.test(Matrix, simulate.p.value=TRUE, B=10000)
Pearson's Chi-squared test with simulated p-value (based on 10000 replicates)
X-squared = 17.32, df = NA, p-value = 0.0005999
### Values by Monte Carlo simulation may vary
Coin package
A chi-square test of association can also be conducted with the coin package.
library(coin)

Tabla = as.table(Matrix)

chisq_test(Tabla)
Asymptotic Pearson Chi-Squared Test
chi-squared = 17.32, df = 3, p-value = 0.0006072
Monte Carlo simulation
chisq_test(Tabla, distribution="approximate", nresample=10000)
Approximative Pearson Chi-Squared Test
chi-squared = 17.32, p-value = 5e-04
### Values by Monte Carlo simulation may vary
Post-hoc analysis
Standardized residuals
One post-hoc approach is to look at the standardized residuals from the chi-square analysis. Cells with a standardized residual whose absolute value is greater than 1.96 indicate a cell differing from the expected value. (The 1.96 cutoff is analogous to alpha = 0.05 for a hypothesis test, and 2.58 for alpha = 0.01.)
Here, Cobblestone and Heimlich Counties have standardized residuals with absolute values greater than 1.96. Looking at the sign of the residuals and the original values, it’s clear that Heimlich County has a high pass rate and Cobblestone County has a high fail rate.
chisq.test(Matrix)$stdres
Pass Fail
Bloom 1.680948 -1.680948
Cobblestone -3.182203 3.182203
Dougal -1.916576 1.916576
Heimlich 2.502628 -2.502628
Optional: adding asterisks to indicate significance of standardized residuals
It will take a little bit of coding, but adding asterisks to indicate the level of significance of the standardized residuals may be helpful. By convention, a single asterisk is used to indicate a p-value < 0.05, a double asterisk is used to indicate p-value < 0.01, and so on.
StRes = chisq.test(Matrix)$stdres
StRes2 = matrix(" ", nrow=nrow(StRes), ncol=ncol(StRes),
dimnames=dimnames(StRes))
StRes2[abs(StRes) >= qnorm(1 - 0.10/2)] = "."
StRes2[abs(StRes) >= qnorm(1 - 0.05/2)] = "*"
StRes2[abs(StRes) >= qnorm(1 - 0.01/2)] = "**"
StRes2[abs(StRes) >= qnorm(1 - 0.001/2)] = "***"
StRes2[abs(StRes) >= qnorm(1 - 0.0001/2)] = "****"
StRes2
Pass Fail
Bloom "." "."
Cobblestone "**" "**"
Dougal "." "."
Heimlich "*" "*"
Pairwise tests
Another post-hoc approach is to examine component contingency tables in a pairwise manner. In this case, we can compare the results of each county to each other county, analyzing the 2 x 2 table for each pair of counties.
### Order matrix
Matrix2 = Matrix[(c("Heimlich", "Bloom",
"Dougal", "Cobblestone")),]
Matrix2
### Pairwise tests of association
library(rcompanion)
CT = pairwiseNominalIndependence(Matrix2,
compare = "row",
fisher = FALSE,
gtest = FALSE,
chisq = TRUE,
method = "fdr", # see ?p.adjust for options
digits = 3)
CT
Comparison p.Chisq p.adj.Chisq
1 Heimlich : Bloom 0.99000 0.99000
2 Heimlich : Dougal 0.01910 0.03820
3 Heimlich : Cobblestone 0.00154 0.00924
4 Bloom : Dougal 0.05590 0.08380
5 Bloom : Cobblestone 0.00707 0.02120
6 Dougal : Cobblestone 0.77000 0.92400
### Compact letter display
library(rcompanion)
cldList(p.adj.Chisq ~ Comparison, data=CT)
Group Letter MonoLetter
1 Heimlich a a
2 Bloom ab ab
3 Dougal bc bc
4 Cobblestone c c
The table of adjusted p-values can be summarized to a table of letters indicating which treatments are not significantly different.
County Percent passing Letter
Heimlich County 84% a
Bloom County 81 ab
Dougal County 47 bc
Cobblestone County 35 c
Counties sharing a letter are not significantly different by chi-square test of
association, with p-values adjusted by FDR method for multiple comparisons
(Benjamini–Hochberg false discovery rate).
Monte Carlo simulation
The pairwise chi-square tests could also be conducted with Monte Carlo simulation.
library(rcompanion)
pairwiseNominalIndependence(Matrix2,
compare = "row",
fisher = FALSE,
gtest = FALSE,
chisq = TRUE,
simulate.p.value=TRUE, B=10000,
method = "fdr",
digits = 3)
Fisher’s exact test of association
fisher.test(Matrix)
Fisher's Exact Test for Count Data
p-value = 0.000668
Monte Carlo simulation
fisher.test(Matrix, simulate.p.value=TRUE, B=10000)
Fisher's Exact Test for Count Data with simulated p-value
(based on 10000 replicates)
p-value = 0.0009999
### Values by Monte Carlo simulation may vary
Post-hoc analysis
Post-hoc analysis can be conducted with pairwise Fisher’s exact tests. In this case, each county is compared to each other county.
### Order matrix
Matrix2 = Matrix[(c("Heimlich", "Bloom",
"Dougal", "Cobblestone")),]
Matrix2
### Pairwise tests of association
library(rcompanion)
FT = pairwiseNominalIndependence(Matrix2,
compare = "row",
fisher = TRUE,
gtest = FALSE,
chisq = FALSE,
method = "fdr", # see ?p.adjust for options
digits = 3)
FT
Comparison p.Fisher p.adj.Fisher
1 Heimlich : Bloom 0.740000 0.74000
2 Heimlich : Dougal 0.013100 0.02620
3 Heimlich : Cobblestone 0.000994 0.00596
4 Bloom : Dougal 0.037600 0.05640
5 Bloom : Cobblestone 0.003960 0.01190
6 Dougal : Cobblestone 0.720000 0.74000
### Compact letter display
library(rcompanion)
cldList(p.adj.Fisher ~ Comparison,
data = FT,
threshold = 0.05)
Group Letter MonoLetter
1 Heimlich a a
2 Bloom ab ab
3 Dougal bc bc
4 Cobblestone c c
The table of adjusted p-values can be summarized to a table of letters indicating which treatments are not significantly different.
County Percent passing Letter
Heimlich County 84% a
Bloom County 81 ab
Dougal County 47 bc
Cobblestone County 35 c
Counties sharing a letter are not significantly different by Fisher exact test,
with p-values adjusted by FDR method for multiple comparisons
(Benjamini–Hochberg false discovery rate).
G-test of association
library(DescTools)
GTest(Matrix)
Log likelihood ratio (G-test) test of independence without correction
G = 17.14, X-squared df = 3, p-value = 0.0006615
Expected cell counts
The GTest function output includes the statistically expected counts for a contingency table.
One cell has an expected count below 5, but all expected counts are at least 1, and cells with expected counts below 5 make up less than 20% of cells (1 / 8 cells = 13%).
library(DescTools)
Test = GTest(Matrix)
Test$expected
Pass Fail
Bloom 17.62222 8.377778
Cobblestone 11.52222 5.477778
Dougal 10.16667 4.833333
Heimlich 21.68889 10.311111
Statistically expected counts
Post-hoc analysis
A post-hoc analysis can be conducted with pairwise G-tests. In this case, each county is compared to each other county.
### Order matrix
Matrix2 = Matrix[(c("Heimlich", "Bloom",
"Dougal", "Cobblestone")),]
Matrix2
### Pairwise tests of association
GT = pairwiseNominalIndependence(Matrix2,
compare = "row",
fisher = FALSE,
gtest = TRUE,
chisq = FALSE,
method = "fdr", # see ?p.adjust for options
digits = 3)
GT
Comparison p.Gtest p.adj.Gtest
1 Heimlich : Bloom 0.718000 0.71800
2 Heimlich : Dougal 0.008300 0.01660
3 Heimlich : Cobblestone 0.000506 0.00304
4 Bloom : Dougal 0.024800 0.03720
5 Bloom : Cobblestone 0.002380 0.00714
6 Dougal : Cobblestone 0.513000 0.61600
### Compact letter display
library(rcompanion)
cldList(p.adj.Gtest ~ Comparison,
data = GT,
threshold = 0.05)
Group Letter MonoLetter
1 Heimlich a a
2 Bloom a a
3 Dougal b b
4 Cobblestone b b
The table of adjusted p-values can be summarized to a table of letters indicating which treatments are not significantly different.
County Percent passing Letter
Heimlich County 84% a
Bloom County 81 a
Dougal County 47 b
Cobblestone County 35 b
Counties sharing a letter are not significantly different by G-test for
association, with p-values adjusted by FDR method for multiple comparisons
(Benjamini–Hochberg false discovery rate).
Optional: Exact tests for 2 x 2 tables
In addition to Fisher’s exact test, there are exact tests developed specifically for 2 x 2 tables. Barnard’s test and Boschloo’s test are not conditioned on the table margin totals as Fisher’s exact test is. In general, these tests are slightly more powerful than Fisher’s exact test. However, to my knowledge, there are no implementations for tables larger than 2 x 2.
TwoByTwo = as.matrix(read.table(header=TRUE, row.names=1, text="
Presenter Pass Fail
'Walter Dornez' 37 23
'Mina Harker' 25 35
"))
TwoByTwo
Pass Fail
Walter Dornez 37 23
Mina Harker 25 35
Effect size
phi
library(rcompanion)
phi(TwoByTwo)
phi
0.2
phi(TwoByTwo, ci=TRUE)
phi lower.ci upper.ci
1 0.2 0.0292 0.376
Odds ratio
OddsRatio = (37/23)/(25/35)
OddsRatio
[1] 2.252174
library(DescTools)
OddsRatio(TwoByTwo, conf.level=0.95)
odds ratio lwr.ci upr.ci
2.252174 1.084335 4.677787
Fisher’s exact test
For 2 x 2 tables, the fisher.test function reports an odds ratio statistic with a confidence interval. This is a conditional maximum likelihood estimate, so it will not be exactly the same as the simple odds ratio computed above.
fisher.test(TwoByTwo)
Fisher's Exact Test for Count Data
p-value = 0.04405
95 percent confidence interval:
1.019688 4.995417
sample estimates:
odds ratio
2.236572
Coin package
library(coin)
TwoTable = as.table(TwoByTwo)
chisq_test(TwoTable, distribution="exact")
Exact Pearson Chi-Squared Test
chi-squared = 4.8053, p-value = 0.04405
Barnard’s test
Barnard’s test with the default “csm” method may take a long time to compute. On my current laptop this example took 42 seconds to execute.
library(DescTools)
BarnardTest(TwoByTwo)
CSM Exact Test
test statistic = NA, first sample size = 60, second sample size = 60,
p-value = 0.03534
Boschloo’s test
BarnardTest(TwoByTwo, method="boschloo")
Boschloo's Exact Test
test statistic = 0.044045, first sample size = 60, second sample size = 60,
p-value = 0.03342
Santner and Snell's test
BarnardTest(TwoByTwo, method="santner and snell")
Santner and Snell's Exact Test
test statistic = 0.2, first sample size = 60, second sample size = 60,
p-value = 0.03532
Optional: z-test for two proportions
Students often learn about the z-test for two proportions in introductory analysis of experiments courses. R doesn’t have a dedicated function for this test. However, it’s equivalent to the chi-square test of association for a 2 x 2 table. Note here that the correct=FALSE option is used in both the chisq.test function and the prop.test function so that the results match those of the z-test for two proportions.
In this example, we’ll use the following contingency table of counts.
Matrix = matrix(c(21,16,43,48), byrow=TRUE, nrow=2)
Matrix
[,1] [,2]
[1,] 21 16
[2,] 43 48
colSums(Matrix)
# [1] 64 64
So, this corresponds to two proportions: 21 / 64 and 16 / 64.
Chi-square test for two proportions
prop.test(c(21, 16), n = c(64, 64), correct = FALSE)
2-sample test for equality of proportions without continuity correction
X-squared = 0.9504, df = 1, p-value = 0.3296
Chi-square test of association
Matrix = matrix(c(21,16,43,48), byrow=TRUE, nrow=2)
Matrix
chisq.test(Matrix, correct=FALSE)
Pearson's Chi-squared test
X-squared = 0.9504, df = 1, p-value = 0.3296
Z-test for two proportions
X1 = 21
X2 = 16
N1 = 64
N2 = 64
P1 = X1 / N1
P2 = X2 / N2
P = (X1 + X2) / (N1 + N2)
Q = 1 - P
Z = (P1 - P2) / sqrt(P * Q * ( (1/N1) + (1/N2) ) )
Z
0.9748851
Z value
Z ^ 2
0.950401
Note, same as chi-square value
2 * pnorm(abs(Z), lower.tail=FALSE)
0.3296173
p-value, same as in chi-square test
Optional readings
“Small numbers in chi-square and G–tests” in McDonald, J.H. 2014. Handbook of Biological Statistics. www.biostathandbook.com/small.html.
References
“Fisher’s Exact Test of Independence” in Mangiafico, S.S. 2015a. An R Companion for the Handbook of Biological Statistics, version 1.09. rcompanion.org/rcompanion/b_07.html.
“G–test of Independence” in Mangiafico, S.S. 2015b. An R Companion for the Handbook of Biological Statistics, version 1.09. rcompanion.org/rcompanion/b_06.html.
“Chi-square Test of Independence” in Mangiafico, S.S. 2015c. An R Companion for the Handbook of Biological Statistics, version 1.09. rcompanion.org/rcompanion/b_05.html.
“Small Numbers in Chi-square and G–tests” in Mangiafico, S.S. 2015d. An R Companion for the Handbook of Biological Statistics, version 1.09. rcompanion.org/rcompanion/b_08.html.
Exercises M
1. Considering Alexander Anderson’s data,
a. Numerically, which county had the greatest percentage of passing students?
b. Numerically, which county had the greatest number of total students?
c. Was there an association between county and student success? Report the test used, why you chose this test, the p-value, and the conclusion.
d. What was the effect size? What statistic was used? What is the interpretation of this value?
e. Statistically, which counties performed the best? Be sure to indicate which post-hoc test you are using.
f. Statistically, which performed the worst?
g. Plot the data in an appropriate way, and submit the plot.
h. Practically speaking, what are your conclusions? Consider all the information above that is relevant, and include descriptive statistics and observations about your plot that support your conclusions.
2. Ryuk and Rem held a workshop on planting habitat for pollinators like bees and butterflies. They wish to know if there is an association between the profession of the attendees and their willingness to undertake a conservation planting. The following are the data.
Will plant?
Profession Yes No
Homeowner 13 14
Landscaper 27 6
Farmer 7 19
NGO 6 6
For each of the following, answer the question, and show the output from the analyses you used to answer the question.
a. Numerically, which profession had the greatest percentage of answering yes?
b. Numerically, which profession had the greatest number of total attendees?
c. Was there an association between profession and willingness to undertake a conservation planting? Report the test used, why you chose this test, the p-value, and the conclusion.
d. What was the effect size? What statistic was used? What is the interpretation of this value?
e. Statistically, which professions were most willing? Be sure to indicate which post-hoc test you are using.
f. Statistically, which were least willing?
g. Plot the data in an appropriate way, and submit the plot.
h. Practically speaking, what are your conclusions? Consider all the information above that is relevant, and include descriptive statistics and observations about your plot that support your conclusions.