The Friedman test determines if there are differences among groups for two-way data structured in a specific way, namely in an unreplicated complete block design. In this design, one variable serves as the treatment or group variable, and another variable serves as the blocking variable. It is the differences among treatments or groups that we are interested in. We aren’t necessarily interested in differences among blocks, but we want our statistics to take into account differences in the blocks. In the unreplicated complete block design, each block has one and only one observation of each treatment.
For an example of this structure, look at the Belcher family data below. Rater is considered the blocking variable, and each rater has one observation for each Instructor. The test will determine if there are differences among values for Instructor, taking into account any consistent effect of a Rater. For example, if Rater a rated consistently low and Rater g rated consistently high, the Friedman test can account for this statistically.
In other cases, the blocking variable might be the class where the ratings were done or the school where the ratings were done. If you were testing differences among curricula or other teaching treatments with different instructors, different instructors might be used as blocks.
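The point about consistent rater effects can be illustrated with a small sketch in base R. The ratings matrix and the per-rater shifts below are hypothetical values made up for illustration; they are not the Belcher data. Because the Friedman test ranks observations within each block, adding a constant to every score from one rater should leave the test statistic unchanged.

```r
# Hypothetical ratings: 6 raters (rows = blocks) by 3 treatments (columns)
ratings <- matrix(c(3, 5, 7,
                    2, 6, 5,
                    4, 5, 8,
                    1, 3, 6,
                    5, 4, 9,
                    2, 7, 8),
                  ncol = 3, byrow = TRUE,
                  dimnames = list(paste0("rater", 1:6),
                                  c("T1", "T2", "T3")))

# Hypothetical shifts: some raters score consistently low, others high
rater_shift <- c(-3, 0, 2, 5, -1, 4)

# The vector is recycled down the columns, so row i is shifted by rater_shift[i]
shifted <- ratings + rater_shift

# friedman.test() treats rows as blocks and columns as groups for a matrix
stat_original <- unname(friedman.test(ratings)$statistic)
stat_shifted  <- unname(friedman.test(shifted)$statistic)

c(original = stat_original, shifted = stat_shifted)
```

The two statistics agree because the within-rater ordering of the treatments is untouched by the shift.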
Some people critique the Friedman test for having low power in detecting differences among groups. It has been suggested, however, that the Friedman test may be powerful when there are five or more groups.
In general, you may want to choose a more powerful test. For an ordinal dependent variable, ordinal regression can be used, with the blocking variable included as a random effect in the model. For a continuous dependent variable, the Quade test is an option, or aligned ranks transformation anova (ART anova) can be used, again with the blocking variable included as a random effect.
Post-hoc tests
The outcome of the Friedman test tells you if there are differences among the groups, but doesn’t tell you which groups are different from other groups. In order to determine which groups are different from others, post-hoc testing can be conducted. Several are presented here.
Appropriate data
• Two-way data arranged in an unreplicated complete block design
• Dependent variable is ordinal, interval, or ratio
• Treatment or group independent variable is a factor with two or more levels (that is, two or more groups)
• Blocking variable is a factor with two or more levels
• Blocks are independent of each other and have no interaction with treatments
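The first requirement in the list above can be checked before running the test. The following is a minimal sketch in base R, using a hypothetical long-format data frame (the names design, Treatment, Block, and Response are made up for illustration). In an unreplicated complete block design, every treatment-by-block cell holds exactly one observation.

```r
# Hypothetical long-format data: 3 treatments x 4 blocks, one row per cell
design <- expand.grid(Treatment = c("A", "B", "C"),
                      Block     = c("b1", "b2", "b3", "b4"))
design$Response <- c(4, 6, 5, 3, 7, 6, 5, 8, 7, 2, 5, 4)

# Count observations in each treatment-by-block cell
cell_counts <- table(design$Treatment, design$Block)
cell_counts

# TRUE only if every cell has exactly one observation
is_complete <- all(cell_counts == 1)

# Dropping one row leaves an empty cell, so the check fails
incomplete <- design[-1, ]
is_complete_after_drop <- all(table(incomplete$Treatment,
                                    incomplete$Block) == 1)

c(full = is_complete, one.row.dropped = is_complete_after_drop)
```

A count of 0 indicates a missing cell (an incomplete design), and a count above 1 indicates replication within a block.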
Hypotheses
• Null hypothesis: The distributions of values for each group are equal.
• Alternative hypothesis (two-sided): There is a systematic difference in the distribution of values for the groups.
Interpretation
Significant results can be reported as “There was a significant difference in values among groups.”
Other notes and alternative tests
The Quade test is used for the same kinds of data and hypotheses, but can be more powerful in some cases. It has been suggested that the Friedman test may be preferable when there is a larger number of groups (five or more), while the Quade test is preferable for fewer groups. The Quade test is described in the next chapter.
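As a brief sketch of how the two tests can be run side by side, the base R stats package provides both friedman.test and quade.test, and both accept a matrix with blocks in rows and groups in columns. The ratings matrix below is hypothetical and only serves to show the calls; it says nothing about which test is more powerful in general.

```r
# Hypothetical data: 6 blocks (rows) by 3 groups (columns)
ratings <- matrix(c(3, 5, 7,
                    2, 6, 5,
                    4, 5, 8,
                    1, 3, 6,
                    5, 4, 9,
                    2, 7, 8),
                  ncol = 3, byrow = TRUE)

friedman_result <- friedman.test(ratings)
quade_result    <- quade.test(ratings)

# Compare the p-values from the two tests on the same data
c(Friedman = friedman_result$p.value,
  Quade    = quade_result$p.value)
```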
Cumulative link models for ordinal data (ordinal regression) are appropriate when the dependent variable is ordinal. Otherwise, aligned ranks transformation anova may be appropriate. Either of these approaches allows for more flexibility in design than the Friedman or Quade tests.
If the unreplicated block design is partially incomplete, the Skillings–Mack test can be used.
Packages used in this chapter
The packages used in this chapter include:
• psych
• FSA
• lattice
• coin
• PMCMRplus
• rcompanion
• DescTools
The following commands will install these packages if they are not already installed:
if(!require(psych)){install.packages("psych")}
if(!require(FSA)){install.packages("FSA")}
if(!require(lattice)){install.packages("lattice")}
if(!require(coin)){install.packages("coin")}
if(!require(PMCMRplus)){install.packages("PMCMRplus")}
if(!require(rcompanion)){install.packages("rcompanion")}
if(!require(DescTools)){install.packages("DescTools")}
Friedman test example
Data = read.table(header=TRUE, stringsAsFactors=TRUE, text="
Instructor Rater Likert
'Bob Belcher' a 4
'Bob Belcher' b 5
'Bob Belcher' c 4
'Bob Belcher' d 6
'Bob Belcher' e 6
'Bob Belcher' f 6
'Bob Belcher' g 10
'Bob Belcher' h 6
'Linda Belcher' a 8
'Linda Belcher' b 6
'Linda Belcher' c 8
'Linda Belcher' d 8
'Linda Belcher' e 8
'Linda Belcher' f 7
'Linda Belcher' g 10
'Linda Belcher' h 9
'Tina Belcher' a 7
'Tina Belcher' b 5
'Tina Belcher' c 7
'Tina Belcher' d 8
'Tina Belcher' e 8
'Tina Belcher' f 9
'Tina Belcher' g 10
'Tina Belcher' h 9
'Gene Belcher' a 6
'Gene Belcher' b 4
'Gene Belcher' c 5
'Gene Belcher' d 5
'Gene Belcher' e 6
'Gene Belcher' f 6
'Gene Belcher' g 5
'Gene Belcher' h 5
'Louise Belcher' a 8
'Louise Belcher' b 7
'Louise Belcher' c 8
'Louise Belcher' d 8
'Louise Belcher' e 9
'Louise Belcher' f 9
'Louise Belcher' g 8
'Louise Belcher' h 10
")
### Order levels of the factor; otherwise R will alphabetize them
Data$Instructor = factor(Data$Instructor,
levels=unique(Data$Instructor))
### Create a new variable which is the Likert scores as an ordered factor
Data$Likert.f = factor(Data$Likert,
ordered=TRUE)
### Check the data frame
library(psych)
headTail(Data)
str(Data)
summary(Data)
Summarize data treating Likert scores as factors
xtabs( ~ Instructor + Likert.f,
data = Data)
Likert.f
Instructor 4 5 6 7 8 9 10
Bob Belcher 2 1 4 0 0 0 1
Linda Belcher 0 0 1 1 4 1 1
Tina Belcher 0 1 0 2 2 2 1
Gene Belcher 1 4 3 0 0 0 0
Louise Belcher 0 0 0 1 4 2 1
XT = xtabs( ~ Instructor + Likert.f,
data = Data)
prop.table(XT,
margin = 1)
Likert.f
Instructor 4 5 6 7 8 9 10
Bob Belcher 0.250 0.125 0.500 0.000 0.000 0.000 0.125
Linda Belcher 0.000 0.000 0.125 0.125 0.500 0.125 0.125
Tina Belcher 0.000 0.125 0.000 0.250 0.250 0.250 0.125
Gene Belcher 0.125 0.500 0.375 0.000 0.000 0.000 0.000
Louise Belcher 0.000 0.000 0.000 0.125 0.500 0.250 0.125
Bar plots by group
Note that the bar plots don’t show the effect of the blocking variable.
library(lattice)
histogram(~ Likert.f | Instructor,
data=Data,
layout=c(1,5),
col="darkgray")
#### (1,5) indicates the columns and rows for the plots
Summarize data treating Likert scores as numeric
library(FSA)
Summarize(Likert ~ Instructor,
data=Data,
digits=3)
Instructor n mean sd min Q1 median Q3 max percZero
1 Bob Belcher 8 5.875 1.885 4 4.75 6 6.00 10 0
2 Linda Belcher 8 8.000 1.195 6 7.75 8 8.25 10 0
3 Tina Belcher 8 7.875 1.553 5 7.00 8 9.00 10 0
4 Gene Belcher 8 5.250 0.707 4 5.00 5 6.00 6 0
5 Louise Belcher 8 8.375 0.916 7 8.00 8 9.00 10 0
Friedman test example
This example uses the formula notation indicating that Likert is the dependent variable, Instructor is the independent variable, and Rater is the blocking variable. The data= option indicates the data frame that contains the variables. For the meaning of other options, see ?friedman.test or documentation for other employed functions.
friedman.test(Likert ~ Instructor | Rater,
data = Data)
Friedman rank sum test
Friedman chi-squared = 23.139, df = 4, p-value = 0.0001188
library(coin)
friedman_test(Likert ~ Instructor | Rater,
data = Data)
Asymptotic Friedman Test
chi-squared = 23.139, df = 4, p-value = 0.0001188
library(PMCMRplus)
friedmanTest(y = Data$Likert,
groups = Data$Instructor,
blocks = Data$Rater)
Friedman rank sum test
Friedman chi-squared = 23.139, df = 4, p-value = 0.0001188
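To see what the statistic reported above measures, the following sketch computes a Friedman chi-squared by hand in base R and compares it with friedman.test. The ratings matrix is hypothetical and was chosen to have no ties within any block, because the simple formula used here omits the tie correction that friedman.test applies (the Belcher data contain ties, so this formula would not reproduce 23.139).

```r
# Hypothetical data with no ties within blocks:
# rows are blocks, columns are groups
ratings <- matrix(c(3, 5, 7,
                    2, 6, 5,
                    4, 5, 8,
                    1, 3, 6,
                    5, 4, 9,
                    2, 7, 8),
                  ncol = 3, byrow = TRUE)

b <- nrow(ratings)   # number of blocks
k <- ncol(ratings)   # number of groups

# Rank the observations within each block, then sum ranks for each group
within_block_ranks <- t(apply(ratings, 1, rank))
rank_sums <- colSums(within_block_ranks)

# Friedman statistic without tie correction
chi_squared <- 12 / (b * k * (k + 1)) * sum(rank_sums^2) - 3 * b * (k + 1)

# Asymptotic p-value from the chi-squared distribution with k - 1 df
p_value <- pchisq(chi_squared, df = k - 1, lower.tail = FALSE)

# Compare with the built-in function
builtin <- friedman.test(ratings)
c(by.hand = chi_squared, builtin = unname(builtin$statistic))
```

Groups whose rank sums differ widely from one another push the statistic up; if every group had the same rank sum, the statistic would be zero.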
Effect size
Kendall’s W, or Kendall’s coefficient of concordance, can be used as an effect size statistic for Friedman’s test.
The following interpretations are based on personal intuition. They are not intended to be universal.
                       small     medium           large
Kendall’s W   k = 3    < 0.10    0.10 – < 0.30    ≥ 0.30
              k = 5    < 0.10    0.10 – < 0.25    ≥ 0.25
              k = 7    < 0.10    0.10 – < 0.20    ≥ 0.20
              k = 9    < 0.10    0.10 – < 0.20    ≥ 0.20
XT = xtabs(Likert ~ Instructor + Rater,
data = Data)
XT
Instructor a b c d e f g h
Bob Belcher 4 5 4 6 6 6 10 6
Linda Belcher 8 6 8 8 8 7 10 9
Tina Belcher 7 5 7 8 8 9 10 9
Gene Belcher 6 4 5 5 6 6 5 5
Louise Belcher 8 7 8 8 9 9 8 10
For the KendallW function, groups must be in rows, and raters must be in columns.
library(DescTools)
KendallW(XT,
correct=TRUE,
test=TRUE)
Kendall's coefficient of concordance Wt
Kendall chi-squared = 23.139, df = 4, subjects = 5, raters = 8,
p-value = 0.0001188
sample estimates:
Wt
0.7230903
In the output above, check that the correct number of groups and raters is listed under "subjects" and "raters", respectively.
library(rcompanion)
kendallW(XT, correct=TRUE)
W
0.723
kendallW(XT, correct=TRUE, ci=TRUE)
W lower.ci upper.ci
1 0.723 0.547 0.917
### Confidence intervals by bootstrap may vary
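The KendallW output above reports the same chi-squared value as the Friedman test, which reflects the relationship W = chi-squared / (b × (k − 1)), where b is the number of blocks and k is the number of groups. The sketch below re-enters the Likert scores from the XT table as a matrix (raters in rows, instructors in columns; the object name likert is introduced here for illustration) and recovers W from the base R Friedman statistic.

```r
# Likert scores from the XT table, transposed so that
# rows are raters (blocks) and columns are instructors (groups)
likert <- matrix(c( 4,  8,  7, 6,  8,
                    5,  6,  5, 4,  7,
                    4,  8,  7, 5,  8,
                    6,  8,  8, 5,  8,
                    6,  8,  8, 6,  9,
                    6,  7,  9, 6,  9,
                   10, 10, 10, 5,  8,
                    6,  9,  9, 5, 10),
                 ncol = 5, byrow = TRUE,
                 dimnames = list(letters[1:8],
                                 c("Bob", "Linda", "Tina", "Gene", "Louise")))

b <- nrow(likert)   # 8 raters
k <- ncol(likert)   # 5 instructors

# Tie-corrected Friedman statistic from base R
chi_squared <- unname(friedman.test(likert)$statistic)

# Kendall's W as a rescaling of the Friedman statistic
W <- chi_squared / (b * (k - 1))
round(W, 3)
```

Because W is the chi-squared statistic divided by its maximum possible value, it falls between 0 and 1 regardless of the number of blocks or groups.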
Post-hoc tests
Conover test
### Order groups by median
Data$Instructor = factor(Data$Instructor,
levels = c("Linda Belcher", "Louise Belcher",
"Tina Belcher", "Bob Belcher",
"Gene Belcher"))
library(PMCMRplus)
CT = frdAllPairsConoverTest(y = Data$Likert,
groups = Data$Instructor,
blocks = Data$Rater,
p.adjust.method="single-step")
CT
Pairwise comparisons using Conover's all-pairs test for a two-way balanced
complete block design
Linda Belcher Louise Belcher Tina Belcher Bob Belcher
Louise Belcher 0.9794 - - -
Tina Belcher 0.9884 0.8278 - -
Bob Belcher 0.0853 0.0169 0.2490 -
Gene Belcher 0.0099 0.0012 0.0447 0.9489
P value adjustment method: single-step
library(rcompanion)
CTT = PMCMRTable(CT)
CTT
Comparison p.value
1 Louise Belcher - Linda Belcher = 0 0.979
2 Tina Belcher - Linda Belcher = 0 0.988
3 Bob Belcher - Linda Belcher = 0 0.0853
4 Gene Belcher - Linda Belcher = 0 0.00993
5 Tina Belcher - Louise Belcher = 0 0.828
6 Bob Belcher - Louise Belcher = 0 0.0169
7 Gene Belcher - Louise Belcher = 0 0.00123
8 Bob Belcher - Tina Belcher = 0 0.249
9 Gene Belcher - Tina Belcher = 0 0.0447
10 Gene Belcher - Bob Belcher = 0 0.949
library(rcompanion)
cldList(p.value ~ Comparison, data = CTT)
Group Letter MonoLetter
1 LouiseBelcher a a
2 TinaBelcher ab ab
3 BobBelcher bc bc
4 GeneBelcher c c
5 LindaBelcher ab ab
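In a compact letter display such as the one above, two groups that share at least one letter are not significantly different, and groups with no letter in common are. The sketch below checks this reading against the Conover p-values, using values copied from the two tables above and a significance level of 0.05. The helper share_letter and the shortened group names are introduced here for illustration only.

```r
# Letters copied from the cldList output above
cld <- c(Linda = "ab", Louise = "a", Tina = "ab", Bob = "bc", Gene = "c")

# Pairwise p-values copied from the PMCMRTable output above
pairs <- data.frame(
  g1 = c("Louise", "Tina", "Bob", "Gene", "Tina",
         "Bob", "Gene", "Bob", "Gene", "Gene"),
  g2 = c("Linda", "Linda", "Linda", "Linda", "Louise",
         "Louise", "Louise", "Tina", "Tina", "Bob"),
  p  = c(0.979, 0.988, 0.0853, 0.00993, 0.828,
         0.0169, 0.00123, 0.249, 0.0447, 0.949),
  stringsAsFactors = FALSE)

# TRUE if two letter strings have at least one letter in common
share_letter <- function(a, b) {
  length(intersect(strsplit(a, "")[[1]], strsplit(b, "")[[1]])) > 0
}

pairs$share.letter <- mapply(
  function(x, y) share_letter(cld[[x]], cld[[y]]),
  pairs$g1, pairs$g2, USE.NAMES = FALSE)

pairs$not.signif <- pairs$p >= 0.05

pairs
```

For every pair, the share.letter and not.signif columns agree, which is the property the letter display is built to satisfy.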
Exact test
library(PMCMRplus)
ET = frdAllPairsExactTest(y = Data$Likert,
groups = Data$Instructor,
blocks = Data$Rater,
p.adjust.method="fdr")
ET
Pairwise comparisons using Eisinga, Heskes, Pelzer & Te Grotenhuis
all-pairs test with exact p-values for a two-way balanced complete block design
data: y, groups and blocks
Linda Belcher Louise Belcher Tina Belcher Bob Belcher
Louise Belcher 0.65081 - - -
Tina Belcher 0.69729 0.44188 - -
Bob Belcher 0.02456 0.00768 0.07761 -
Gene Belcher 0.00601 0.00047 0.01833 0.60368
P value adjustment method: fdr
library(rcompanion)
ETT = PMCMRTable(ET)
ETT
Comparison p.value
1 Louise Belcher - Linda Belcher = 0 0.651
2 Tina Belcher - Linda Belcher = 0 0.697
3 Bob Belcher - Linda Belcher = 0 0.0246
4 Gene Belcher - Linda Belcher = 0 0.00601
5 Tina Belcher - Louise Belcher = 0 0.442
6 Bob Belcher - Louise Belcher = 0 0.00768
7 Gene Belcher - Louise Belcher = 0 0.000467
8 Bob Belcher - Tina Belcher = 0 0.0776
9 Gene Belcher - Tina Belcher = 0 0.0183
10 Gene Belcher - Bob Belcher = 0 0.604
library(rcompanion)
cldList(p.value ~ Comparison, data = ETT)
Group Letter MonoLetter
1 LouiseBelcher a a
2 TinaBelcher ab ab
3 BobBelcher bc bc
4 GeneBelcher c c
5 LindaBelcher a a
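The exact test above was run with p.adjust.method="fdr". In base R, "fdr" is the Benjamini–Hochberg adjustment in the p.adjust function. The sketch below applies it, along with two other base R methods, to a set of hypothetical unadjusted p-values to show how the choice of method changes the results. It assumes that PMCMRplus hands the method name to p.adjust, as its documentation indicates; the numbers are not taken from the example above.

```r
# Hypothetical unadjusted p-values from five pairwise comparisons
raw_p <- c(0.001, 0.004, 0.012, 0.030, 0.200)

adjusted <- data.frame(
  raw        = raw_p,
  fdr        = p.adjust(raw_p, method = "fdr"),         # Benjamini-Hochberg
  holm       = p.adjust(raw_p, method = "holm"),
  bonferroni = p.adjust(raw_p, method = "bonferroni"))

adjusted
```

The fdr method controls the false discovery rate and is the least conservative of the three here, while the Holm and Bonferroni methods control the familywise error rate.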
Nemenyi test
library(PMCMRplus)
NT = frdAllPairsNemenyiTest(Likert ~ Instructor | Rater, data = Data)
NT
Pairwise comparisons using Nemenyi-Wilcoxon-Wilcox all-pairs test for a two-way
balanced complete block design
Linda Belcher Louise Belcher Tina Belcher Bob Belcher
Louise Belcher 0.9816 - - -
Tina Belcher 0.9897 0.8426 - -
Bob Belcher 0.1021 0.0224 0.2775 -
Gene Belcher 0.0136 0.0019 0.0557 0.9540
P value adjustment method: single-step
library(rcompanion)
NTT = PMCMRTable(NT)
NTT
Comparison p.value
1 Louise Belcher - Linda Belcher = 0 0.982
2 Tina Belcher - Linda Belcher = 0 0.99
3 Bob Belcher - Linda Belcher = 0 0.102
4 Gene Belcher - Linda Belcher = 0 0.0136
5 Tina Belcher - Louise Belcher = 0 0.843
6 Bob Belcher - Louise Belcher = 0 0.0224
7 Gene Belcher - Louise Belcher = 0 0.00189
8 Bob Belcher - Tina Belcher = 0 0.278
9 Gene Belcher - Tina Belcher = 0 0.0557
10 Gene Belcher - Bob Belcher = 0 0.954
library(rcompanion)
cldList(p.value ~ Comparison, data = NTT)
Group Letter MonoLetter
1 LouiseBelcher a a
2 TinaBelcher abc abc
3 BobBelcher bc bc
4 GeneBelcher b b
5 LindaBelcher ac a c
Siegel test
library(PMCMRplus)
ST = frdAllPairsSiegelTest(y = Data$Likert,
groups = Data$Instructor,
blocks = Data$Rater,
p.adjust.method="fdr")
ST
Pairwise comparisons using Siegel-Castellan all-pairs test for a two-way
balanced complete block design
Linda Belcher Louise Belcher Tina Belcher Bob Belcher
Louise Belcher 0.6353 - - -
Tina Belcher 0.6353 0.4344 - -
Bob Belcher 0.0285 0.0089 0.0802 -
Gene Belcher 0.0078 0.0020 0.0180 0.5960
P value adjustment method: fdr
library(rcompanion)
STT = PMCMRTable(ST)
STT
Comparison p.value
1 Louise Belcher - Linda Belcher = 0 0.635
2 Tina Belcher - Linda Belcher = 0 0.635
3 Bob Belcher - Linda Belcher = 0 0.0285
4 Gene Belcher - Linda Belcher = 0 0.00783
5 Tina Belcher - Louise Belcher = 0 0.434
6 Bob Belcher - Louise Belcher = 0 0.00888
7 Gene Belcher - Louise Belcher = 0 0.00203
8 Bob Belcher - Tina Belcher = 0 0.0802
9 Gene Belcher - Tina Belcher = 0 0.018
10 Gene Belcher - Bob Belcher = 0 0.596
library(rcompanion)
cldList(p.value ~ Comparison, data = STT)
Group Letter MonoLetter
1 LouiseBelcher a a
2 TinaBelcher ab ab
3 BobBelcher bc bc
4 GeneBelcher c c
5 LindaBelcher a a
Example from Conover
This example is taken from the Friedman test section of Conover (1999).
Conover1 = read.table(header=TRUE, stringsAsFactors=TRUE, text="
Homeowner Grass1 Grass2 Grass3 Grass4
1 4 3 2 1
2 4 2 3 1
3 3 1.5 1.5 4
4 3 1 2 4
5 4 2 1 3
6 2 2 2 4
7 1 3 2 4
8 2 4 1 3
9 3.5 1 2 3.5
10 4 1 3 2
11 4 2 3 1
12 3.5 1 2 3.5
")
if(!require(tidyr)){install.packages("tidyr")}
library(tidyr)
Conover = gather(Conover1, Grass, Rating, Grass1:Grass4, factor_key=TRUE)
### Check the data frame
library(psych)
headTail(Conover)
str(Conover)
summary(Conover)
### Friedman test
friedman.test(Rating ~ Grass | Homeowner,
data = Conover)
Friedman rank sum test
Friedman chi-squared = 8.0973, df = 3, p-value = 0.04404
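Because the Conover1 data frame is already in wide form, with one row per homeowner and one column per grass, the same test can also be run without reshaping. This sketch relies on the matrix interface of friedman.test in base R, which treats rows as blocks and columns as groups. The object name grass_matrix is introduced here for illustration.

```r
Conover1 <- read.table(header = TRUE, text = "
Homeowner Grass1 Grass2 Grass3 Grass4
1  4   3   2   1
2  4   2   3   1
3  3   1.5 1.5 4
4  3   1   2   4
5  4   2   1   3
6  2   2   2   4
7  1   3   2   4
8  2   4   1   3
9  3.5 1   2   3.5
10 4   1   3   2
11 4   2   3   1
12 3.5 1   2   3.5
")

# Keep only the rating columns; rows remain the homeowners (blocks)
grass_matrix <- as.matrix(Conover1[, c("Grass1", "Grass2", "Grass3", "Grass4")])

wide_result <- friedman.test(grass_matrix)
wide_result
```

The chi-squared statistic, degrees of freedom, and p-value should match the formula-based call above.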
GT = xtabs(Rating ~ Grass + Homeowner,
data = Conover)
GT
Homeowner
Grass 1 2 3 4 5 6 7 8 9 10 11 12
Grass1 4.0 4.0 3.0 3.0 4.0 2.0 1.0 2.0 3.5 4.0 4.0 3.5
Grass2 3.0 2.0 1.5 1.0 2.0 2.0 3.0 4.0 1.0 1.0 2.0 1.0
Grass3 2.0 3.0 1.5 2.0 1.0 2.0 2.0 1.0 2.0 3.0 3.0 2.0
Grass4 1.0 1.0 4.0 4.0 3.0 4.0 4.0 3.0 3.5 2.0 1.0 3.5
library(DescTools)
KendallW(GT, correct=TRUE, test=TRUE)
Kendall's coefficient of concordance Wt
Kendall chi-squared = 8.0973, df = 3, subjects = 4, raters = 12, p-value =
0.04404
sample estimates:
Wt
0.2249263
library(PMCMRplus)
frdAllPairsExactTest(y = Conover$Rating,
groups = Conover$Grass,
blocks = Conover$Homeowner,
p.adjust.method = "fdr")
Pairwise comparisons using Eisinga, Heskes, Pelzer & Te Grotenhuis
all-pairs test with exact p-values for a two-way balanced complete block design
Grass1 Grass2 Grass3
Grass2 0.094 - -
Grass3 0.094 0.938 -
Grass4 0.701 0.194 0.201
P value adjustment method: fdr
References
Conover, W.J. 1999. Practical Nonparametric Statistics, 3rd ed. John Wiley & Sons.