
Summary and Analysis of Extension Program Evaluation in R

Salvatore S. Mangiafico

Friedman Test

The Friedman test determines if there are differences among groups for two-way data structured in a specific way, namely in an unreplicated complete block design.  In this design, one variable serves as the treatment or group variable, and another variable serves as the blocking variable.  It is the differences among treatments or groups that we are interested in.  We aren’t necessarily interested in differences among blocks, but we want our statistics to take into account differences in the blocks.  In the unreplicated complete block design, each block has one and only one observation of each treatment. 

 

For an example of this structure, look at the Belcher family data below.  Rater is considered the blocking variable, and each rater has one observation for each Instructor.  The test will determine if there are differences among values for Instructor, taking into account any consistent effect of a Rater.  For example, if Rater a rated consistently low and Rater g rated consistently high, the Friedman test can account for this statistically. 
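A consistently low or consistently high rater does not distort the result because the Friedman test works on ranks computed within each block.  As a minimal sketch, using hypothetical scores for three instructors (not the Belcher data), shifting one rater's scores upward leaves the within-rater ranks unchanged:

```r
### Hypothetical scores from two raters for the same three instructors

LowRater  = c(A = 2, B = 4, C = 3)

HighRater = LowRater + 5      # same pattern, but consistently 5 points higher

### Ranks within each rater (block) are identical

rank(LowRater)

rank(HighRater)
```

Only the ordering of instructors within each rater carries into the test, so a rater's overall leniency or severity drops out.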

 

In other cases, the blocking variable might be the class where the ratings were done or the school where the ratings were done.  If you were testing differences among curricula or other teaching treatments with different instructors, different instructors might be used as blocks.

 

Some people critique the Friedman test for having low power in detecting differences among groups.  It has been suggested, however, that the Friedman test may be powerful when there are five or more groups.

 

In general, you may want to choose a more powerful test.  For an ordinal dependent variable, ordinal regression can be used, with the blocking variable included as a random effect in the model.  For a continuous dependent variable, the Quade test is an option, or aligned ranks transformation anova (ART anova) could be used, again with the blocking variable included as a random effect in the model.

 

Post-hoc tests

The outcome of the Friedman test tells you if there are differences among the groups, but doesn’t tell you which groups are different from other groups.  In order to determine which groups are different from others, post-hoc testing can be conducted.  Several are presented here.
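Because post-hoc testing involves many pairwise comparisons, the p-values are usually adjusted for multiple comparisons.  The following sketch uses only base R.  The unadjusted p-values are hypothetical and serve only to show how an adjustment changes them.

```r
### Number of pairwise comparisons among k groups

k = 5

choose(k, 2)

### Hypothetical unadjusted p-values for three comparisons

Raw = c(0.004, 0.020, 0.300)

### Adjusted p-values are never smaller than the unadjusted ones

p.adjust(Raw, method = "fdr")

p.adjust(Raw, method = "holm")
```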

 

Appropriate data

•  Two-way data arranged in an unreplicated complete block design

•  Dependent variable is ordinal, interval, or ratio

•  Treatment or group independent variable is a factor with two or more levels.  That is, two or more groups

•  Blocking variable is a factor with two or more levels

•  Blocks are independent of each other and have no interaction with treatments
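One way to confirm that data follow an unreplicated complete block design is to count the observations in each treatment-by-block cell.  Every count should be exactly 1.  The data frame below is hypothetical, and the names Check and Counts are introduced only for this sketch.

```r
### Hypothetical long-format data: 3 treatments x 4 blocks, one row per combination

Check = expand.grid(Treatment = c("T1", "T2", "T3"),
                    Block     = c("b1", "b2", "b3", "b4"))

Check$Score = c(3, 5, 4,  2, 4, 4,  5, 5, 3,  1, 3, 2)

### Count observations in each Treatment x Block cell

Counts = table(Check$Treatment, Check$Block)

Counts

### TRUE if the design is complete and unreplicated

all(Counts == 1)
```

For the example data used later in this chapter, the same check would be table(Data$Instructor, Data$Rater).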

 

Hypotheses

•  Null hypothesis:  The distributions of values for each group are equal.

•  Alternative hypothesis (two-sided): There is a systematic difference in the distribution of values for the groups.

 

Interpretation

Significant results can be reported as “There was a significant difference in values among groups.”

 

Other notes and alternative tests

The Quade test is used for the same kinds of data and hypotheses, but can be more powerful in some cases.  It has been suggested that the Friedman test may be preferable when there is a larger number of groups (five or more), while the Quade test is preferable for fewer groups.  The Quade test is described in the next chapter.
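As a brief sketch of how the two tests can be run side by side, both quade.test and friedman.test in base R accept a matrix with blocks in rows and groups in columns.  The matrix below holds the Belcher ratings used in this chapter.  The object name Scores and the shortened column names are introduced here for illustration.

```r
### Belcher ratings: rows are raters (blocks), columns are instructors (groups)

Scores = matrix(c( 4, 5, 4, 6, 6, 6, 10,  6,      # Bob
                   8, 6, 8, 8, 8, 7, 10,  9,      # Linda
                   7, 5, 7, 8, 8, 9, 10,  9,      # Tina
                   6, 4, 5, 5, 6, 6,  5,  5,      # Gene
                   8, 7, 8, 8, 9, 9,  8, 10),     # Louise
                nrow = 8,
                dimnames = list(letters[1:8],
                                c("Bob", "Linda", "Tina", "Gene", "Louise")))

quade.test(Scores)

friedman.test(Scores)
```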

 

Cumulative link models for ordinal data (ordinal regression) are appropriate when the dependent variable is ordinal.  Otherwise, aligned ranks transformation anova may be appropriate.  Either of these approaches allows for more flexibility in design than the Friedman or Quade tests.

 

If the unreplicated block design is partially incomplete, the Skillings–Mack test can be used.

 

Packages used in this chapter

 

The packages used in this chapter include:

•  psych

•  FSA

•  lattice

•  coin

•  PMCMRplus

•  rcompanion

•  DescTools

 

The following commands will install these packages if they are not already installed:


if(!require(psych)){install.packages("psych")}
if(!require(FSA)){install.packages("FSA")}
if(!require(lattice)){install.packages("lattice")}
if(!require(coin)){install.packages("coin")}

if(!require(PMCMRplus)){install.packages("PMCMRplus")}
if(!require(rcompanion)){install.packages("rcompanion")}

if(!require(DescTools)){install.packages("DescTools")}

Friedman test example


Data = read.table(header=TRUE, stringsAsFactors=TRUE, text="

 Instructor        Rater  Likert
 'Bob Belcher'        a      4
 'Bob Belcher'        b      5
 'Bob Belcher'        c      4
 'Bob Belcher'        d      6
 'Bob Belcher'        e      6
 'Bob Belcher'        f      6
 'Bob Belcher'        g     10
 'Bob Belcher'        h      6
 'Linda Belcher'      a      8
 'Linda Belcher'      b      6
 'Linda Belcher'      c      8
 'Linda Belcher'      d      8
 'Linda Belcher'      e      8
 'Linda Belcher'      f      7
 'Linda Belcher'      g     10
 'Linda Belcher'      h      9
 'Tina Belcher'       a      7
 'Tina Belcher'       b      5
 'Tina Belcher'       c      7
 'Tina Belcher'       d      8
 'Tina Belcher'       e      8
 'Tina Belcher'       f      9
 'Tina Belcher'       g     10
 'Tina Belcher'       h      9
 'Gene Belcher'       a      6
 'Gene Belcher'       b      4
 'Gene Belcher'       c      5
 'Gene Belcher'       d      5
 'Gene Belcher'       e      6
 'Gene Belcher'       f      6
 'Gene Belcher'       g      5
 'Gene Belcher'       h      5
 'Louise Belcher'     a      8
 'Louise Belcher'     b      7
 'Louise Belcher'     c      8
 'Louise Belcher'     d      8
 'Louise Belcher'     e      9
 'Louise Belcher'     f      9
 'Louise Belcher'     g      8
 'Louise Belcher'     h     10             
")

### Order levels of the factor; otherwise R will alphabetize them

Data$Instructor = factor(Data$Instructor,
                      levels=unique(Data$Instructor))

### Create a new variable which is the Likert scores as an ordered factor

Data$Likert.f = factor(Data$Likert,
                          ordered=TRUE)


###  Check the data frame


library(psych)

headTail(Data)

str(Data)

summary(Data)


Summarize data treating Likert scores as factors


xtabs( ~ Instructor + Likert.f,
      data = Data)


                Likert.f
Instructor       4 5 6 7 8 9 10
  Bob Belcher    2 1 4 0 0 0  1
  Linda Belcher  0 0 1 1 4 1  1
  Tina Belcher   0 1 0 2 2 2  1
  Gene Belcher   1 4 3 0 0 0  0
  Louise Belcher 0 0 0 1 4 2  1


XT = xtabs( ~ Instructor + Likert.f,
           data = Data)

prop.table(XT,
           margin = 1)


                Likert.f
Instructor           4     5     6     7     8     9    10
  Bob Belcher    0.250 0.125 0.500 0.000 0.000 0.000 0.125
  Linda Belcher  0.000 0.000 0.125 0.125 0.500 0.125 0.125
  Tina Belcher   0.000 0.125 0.000 0.250 0.250 0.250 0.125
  Gene Belcher   0.125 0.500 0.375 0.000 0.000 0.000 0.000
  Louise Belcher 0.000 0.000 0.000 0.125 0.500 0.250 0.125


Bar plots by group

Note that the bar plots don’t show the effect of the blocking variable.


library(lattice)

histogram(~ Likert.f | Instructor,
          data=Data,
          layout=c(1,5),
          col="darkgray")
   
  ####  layout=c(1,5) arranges the plots in 1 column and 5 rows


[Image: histograms of Likert scores, one panel per instructor]


Summarize data treating Likert scores as numeric


library(FSA)

Summarize(Likert ~ Instructor,
          data=Data,
          digits=3)


      Instructor n  mean    sd min   Q1 median   Q3 max percZero
1    Bob Belcher 8 5.875 1.885   4 4.75      6 6.00  10        0
2  Linda Belcher 8 8.000 1.195   6 7.75      8 8.25  10        0
3   Tina Belcher 8 7.875 1.553   5 7.00      8 9.00  10        0
4   Gene Belcher 8 5.250 0.707   4 5.00      5 6.00   6        0
5 Louise Belcher 8 8.375 0.916   7 8.00      8 9.00  10        0


Friedman test example

This example uses the formula notation indicating that Likert is the dependent variable, Instructor is the independent variable, and Rater is the blocking variable.  The data= option indicates the data frame that contains the variables.  For the meaning of other options, see ?friedman.test or documentation for other employed functions.


friedman.test(Likert ~ Instructor | Rater,
              data = Data)


Friedman rank sum test

Friedman chi-squared = 23.139, df = 4, p-value = 0.0001188


library(coin)

friedman_test(Likert ~ Instructor | Rater,
              data = Data)


Asymptotic Friedman Test

chi-squared = 23.139, df = 4, p-value = 0.0001188


library(PMCMRplus)

friedmanTest(y      = Data$Likert,
             groups = Data$Instructor,
             blocks = Data$Rater)


Friedman rank sum test

Friedman chi-squared = 23.139, df = 4, p-value = 0.0001188
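The statistic reported above can be reproduced by hand, which shows how the blocking variable enters the test.  The following is a sketch and not the internal code of any of the packages above.  It ranks the scores within each rater, sums the ranks for each instructor, and applies the usual tie-corrected chi-squared formula.  The object names (Scores, Ranks, RankSums, TieTerm, Q) are introduced here for illustration.

```r
### Belcher ratings: rows are raters (blocks), columns are instructors (groups)

Scores = matrix(c( 4, 5, 4, 6, 6, 6, 10,  6,      # Bob
                   8, 6, 8, 8, 8, 7, 10,  9,      # Linda
                   7, 5, 7, 8, 8, 9, 10,  9,      # Tina
                   6, 4, 5, 5, 6, 6,  5,  5,      # Gene
                   8, 7, 8, 8, 9, 9,  8, 10),     # Louise
                nrow = 8,
                dimnames = list(letters[1:8],
                                c("Bob", "Linda", "Tina", "Gene", "Louise")))

n = nrow(Scores)      # number of blocks
k = ncol(Scores)      # number of groups

### Rank the scores within each rater; ties receive average ranks

Ranks = t(apply(Scores, 1, rank))

RankSums = colSums(Ranks)

RankSums

### Tie correction: sum of (t^3 - t) over each set of tied ranks within a block

TieTerm = sum(apply(Ranks, 1, function(r){Counts = table(r)
                                          sum(Counts^3 - Counts)}))

### Friedman chi-squared and its p-value

Q = 12 * sum((RankSums - n * (k + 1) / 2)^2) /
    (n * k * (k + 1) - TieTerm / (k - 1))

Q

pchisq(Q, df = k - 1, lower.tail = FALSE)
```

The rank sums are 15.5, 31, 28, 11, and 34.5 for Bob, Linda, Tina, Gene, and Louise, and Q works out to 23.139, matching the output above.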


 

Effect size

Kendall’s W, or Kendall’s coefficient of concordance, can be used as an effect size statistic for Friedman’s test.

 

The following interpretations are based on personal intuition. They are not intended to be universal.

 

 

 

 

Kendall’s W      small       medium             large

  k = 3          < 0.10      0.10  – < 0.30     ≥ 0.30

  k = 5          < 0.10      0.10  – < 0.25     ≥ 0.25

  k = 7          < 0.10      0.10  – < 0.20     ≥ 0.20

  k = 9          < 0.10      0.10  – < 0.20     ≥ 0.20

 


XT = xtabs(Likert ~ Instructor + Rater,
           data = Data)

XT


Instructor        a  b  c  d  e  f  g  h
  Bob Belcher     4  5  4  6  6  6 10  6
  Linda Belcher   8  6  8  8  8  7 10  9
  Tina Belcher    7  5  7  8  8  9 10  9
  Gene Belcher    6  4  5  5  6  6  5  5
  Louise Belcher  8  7  8  8  9  9  8 10


For the KendallW function, groups must be in rows, and raters must be in columns.


library(DescTools)

KendallW(XT,
         correct=TRUE,
         test=TRUE)


Kendall's coefficient of concordance Wt

Kendall chi-squared = 23.139, df = 4, subjects = 5, raters = 8,
p-value = 0.0001188

sample estimates:
       Wt
0.7230903


In the output above, check that the correct number of groups and raters is listed under "subjects" and "raters", respectively.
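Kendall’s W is directly related to the Friedman statistic.  With n blocks and k groups, the tie-corrected W equals the Friedman chi-squared divided by n(k – 1).  A quick check using the values printed above:

```r
ChiSq = 23.139     # Friedman chi-squared reported above
n     = 8          # number of raters (blocks)
k     = 5          # number of instructors (groups)

W = ChiSq / (n * (k - 1))

round(W, 3)        # 0.723
```

This also explains why the chi-squared value, degrees of freedom, and p-value in the KendallW output match those of the Friedman test.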

 

library(rcompanion)

kendallW(XT, correct=TRUE)


    W
0.723


kendallW(XT, correct=TRUE, ci=TRUE)


      W lower.ci upper.ci
1 0.723    0.547    0.917

   ###  Confidence intervals by bootstrap may vary


Post-hoc tests

 

Conover test


### Order groups by median


Data$Instructor = factor(Data$Instructor,
                   levels = c("Linda Belcher", "Louise Belcher",
                              "Tina Belcher", "Bob Belcher",
                              "Gene Belcher"))


library(PMCMRplus)

CT = frdAllPairsConoverTest(y      = Data$Likert,
                            groups = Data$Instructor,
                            blocks = Data$Rater,
                            p.adjust.method="single-step")


CT


Pairwise comparisons using Conover's all-pairs test for a two-way balanced complete block design

               Linda Belcher Louise Belcher Tina Belcher Bob Belcher
Louise Belcher 0.9794        -              -            -         
Tina Belcher   0.9884        0.8278         -            -         
Bob Belcher    0.0853        0.0169         0.2490       -         
Gene Belcher   0.0099        0.0012         0.0447       0.9489    

P value adjustment method: single-step


library(rcompanion)

CTT = PMCMRTable(CT)

CTT


                           Comparison p.value
1  Louise Belcher - Linda Belcher = 0   0.979
2    Tina Belcher - Linda Belcher = 0   0.988
3     Bob Belcher - Linda Belcher = 0  0.0853
4    Gene Belcher - Linda Belcher = 0 0.00993
5   Tina Belcher - Louise Belcher = 0   0.828
6    Bob Belcher - Louise Belcher = 0  0.0169
7   Gene Belcher - Louise Belcher = 0 0.00123
8      Bob Belcher - Tina Belcher = 0   0.249
9     Gene Belcher - Tina Belcher = 0  0.0447
10     Gene Belcher - Bob Belcher = 0   0.949


library(rcompanion)

cldList(p.value ~ Comparison, data = CTT)


          Group Letter MonoLetter
1 LouiseBelcher      a        a 
2   TinaBelcher     ab        ab
3    BobBelcher     bc         bc
4   GeneBelcher      c          c
5  LindaBelcher     ab        ab


Exact test


library(PMCMRplus)

ET = frdAllPairsExactTest(y      = Data$Likert,
                          groups = Data$Instructor,
                          blocks = Data$Rater,
                          p.adjust.method="fdr")


ET


Pairwise comparisons using Eisinga, Heskes, Pelzer & Te Grotenhuis all-pairs test with exact p-values for a two-way balanced complete block design

data: y, groups and blocks

               Linda Belcher Louise Belcher Tina Belcher Bob Belcher
Louise Belcher 0.65081       -              -            -         
Tina Belcher   0.69729       0.44188        -            -         
Bob Belcher    0.02456       0.00768        0.07761      -         
Gene Belcher   0.00601       0.00047        0.01833      0.60368   

P value adjustment method: fdr


library(rcompanion)

ETT = PMCMRTable(ET)

ETT


                           Comparison  p.value
1  Louise Belcher - Linda Belcher = 0    0.651
2    Tina Belcher - Linda Belcher = 0    0.697
3     Bob Belcher - Linda Belcher = 0   0.0246
4    Gene Belcher - Linda Belcher = 0  0.00601
5   Tina Belcher - Louise Belcher = 0    0.442
6    Bob Belcher - Louise Belcher = 0  0.00768
7   Gene Belcher - Louise Belcher = 0 0.000467
8      Bob Belcher - Tina Belcher = 0   0.0776
9     Gene Belcher - Tina Belcher = 0   0.0183
10     Gene Belcher - Bob Belcher = 0    0.604


library(rcompanion)

cldList(p.value ~ Comparison, data = ETT)


          Group Letter MonoLetter
1 LouiseBelcher      a        a 
2   TinaBelcher     ab        ab
3    BobBelcher     bc         bc
4   GeneBelcher      c          c
5  LindaBelcher      a        a 


Nemenyi test


library(PMCMRplus)

NT = frdAllPairsNemenyiTest(Likert ~ Instructor | Rater, data = Data)


NT


Pairwise comparisons using Nemenyi-Wilcoxon-Wilcox all-pairs test for a two-way balanced complete block design

               Linda Belcher Louise Belcher Tina Belcher Bob Belcher
Louise Belcher 0.9816        -              -            -         
Tina Belcher   0.9897        0.8426         -            -         
Bob Belcher    0.1021        0.0224         0.2775       -         
Gene Belcher   0.0136        0.0019         0.0557       0.9540    

P value adjustment method: single-step


library(rcompanion)

NTT = PMCMRTable(NT)

NTT


                           Comparison p.value
1  Louise Belcher - Linda Belcher = 0   0.982
2    Tina Belcher - Linda Belcher = 0    0.99
3     Bob Belcher - Linda Belcher = 0   0.102
4    Gene Belcher - Linda Belcher = 0  0.0136
5   Tina Belcher - Louise Belcher = 0   0.843
6    Bob Belcher - Louise Belcher = 0  0.0224
7   Gene Belcher - Louise Belcher = 0 0.00189
8      Bob Belcher - Tina Belcher = 0   0.278
9     Gene Belcher - Tina Belcher = 0  0.0557
10     Gene Belcher - Bob Belcher = 0   0.954


library(rcompanion)

cldList(p.value ~ Comparison, data = NTT)


          Group Letter MonoLetter
1 LouiseBelcher      a        a 
2   TinaBelcher    abc        abc
3    BobBelcher     bc         bc
4   GeneBelcher      b         b
5  LindaBelcher     ac        a c


Siegel test


library(PMCMRplus)

ST = frdAllPairsSiegelTest(y      = Data$Likert,
                           groups = Data$Instructor,
                           blocks = Data$Rater,
                           p.adjust.method="fdr")


ST


Pairwise comparisons using Siegel-Castellan all-pairs test for a two-way balanced complete block design

               Linda Belcher Louise Belcher Tina Belcher Bob Belcher
Louise Belcher 0.6353        -              -            -         
Tina Belcher   0.6353        0.4344         -            -         
Bob Belcher    0.0285        0.0089         0.0802       -         
Gene Belcher   0.0078        0.0020         0.0180       0.5960    

P value adjustment method: fdr


library(rcompanion)

STT = PMCMRTable(ST)

STT


                           Comparison p.value
1  Louise Belcher - Linda Belcher = 0   0.635
2    Tina Belcher - Linda Belcher = 0   0.635
3     Bob Belcher - Linda Belcher = 0  0.0285
4    Gene Belcher - Linda Belcher = 0 0.00783
5   Tina Belcher - Louise Belcher = 0   0.434
6    Bob Belcher - Louise Belcher = 0 0.00888
7   Gene Belcher - Louise Belcher = 0 0.00203
8      Bob Belcher - Tina Belcher = 0  0.0802
9     Gene Belcher - Tina Belcher = 0   0.018
10     Gene Belcher - Bob Belcher = 0   0.596


library(rcompanion)

cldList(p.value ~ Comparison, data = STT)


          Group Letter MonoLetter
1 LouiseBelcher      a        a 
2   TinaBelcher     ab        ab
3    BobBelcher     bc         bc
4   GeneBelcher      c          c
5  LindaBelcher      a        a 


Example from Conover

 

This example is taken from the Friedman test section of Conover (1999).


Conover1 = read.table(header=TRUE, stringsAsFactors=TRUE, text="

Homeowner Grass1 Grass2 Grass3 Grass4
 1        4      3      2      1
 2        4      2      3      1
 3        3      1.5    1.5    4
 4        3      1      2      4
 5        4      2      1      3
 6        2      2      2      4
 7        1      3      2      4
 8        2      4      1      3
 9        3.5    1      2      3.5
10        4      1      3      2
11        4      2      3      1
12        3.5    1      2      3.5
")

if(!require(tidyr)){install.packages("tidyr")}

library(tidyr)

Conover = gather(Conover1, Grass, Rating, Grass1:Grass4, factor_key=TRUE)
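The gather function still works but is marked as superseded in current versions of tidyr.  If you prefer to avoid the dependency, the same wide-to-long reshaping can be sketched in base R with stack.  The example below uses only the first three homeowners from the table above.  The names Wide, Stacked, and Long are introduced here for illustration.

```r
### First three homeowners from the table above, in wide format

Wide = data.frame(Homeowner = 1:3,
                  Grass1    = c(4, 4, 3),
                  Grass2    = c(3, 2, 1.5),
                  Grass3    = c(2, 3, 1.5),
                  Grass4    = c(1, 1, 4))

### stack() places the grass columns end to end;
###   "values" holds the ratings and "ind" the original column names

Stacked = stack(Wide[ , c("Grass1", "Grass2", "Grass3", "Grass4")])

Long = data.frame(Homeowner = rep(Wide$Homeowner, times = 4),
                  Grass     = Stacked$ind,
                  Rating    = Stacked$values)

Long
```

As with factor_key=TRUE in gather, the Grass column is a factor whose levels follow the original column order.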


###  Check the data frame


library(psych)

headTail(Conover)

str(Conover)

summary(Conover)


###  Friedman test

friedman.test(Rating ~ Grass | Homeowner,
              data = Conover)


Friedman rank sum test

Friedman chi-squared = 8.0973, df = 3, p-value = 0.04404



GT = xtabs(Rating ~ Grass + Homeowner,
           data = Conover)

GT


        Homeowner
Grass      1   2   3   4   5   6   7   8   9  10  11  12
  Grass1 4.0 4.0 3.0 3.0 4.0 2.0 1.0 2.0 3.5 4.0 4.0 3.5
  Grass2 3.0 2.0 1.5 1.0 2.0 2.0 3.0 4.0 1.0 1.0 2.0 1.0
  Grass3 2.0 3.0 1.5 2.0 1.0 2.0 2.0 1.0 2.0 3.0 3.0 2.0
  Grass4 1.0 1.0 4.0 4.0 3.0 4.0 4.0 3.0 3.5 2.0 1.0 3.5


library(DescTools)

KendallW(GT, correct=TRUE, test=TRUE)


Kendall's coefficient of concordance Wt

Kendall chi-squared = 8.0973, df = 3, subjects = 4, raters = 12, p-value = 0.04404

sample estimates:
       Wt
0.2249263


library(PMCMRplus)

frdAllPairsExactTest(y      = Conover$Rating,
                     groups = Conover$Grass,
                     blocks = Conover$Homeowner,
                     p.adjust.method = "fdr")


Pairwise comparisons using Eisinga, Heskes, Pelzer & Te Grotenhuis all-pairs test with exact p-values for a two-way balanced complete block design

       Grass1 Grass2 Grass3
Grass2 0.094  -      -    
Grass3 0.094  0.938  -    
Grass4 0.701  0.194  0.201

P value adjustment method: fdr


References

 

Conover, W.J. 1999. Practical Nonparametric Statistics, 3rd ed. John Wiley & Sons.