
Summary and Analysis of Extension Program Evaluation in R

Salvatore S. Mangiafico

One-way Permutation Test for Ordinal Data

When to use this test

 

Here, a permutation test of independence is used for one-way data with an ordinal dependent variable using the independence_test function in the coin package. 

 

This test covers the designs used with the Kruskal–Wallis test.  With different options, it could be adapted to the designs handled by the two-sample t-test, one-way anova, one-way anova with blocks, and their ordinal regression equivalents.

 

The test assumes that the observations are independent.  That is, it is not appropriate for paired observations or repeated measures data.  It does not make assumptions about the distribution of the sampled population.

 

Here, post-hoc testing is conducted with pairwise permutation tests across groups.
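
The idea behind a permutation test can be sketched in base R without the coin package.  The code below is only an illustration under stated assumptions, not the method used by independence_test: it rebuilds the Pooh, Piglet, and Tigger scores analyzed in the example later in this chapter, uses the Kruskal–Wallis statistic from kruskal.test as the test statistic, and shuffles the group labels 2000 times.  The names Observed, Permuted, nPerm, and p.perm are hypothetical and chosen for this sketch.

```r
### Rebuild the example data compactly (same values as the chapter's data)

Data = data.frame(
   Speaker = factor(rep(c("Pooh", "Piglet", "Tigger"), each = 10),
                    levels = c("Pooh", "Piglet", "Tigger")),
   Likert  = c(3,5,4,4,4,4,4,4,5,5,
               2,4,2,2,1,2,3,2,2,3,
               4,4,4,4,5,3,5,4,4,3))

### Statistic computed with the real group labels

Observed = unname(kruskal.test(Likert ~ Speaker, data = Data)$statistic)

### Statistic recomputed after shuffling the group labels many times

set.seed(1234)

nPerm = 2000

Permuted = replicate(nPerm, {
   Shuffled = sample(Data$Speaker)
   unname(kruskal.test(Data$Likert, Shuffled)$statistic)
   })

### p-value: share of shuffles with a statistic at least as large as observed
###   (the +1 counts the observed arrangement itself)

p.perm = (sum(Permuted >= Observed) + 1) / (nPerm + 1)

Observed

p.perm
```

If the group labels carried no information about the scores, shuffling them would often produce a statistic as large as the observed one.  A small p.perm indicates that this rarely happens.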

 

Appropriate data

•  One-way data.  That is, one measurement variable in two or more groups

•  Dependent variable is numeric or ordinal

•  Independent variable is a factor with two or more levels.  That is, two or more groups

•  Observations between groups are independent.  That is, not paired or repeated measures data

 

Hypotheses

•  Null hypothesis:  The values of the dependent variable among groups are equal in the sampled populations.

•  Alternative hypothesis (two-sided): The values of the dependent variable among groups are not equal in the sampled populations.

 

Interpretation

•  Reporting significant results for the omnibus test as “Significant differences were found among values for groups.” is acceptable.  Alternatively, “A significant effect for Independent Variable on Dependent Variable was found.”

•  Reporting significant results for mean separation post-hoc tests as “Value of Dependent Variable for group A was different than that for group B.” is acceptable.

 

Other notes and alternative tests

Ordinal regression is an alternative.

 

The traditional nonparametric tests Mann–Whitney U or Kruskal–Wallis are alternatives in cases where there is not a blocking variable.

 

Packages used in this chapter

 

The packages used in this chapter include:

•  psych

•  lattice

•  FSA

•  coin

•  rcompanion

•  multcompView

 

The following commands will install these packages if they are not already installed:


if(!require(psych)){install.packages("psych")}
if(!require(FSA)){install.packages("FSA")}
if(!require(lattice)){install.packages("lattice")}
if(!require(coin)){install.packages("coin")}
if(!require(rcompanion)){install.packages("rcompanion")}
if(!require(multcompView)){install.packages("multcompView")}


One-way ordinal permutation test example

 

This example re-visits the Pooh, Piglet, and Tigger data.  It answers the question, “Are the scores significantly different among the three speakers?”

 

Here, the ytrafo=rank_trafo argument is passed to independence_test to indicate that the dependent variable should be rank transformed.


Data = read.table(header=TRUE, stringsAsFactors=TRUE, text="

 Speaker  Likert
 Pooh      3
 Pooh      5
 Pooh      4
 Pooh      4
 Pooh      4
 Pooh      4
 Pooh      4
 Pooh      4
 Pooh      5
 Pooh      5
 Piglet    2
 Piglet    4
 Piglet    2
 Piglet    2
 Piglet    1
 Piglet    2
 Piglet    3
 Piglet    2
 Piglet    2
 Piglet    3
 Tigger    4
 Tigger    4
 Tigger    4
 Tigger    4
 Tigger    5
 Tigger    3
 Tigger    5
 Tigger    4
 Tigger    4
 Tigger    3
")


### Order levels of the factor; otherwise R will alphabetize them

Data$Speaker = factor(Data$Speaker,
                      levels=unique(Data$Speaker))

### Create a new variable which is the likert scores as an ordered factor

Data$Likert.f = factor(Data$Likert,
                       ordered = TRUE)


###  Check the data frame

library(psych)

headTail(Data)

str(Data)

summary(Data)


Summarize data treating Likert scores as factors

Note that the variable we want to count is Likert.f, which is a factor variable.  Counts for Likert.f are cross tabulated over values of Speaker.  The prop.table function translates a table into proportions.  The margin=1 option indicates that the proportions are calculated for each row.


xtabs( ~ Speaker + Likert.f,
      data = Data)


        Likert.f
Speaker  1 2 3 4 5
  Pooh   0 0 1 6 3
  Piglet 1 6 2 1 0
  Tigger 0 0 2 6 2


XT = xtabs( ~ Speaker + Likert.f,
           data = Data)

prop.table(XT,
           margin = 1)


        Likert.f
Speaker    1   2   3   4   5
  Pooh   0.0 0.0 0.1 0.6 0.3
  Piglet 0.1 0.6 0.2 0.1 0.0
  Tigger 0.0 0.0 0.2 0.6 0.2
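
As a small sketch of what the margin option does, the counts from the cross tabulation above can be entered by hand as a matrix and passed to prop.table with either margin.  The object names ByRow and ByCol are chosen for this illustration only.

```r
### Counts copied from the cross tabulation above

XT = matrix(c(0, 0, 1, 6, 3,
              1, 6, 2, 1, 0,
              0, 0, 2, 6, 2),
            nrow     = 3,
            byrow    = TRUE,
            dimnames = list(Speaker  = c("Pooh", "Piglet", "Tigger"),
                            Likert.f = as.character(1:5)))

### margin = 1:  each count is divided by its row total

ByRow = prop.table(XT, margin = 1)

rowSums(ByRow)

### margin = 2:  each count is divided by its column total

ByCol = prop.table(XT, margin = 2)

colSums(ByCol)
```

With margin = 1 each row sums to 1, so the rows can be read as the distribution of responses for each speaker.  With margin = 2 each column sums to 1 instead.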


Bar plots by group

Note that the variable we want to count is Likert.f, which is a factor variable.  Counts for Likert.f are presented for values of Speaker.


library(lattice)

histogram(~ Likert.f | Speaker,
          data=Data,
          layout=c(1,3))




Summarize data treating Likert scores as numeric

It may be useful to look at the minimum, first quartile, median, third quartile, and maximum for Likert for each group.


library(FSA)

Summarize(Likert ~ Speaker,
          data=Data,
          digits=3)


  Speaker  n mean    sd min Q1 median   Q3 max percZero
1    Pooh 10  4.2 0.632   3  4      4 4.75   5        0
2  Piglet 10  2.3 0.823   1  2      2 2.75   4        0
3  Tigger 10  4.0 0.667   3  4      4 4.00   5        0
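
The non-integer third quartiles in this output (4.75 and 2.75) come from interpolation between adjacent sorted scores.  As a rough check, the same five-number summaries can be obtained in base R with tapply and quantile, assuming the default quantile method in R.  The data frame is rebuilt here so that the sketch runs on its own, and the name FiveNum is chosen for this illustration.

```r
### Rebuild the example data compactly (same values as the chapter's data)

Data = data.frame(
   Speaker = factor(rep(c("Pooh", "Piglet", "Tigger"), each = 10),
                    levels = c("Pooh", "Piglet", "Tigger")),
   Likert  = c(3,5,4,4,4,4,4,4,5,5,
               2,4,2,2,1,2,3,2,2,3,
               4,4,4,4,5,3,5,4,4,3))

### Minimum, Q1, median, Q3, and maximum for each group

FiveNum = tapply(Data$Likert, Data$Speaker, quantile)

FiveNum

### Pooh's third quartile lies between the 7th and 8th sorted scores (4 and 5)

sort(Data$Likert[Data$Speaker == "Pooh"])
```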


One-way ordinal permutation test

Note that the dependent variable in the call below is Likert, which is rank transformed through the ytrafo option.  Speaker is the independent variable and is a factor variable.  The data= option indicates the data frame that contains the variables.  For the meaning of other options, see library(coin); ?independence_test.  There is the option of adding a blocking variable to the formula with, e.g., | Block.


library(coin)

independence_test(Likert ~ Speaker,
                  data      = Data,
                  ytrafo    = rank_trafo,
                  teststat  = "quadratic")


Asymptotic General Independence Test

chi-squared = 16.842, df = 2, p-value = 0.0002202


Note that there is a built-in function in the coin package to conduct an analysis analogous to the Kruskal–Wallis test.

 

See Hothorn et al. for options in the independence_test function that correspond to common tests.


kruskal_test(Likert ~ Speaker, data=Data)


Asymptotic Kruskal-Wallis Test

chi-squared = 16.842, df = 2, p-value = 0.0002202
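
The outputs above are labeled Asymptotic, meaning the p-value comes from the asymptotic reference distribution.  The coin package can instead approximate the permutation distribution by Monte Carlo resampling through the distribution option.  The sketch below assumes a version of coin in which approximate takes the nresample argument (older versions named this argument B), and the choice of 10000 resamples is arbitrary.  The data frame is rebuilt so that the code runs on its own, and the name KT is chosen for this illustration.

```r
library(coin)

### Rebuild the example data compactly (same values as the chapter's data)

Data = data.frame(
   Speaker = factor(rep(c("Pooh", "Piglet", "Tigger"), each = 10),
                    levels = c("Pooh", "Piglet", "Tigger")),
   Likert  = c(3,5,4,4,4,4,4,4,5,5,
               2,4,2,2,1,2,3,2,2,3,
               4,4,4,4,5,3,5,4,4,3))

### Monte Carlo approximation of the permutation distribution

set.seed(1234)

KT = kruskal_test(Likert ~ Speaker,
                  data         = Data,
                  distribution = approximate(nresample = 10000))

KT

### Extract the test statistic and the p-value

statistic(KT)

pvalue(KT)
```

Because the p-value is estimated by resampling, it changes slightly from run to run unless the random seed is fixed.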


Post-hoc test: pairwise permutation tests

If the independence test is significant, a post-hoc analysis can be performed to determine which groups differ from which other groups.

 

The pairwisePermutationTest and pairwisePermutationMatrix functions in the rcompanion package conduct permutation tests across groups in a pairwise manner.   See library(rcompanion); ?pairwisePermutationTest for further details.

 

Because the post-hoc test will produce multiple p-values, adjustments to the p-values can be made to avoid inflating the possibility of making a type-I error.  Here, the method of adjustment is indicated with the method option.  There are a variety of methods for controlling the familywise error rate or for controlling the false discovery rate.  See ?p.adjust for details on these methods.

 

Before conducting the pairwise tests, we will re-order the levels of the grouping variable by the median of each group.  This makes interpretation of the pairwise comparisons and compact letter display easier.  In the output, groups sharing the same letter are not significantly different.
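
The level order does not have to be typed by hand.  As one possible sketch, the medians can be computed with tapply and used to order the levels from highest to lowest.  The names Medians and Ordered are chosen for this illustration, and the data frame is rebuilt so that the code runs on its own.  Pooh and Tigger have the same median, and order leaves tied groups in their original order.

```r
### Rebuild the example data compactly (same values as the chapter's data)

Data = data.frame(
   Speaker = factor(rep(c("Pooh", "Piglet", "Tigger"), each = 10),
                    levels = c("Pooh", "Piglet", "Tigger")),
   Likert  = c(3,5,4,4,4,4,4,4,5,5,
               2,4,2,2,1,2,3,2,2,3,
               4,4,4,4,5,3,5,4,4,3))

### Median score for each group

Medians = tapply(Data$Likert, Data$Speaker, median)

Medians

### Group names from highest to lowest median

Ordered = names(Medians)[order(-Medians)]

### Re-level the factor in that order

Data$Speaker = factor(Data$Speaker,
                      levels = Ordered)

levels(Data$Speaker)
```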

Table output and compact letter display


### Order groups by median

Data$Speaker = factor(Data$Speaker,
                      levels=c("Pooh", "Tigger", "Piglet"))

### Pairwise permutation tests

library(rcompanion)

PT = pairwisePermutationTest(Likert ~ Speaker,
                             data     = Data,
                             ytrafo   = rank_trafo,
                             teststat = "quadratic",
                             method   = "fdr")

PT


           Comparison   Stat   p.value p.adjust
1   Pooh - Tigger = 0 0.4769    0.4898 0.489800
2   Pooh - Piglet = 0   12.5 0.0004065 0.001076
3 Tigger - Piglet = 0  11.44 0.0007175 0.001076
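
The p.adjust column above can be checked by passing the unadjusted p-values to the base R p.adjust function.  In this sketch the values are copied by hand from the p.value column, the bonferroni method is included only for comparison, and the names p.raw, p.fdr, and p.bonf are chosen for this illustration.

```r
### Unadjusted p-values copied from the p.value column above

p.raw = c("Pooh - Tigger"   = 0.4898,
          "Pooh - Piglet"   = 0.0004065,
          "Tigger - Piglet" = 0.0007175)

### False discovery rate adjustment, as used in the table

p.fdr = p.adjust(p.raw, method = "fdr")

signif(p.fdr, 4)

### Bonferroni adjustment, a more conservative familywise method

p.bonf = p.adjust(p.raw, method = "bonferroni")

signif(p.bonf, 4)
```

With the fdr method the two smallest p-values receive the same adjusted value, which is why 0.001076 appears twice in the table.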



### Compact letter display

library(rcompanion)

cldList(p.adjust ~ Comparison,
        data = PT,
        threshold  = 0.05)


   Group Letter MonoLetter
1   Pooh      a         a
2 Tigger      a         a
3 Piglet      b          b

   Groups sharing a letter are not significantly different (alpha = 0.05).


Matrix output and compact letter display

This code creates a matrix of p-values called PM, which is then passed to the multcompLetters function to be converted to a compact letter display.

 

Here the fdr p-value adjustment method is used.  See ?p.adjust for details on available methods.


### Order groups by median

Data$Speaker = factor(Data$Speaker,
                      levels=c("Pooh", "Tigger", "Piglet"))


### Conduct pairwise permutation tests

library(rcompanion)

PM = pairwisePermutationMatrix(Likert.f ~ Speaker,
                               data     = Data,
                               ytrafo   = rank_trafo,
                               teststat = "quadratic",
                               method   = "fdr")
PMA = PM$Adjusted

PMA


           Pooh   Tigger   Piglet
Pooh   1.000000 0.489800 0.001076
Tigger 0.489800 1.000000 0.001076
Piglet 0.001076 0.001076 1.000000



### Produce compact letter display

library(multcompView)

multcompLetters(PMA,
                compare="<",
                threshold=0.05,
                Letters=letters,
                reversed = FALSE)


  Pooh Tigger Piglet
   "a"    "a"    "b"

Groups sharing a letter are not significantly different (alpha = 0.05).