[banner]

Summary and Analysis of Extension Program Evaluation in R

Salvatore S. Mangiafico

One-way Repeated Ordinal ANOVA with CLMM

A one-way repeated ordinal analysis of variance with a cumulative link model is analogous to the Friedman and Quade tests.  However, data for this model does not need to follow the condition of those tests that there be one and only one observation for each treatment or group in each block.  Blocks can have multiple observations.

 

Appropriate data

•  Two-way data with one treatment variable and one blocking variable

•  Dependent variable is ordered factor

•  Treatment variable is a factor with at least two levels or groups

•  Blocking variable is a factor with at least two levels

•  Observations can be paired or repeated measures within blocks

 

Interpretation

A significant result can be interpreted as, “There was a significant difference among groups.”  Or, “There was a significant effect of Independent Variable.”

 

A significant post-hoc analysis indicates, “There was a significant difference between Group A and Group B”, and so on.

 

Packages used in this chapter

 

The packages used in this chapter include:

•  psych

•  FSA

•  lattice

•  ordinal

•  car

•  RVAideMemoire

•  multcompView

•  lsmeans

•  rcompanion

 

The following commands will install these packages if they are not already installed:

 

if(!require(psych)){install.packages("psych")}
if(!require(FSA)){install.packages("FSA")}

if(!require(lattice)){install.packages("lattice")}
if(!require(ordinal)){install.packages("ordinal")}
if(!require(car)){install.packages("car")}
if(!require(RVAideMemoire)){install.packages("RVAideMemoire")}
if(!require(multcompView)){install.packages("multcompView")}
if(!require(lsmeans)){install.packages("lsmeans")}
if(!require(rcompanion)){install.packages("rcompanion")}


One-way repeated ordinal ANOVA example

 

In this example, we want to determine if there is a difference in scores among five instructors from the Belcher family.  Because we know which rater gave each score, we will use the Rater as a blocking variable, with the suspicion that one rater might rate consistently higher than another one.  If we hadn’t recorded this information about raters, we could use a one-way ordinal anova.


Input =("
 Instructor        Rater  Likert
 'Bob Belcher'        a      4
 'Bob Belcher'        b      5
 'Bob Belcher'        c      4
 'Bob Belcher'        d      6
 'Bob Belcher'        e      6
 'Bob Belcher'        f      6
 'Bob Belcher'        g     10
 'Bob Belcher'        h      6
 'Linda Belcher'      a      8
 'Linda Belcher'      b      6
 'Linda Belcher'      c      8
 'Linda Belcher'      d      8
 'Linda Belcher'      e      8
 'Linda Belcher'      f      7
 'Linda Belcher'      g     10
 'Linda Belcher'      h      9
 'Tina Belcher'       a      7
 'Tina Belcher'       b      5
 'Tina Belcher'       c      7
 'Tina Belcher'       d      8
 'Tina Belcher'       e      8
 'Tina Belcher'       f      9
 'Tina Belcher'       g     10
 'Tina Belcher'       h      9
 'Gene Belcher'       a      6
 'Gene Belcher'       b      4
 'Gene Belcher'       c      5
 'Gene Belcher'       d      5
 'Gene Belcher'       e      6
 'Gene Belcher'       f      6
 'Gene Belcher'       g      5
 'Gene Belcher'       h      5
 'Louise Belcher'     a      8
 'Louise Belcher'     b      7
 'Louise Belcher'     c      8
 'Louise Belcher'     d      8
 'Louise Belcher'     e      9
 'Louise Belcher'     f      9
 'Louise Belcher'     g      8
 'Louise Belcher'     h     10            
")

Data = read.table(textConnection(Input),header=TRUE)


### Order levels of the factor; otherwise R will alphabetize them


Data$Instructor = factor(Data$Instructor,
                         levels=unique(Data$Instructor))


### Create a new variable which is the likert scores as an ordered factor


Data$Likert.f = factor(Data$Likert,
                       ordered=TRUE)


###  Check the data frame


library(psych)

headTail(Data)

str(Data)

summary(Data)


### Remove unnecessary objects

rm(Input)


Summarize data treating Likert scores as factors


xtabs( ~ Instructor + Likert.f,
      data = Data)


                Likert.f
Instructor       4 5 6 7 8 9 10
  Bob Belcher    2 1 4 0 0 0  1
  Linda Belcher  0 0 1 1 4 1  1
  Tina Belcher   0 1 0 2 2 2  1
  Gene Belcher   1 4 3 0 0 0  0
  Louise Belcher 0 0 0 1 4 2  1


XT = xtabs( ~ Instructor + Likert.f,
           data = Data)

prop.table(XT,
           margin = 1)


                Likert.f
Instructor           4     5     6     7     8     9    10
  Bob Belcher    0.250 0.125 0.500 0.000 0.000 0.000 0.125
  Linda Belcher  0.000 0.000 0.125 0.125 0.500 0.125 0.125
  Tina Belcher   0.000 0.125 0.000 0.250 0.250 0.250 0.125
  Gene Belcher   0.125 0.500 0.375 0.000 0.000 0.000 0.000
  Louise Belcher 0.000 0.000 0.000 0.125 0.500 0.250 0.125


Histograms by group

Note that histograms don’t show the effect of the blocking variable.


library(lattice)

histogram(~ Likert.f | Instructor,
          data=Data,
          layout=c(1,5)      #  columns and rows of individual plots
          )

image


Summarize data treating Likert scores as numeric


library(FSA)

Summarize(Likert ~ Instructor,
          data=Data,
          digits=3)


      Instructor n  mean    sd min   Q1 median   Q3 max percZero
1    Bob Belcher 8 5.875 1.885   4 4.75      6 6.00  10        0
2  Linda Belcher 8 8.000 1.195   6 7.75      8 8.25  10        0
3   Tina Belcher 8 7.875 1.553   5 7.00      8 9.00  10        0
4   Gene Belcher 8 5.250 0.707   4 5.00      5 6.00   6        0
5 Louise Belcher 8 8.375 0.916   7 8.00      8 9.00  10        0


One-way repeated ordinal ANOVA

The model is specified using formula notation with the clmm function (not clm).  Here, Likert.f is the dependent variable and Instructor is the independent variable.  Rater is specified as the blocking variable.  Technically, it is identified as the random variable in the mixed model.  The data= option indicates the data frame that contains the variables.  For the meaning of other options, see ?clmm.

 

The p-value for the effect of Instructor is determined using the Anova function in the packages RVAideMemoire and car.

 

Define model and conduct analysis of deviance

Here the threshold = "equidistant" option is used in order to avoid errors in the optional post-hoc tests.  This option does not need to be used routinely.


library(ordinal)

model = clmm(Likert.f ~ Instructor + (1|Rater),
             data = Data,
             threshold = "equidistant")


library(car)

library(RVAideMemoire)

Anova(model,
      type = "II")


Analysis of Deviance Table (Type II tests)

           LR Chisq Df Pr(>Chisq)   
Instructor    38.51  4  8.794e-08 ***


Determine the effect of the random variable

The effect of the random variable (Rater, in this case) can be determined by comparing the full model to a model with only the fixed effects (Instructor, in this case).


First, model.fixed is defined, and then the two models are passed to the anova function.


Be sure that the threshold option in model.fixed is the same as in the original model.


model.fixed = clm(Likert.f ~ Instructor,
                  data = Data,
                  threshold = "equidistant")

anova(model,
       null = model.fixed)


Likelihood ratio tests of cumulative link models:

            no.par    AIC  logLik LR.stat df Pr(>Chisq)  
model.fixed      6 135.19 -61.596                        
model            7 129.46 -57.732  7.7275  1   0.005439 **


Check model assumptions

At the time of writing, the nominal_test and scale_test functions don’t work with clmm model objects, so we will define a similar model with clm and no random effects, and test the proportional odds assumption on this model.


model.clm = clm(Likert.f ~ Instructor + Rater,
                data = Data,
                threshold = "equidistant")

nominal_test(model.clm)


Tests of nominal effects

           Df  logLik    AIC     LRT Pr(>Chi) 
<none>        -47.032 120.06                  
Instructor  4 -42.680 119.36  8.7053   0.0689 .
Rater       7 -39.544 119.09 14.9766   0.0363 *

   ###  No violation in assumptions for Instructor.
   ###  We can ignore Rater since it was a random effect in the original model.



scale_test(model.clm)


Warning messages:
1: (-2) Model failed to converge: degenerate Hessian with 1 negative eigenvalues
In addition: maximum number of consecutive Newton modifications reached

   ###  This test failed.  It would be possible to adjust the fitting parameters,
   ###    and try to get the model to converge.



Post-hoc comparisons with lsmeans

Because the main effect of Instructor in the model was significant, we want to know which instructor has statistically different scores than which other instructors.  Comparing LS means for each instructor will allow us to do this.

Be sure to read the Least Square Means for Multiple Comparisons chapter for correct interpretation of least square means.  For clmm model objects, the lsmean, SE, LCL, and UCL values should be ignored, unless specific options in lsmeans are selected.

Here we’ll create an object of the lsmeans output called leastsquare.  Then we’ll pass this object to the cld function to create a compact letter display.

The ordinal model under consideration is called model, created the clmm function above.  The formula in the lsmeans function indicates that pairwise comparisons should be conducted for the variable Instructor.


library(multcompView)

library(lsmeans)

leastsquare = lsmeans(model,
                      pairwise ~ Instructor,
                      adjust="tukey")         ###  Tukey-adjusted comparisons

leastsquare


cld(leastsquare,
    alpha=0.05,
    Letters=letters,      ### Use lower-case letters for .group
    adjust="tukey")       ### Tukey-adjusted comparisons 


Instructor        lsmean        SE df  asymp.LCL  asymp.UCL .group
 Gene Belcher   -3.957035 0.9610526 NA -6.4257521 -1.4883188  a   
 Bob Belcher    -2.893186 0.8879872 NA -5.1742153 -0.6121576  a   
 Tina Belcher    2.018826 0.7027604 NA  0.2136007  3.8240505   b  
 Linda Belcher   2.378266 0.7417485 NA  0.4728901  4.2836425   b  
 Louise Belcher  3.329647 0.8470188 NA  1.1538560  5.5054375   b  

Results are averaged over the levels of: Rater
Confidence level used: 0.95
Conf-level adjustment: sidak method for 5 estimates
P value adjustment: tukey method for comparing a family of 5 estimates
significance level used: alpha = 0.05

   ### Groups sharing a letter in .group are not significantly different

   ### Remember to ignore “lsmean”, “SE”, “LCL”, and “UCL” for CLMM


Optional analyses

 

Post-hoc test: pairwise paired ordinal tests for multiple comparisons

The post-hoc analysis can be accomplished with my pairwiseOrdinalPairedTest or pairwiseOrdinalPairedMatrix functions.  These functions extract the p-values from paired cumulative link models for each pair of treatments.

 

Usually you would want to first order your treatment variable (Speaker) by median or other location statistic for the letters in the compact letter display to be in their proper order.

Note that for this function, the data must be ordered by the blocking variable so that the first observation where Instructor = Bob is paired to the first observation where Instructor = Linda, and so on.

Here the p-values for the pairwise comparisons are adjusted with the fdr method.  See ?p.adjust for options and details on p-value adjustment methods.

Here the threshold = "equidistant" option is used in order to avoid errors.  This option does not need to be used routinely.


### Order groups by median


Data = Data[order(factor(Data$Instructor,
                         levels=c("Linda Belcher", "Louise Belcher",
                                  "Tina Belcher", "Bob Belcher",
                                  "Gene Belcher"))),]
Data

### Pairwise ordinal regression tests

library(rcompanion)

PT = pairwiseOrdinalPairedTest(x = Data$Likert.f,
                               g = Data$Instructor,
                               b = Data$Rater,
                               threshold = "equidistant",
                               method="fdr")
                                    # Adjusts p-values for multiple comparisons;
                                    # See ?p.adjust for options

PT


                           Comparison   p.value  p.adjust
1  Linda Belcher - Louise Belcher = 0    0.3565 4.456e-01
2    Linda Belcher - Tina Belcher = 0    0.5593 5.593e-01
3     Linda Belcher - Bob Belcher = 0   0.00231 3.850e-03
4    Linda Belcher - Gene Belcher = 0 1.485e-05 7.425e-05
5   Louise Belcher - Tina Belcher = 0    0.2347 3.353e-01
6    Louise Belcher - Bob Belcher = 0  0.001071 2.142e-03
7   Louise Belcher - Gene Belcher = 0 1.247e-07 1.247e-06
8      Tina Belcher - Bob Belcher = 0  0.001013 2.142e-03
9     Tina Belcher - Gene Belcher = 0 0.0001283 4.277e-04
10     Bob Belcher - Gene Belcher = 0    0.5152 5.593e-01


### Compact letter display

library(rcompanion)

cldList(comparison = PT$Comparison,
        p.value    = PT$p.adjust,
        threshold  = 0.05)


          Group Letter MonoLetter
1  LindaBelcher      a         a
2 LouiseBelcher      a         a
3   TinaBelcher      a         a
4    BobBelcher      b          b
5   GeneBelcher      b          b

Groups sharing a letter are not significantly different (alpha = 0.05).

Matrix output and compact letter display


### Order groups by median


Data = Data[order(factor(Data$Instructor,
                         levels=c("Linda Belcher", "Louise Belcher",
                                  "Tina Belcher", "Bob Belcher",
                                  "Gene Belcher"))),]
Data

### Pairwise ordinal regression tests

library(rcompanion)

PM = pairwiseOrdinalPairedMatrix(x = Data$Likert.f,
                                 g = Data$Instructor,
                                 b = Data$Rater,
                                 threshold="equidistant",
                                 method="fdr")
                                    # Adjusts p-values for multiple comparisons;
                                    # See ?p.adjust for options

PM


$Adjusted
               Linda Belcher Louise Belcher Tina Belcher Bob Belcher Gene Belcher
Linda Belcher      1.000e+00      4.456e-01    0.5593000    0.003850    7.425e-05
Louise Belcher     4.456e-01      1.000e+00    0.3353000    0.002142    1.247e-06
Tina Belcher       5.593e-01      3.353e-01    1.0000000    0.002142    4.277e-04
Bob Belcher        3.850e-03      2.142e-03    0.0021420    1.000000    5.593e-01
Gene Belcher       7.425e-05      1.247e-06    0.0004277    0.559300    1.000e+00


library(multcompView)

multcompLetters(PM$Adjusted,
                compare="<",
                threshold=0.05,  # p-value to use as significance threshold
                Letters=letters,
                reversed = FALSE)


Linda Belcher Louise Belcher   Tina Belcher    Bob Belcher   Gene Belcher
          "a"            "a"            "a"            "b"            "b"

   ### Values sharing a letter are not significantly different