Summary and Analysis of Extension Program Evaluation in R

Salvatore S. Mangiafico

One-way Repeated Ordinal Regression with CLMM

Appropriate data

•  Two-way data with one treatment variable and one blocking variable

•  Dependent variable is ordered factor

•  Treatment variable is a factor with at least two levels or groups

•  Blocking variable is a factor with at least two levels

•  Observations can be paired or repeated measures within blocks

Interpretation

A significant result can be interpreted as, “There was a significant difference among groups.”  Or, “There was a significant effect of Independent Variable.”

A significant post-hoc analysis indicates e.g. “There was a significant difference between Group A and Group B”, and so on.

Packages used in this chapter

The packages used in this chapter include:

•  psych

•  FSA

•  lattice

•  ordinal

•  car

•  RVAideMemoire

•  multcompView

•  lsmeans

•  rcompanion

The following commands will install these packages if they are not already installed:

if(!require(psych)){install.packages("psych")}
if(!require(FSA)){install.packages("FSA")}

if(!require(lattice)){install.packages("lattice")}
if(!require(ordinal)){install.packages("ordinal")}
if(!require(car)){install.packages("car")}
if(!require(RVAideMemoire)){install.packages("RVAideMemoire")}
if(!require(multcompView)){install.packages("multcompView")}
if(!require(lsmeans)){install.packages("lsmeans")}
if(!require(rcompanion)){install.packages("rcompanion")}

One-way repeated ordinal regression example

In this example, we want to determine if there is a difference in scores among five instructors from the Belcher family.  Because we know which rater gave each score, we will use the Rater as a blocking variable, with the suspicion that one rater might rate consistently higher than another one.  If we hadn’t recorded this information about raters, we could use a one-way ordinal regression.

Input =("
Instructor        Rater  Likert
'Bob Belcher'        a      4
'Bob Belcher'        b      5
'Bob Belcher'        c      4
'Bob Belcher'        d      6
'Bob Belcher'        e      6
'Bob Belcher'        f      6
'Bob Belcher'        g     10
'Bob Belcher'        h      6
'Linda Belcher'      a      8
'Linda Belcher'      b      6
'Linda Belcher'      c      8
'Linda Belcher'      d      8
'Linda Belcher'      e      8
'Linda Belcher'      f      7
'Linda Belcher'      g     10
'Linda Belcher'      h      9
'Tina Belcher'       a      7
'Tina Belcher'       b      5
'Tina Belcher'       c      7
'Tina Belcher'       d      8
'Tina Belcher'       e      8
'Tina Belcher'       f      9
'Tina Belcher'       g     10
'Tina Belcher'       h      9
'Gene Belcher'       a      6
'Gene Belcher'       b      4
'Gene Belcher'       c      5
'Gene Belcher'       d      5
'Gene Belcher'       e      6
'Gene Belcher'       f      6
'Gene Belcher'       g      5
'Gene Belcher'       h      5
'Louise Belcher'     a      8
'Louise Belcher'     b      7
'Louise Belcher'     c      8
'Louise Belcher'     d      8
'Louise Belcher'     e      9
'Louise Belcher'     f      9
'Louise Belcher'     g      8
'Louise Belcher'     h     10
")

### Order levels of the factor; otherwise R will alphabetize them

Data\$Instructor = factor(Data\$Instructor,
levels=unique(Data\$Instructor))

### Create a new variable which is the likert scores as an ordered factor

Data\$Likert.f = factor(Data\$Likert,
ordered=TRUE)

###  Check the data frame

library(psych)

str(Data)

summary(Data)

### Remove unnecessary objects

rm(Input)

Summarize data treating Likert scores as factors

XT = xtabs( ~ Instructor + Likert.f,
data = Data)

XT

Likert.f
Instructor       4 5 6 7 8 9 10
Bob Belcher    2 1 4 0 0 0  1
Linda Belcher  0 0 1 1 4 1  1
Tina Belcher   0 1 0 2 2 2  1
Gene Belcher   1 4 3 0 0 0  0
Louise Belcher 0 0 0 1 4 2  1

prop.table(XT,
margin = 1)

Likert.f
Instructor           4     5     6     7     8     9    10
Bob Belcher    0.250 0.125 0.500 0.000 0.000 0.000 0.125
Linda Belcher  0.000 0.000 0.125 0.125 0.500 0.125 0.125
Tina Belcher   0.000 0.125 0.000 0.250 0.250 0.250 0.125
Gene Belcher   0.125 0.500 0.375 0.000 0.000 0.000 0.000
Louise Belcher 0.000 0.000 0.000 0.125 0.500 0.250 0.125

### Proportion in each row

Bar plots by group

Note that the bar plots don’t show the effect of the blocking variable.

library(lattice)

histogram(~ Likert.f | Instructor,
data=Data,
layout=c(1,5)      #  columns and rows of individual plots
)

Summarize data treating Likert scores as numeric

library(FSA)

Summarize(Likert ~ Instructor,
data=Data,
digits=3)

Instructor n  mean    sd min   Q1 median   Q3 max percZero
1    Bob Belcher 8 5.875 1.885   4 4.75      6 6.00  10        0
2  Linda Belcher 8 8.000 1.195   6 7.75      8 8.25  10        0
3   Tina Belcher 8 7.875 1.553   5 7.00      8 9.00  10        0
4   Gene Belcher 8 5.250 0.707   4 5.00      5 6.00   6        0
5 Louise Belcher 8 8.375 0.916   7 8.00      8 9.00  10        0

One-way repeated ordinal regression

The model is specified using formula notation with the clmm function (not clm).  Here, Likert.f is the dependent variable, and an ordered factor in R, and Instructor is the independent variable.  Rater is specified as the blocking variable.  Technically, it is identified as the random variable in the mixed model.  The data= option indicates the data frame that contains the variables.  For the meaning of other options, see ?clmm.

The p-value for the effect of Instructor is determined using the Anova function in the packages RVAideMemoire and car.

Define model and conduct analysis of deviance

Here the threshold = "equidistant" option is used in order to avoid errors in the optional post-hoc tests.  This option does not need to be used routinely.

library(ordinal)

model = clmm(Likert.f ~ Instructor + (1|Rater),
data = Data,
threshold = "equidistant")

library(car)

library(RVAideMemoire)

Anova(model,
type = "II")

Analysis of Deviance Table (Type II tests)

LR Chisq Df Pr(>Chisq)
Instructor    38.51  4  8.794e-08 ***

Determine the effect of the random variable

The effect of the random variable (Rater, in this case) can be determined by comparing the full model to a model with only the fixed effects (Instructor, in this case).

First, model.fixed is defined, and then the two models are passed to the anova function.

Be sure that the threshold option in model.fixed is the same as in the original model.

model.fixed = clm(Likert.f ~ Instructor,
data = Data,
threshold = "equidistant")

anova(model,
null = model.fixed)

Likelihood ratio tests of cumulative link models:

no.par    AIC  logLik LR.stat df Pr(>Chisq)
model.fixed      6 135.19 -61.596
model            7 129.46 -57.732  7.7275  1   0.005439 **

Check model assumptions

At the time of writing, the nominal_test and scale_test functions don’t work with clmm model objects, so we will define a similar model with clm and no random effects, and test the proportional odds assumption on this model.

model.clm = clm(Likert.f ~ Instructor + Rater,
data = Data,
threshold = "equidistant")

nominal_test(model.clm)

Tests of nominal effects

Df  logLik    AIC     LRT Pr(>Chi)
<none>        -47.032 120.06
Instructor  4 -42.680 119.36  8.7053   0.0689 .
Rater       7 -39.544 119.09 14.9766   0.0363 *

###  No violation in assumptions for Instructor.
###  We can ignore Rater since it was a random effect in the original model.

scale_test(model.clm)

Warning messages:
1: (-2) Model failed to converge: degenerate Hessian with 1 negative eigenvalues
In addition: maximum number of consecutive Newton modifications reached

###  This test failed.  It would be possible to adjust the fitting parameters,
###    and try to get the model to converge.

Post-hoc comparisons with lsmeans

Because the main effect of Instructor in the model was significant, we want to know which instructor has statistically different scores than which other instructors.  Comparing LS means for each instructor will allow us to do this.

Be sure to read the Least Square Means for Multiple Comparisons chapter for correct interpretation of least square means.  For clmm model objects, the lsmean, SE, LCL, and UCL values should be ignored, unless specific options in lsmeans are selected.

Here we’ll create an object of the lsmeans output called marginal.  Then we’ll pass this object to the cld function to create a compact letter display.

The ordinal model under consideration is called model, created with the clmm function above.  The formula in the lsmeans function indicates that pairwise comparisons should be conducted for the variable Instructor.

library(multcompView)

library(lsmeans)

marginal = lsmeans(model,
pairwise ~ Instructor,

marginal

cld(marginal,
alpha=0.05,
Letters=letters,      ### Use lower-case letters for .group

Instructor        lsmean        SE df  asymp.LCL  asymp.UCL .group
Gene Belcher   -3.957035 0.9610526 NA -6.4257521 -1.4883188  a
Bob Belcher    -2.893186 0.8879872 NA -5.1742153 -0.6121576  a
Tina Belcher    2.018826 0.7027604 NA  0.2136007  3.8240505   b
Linda Belcher   2.378266 0.7417485 NA  0.4728901  4.2836425   b
Louise Belcher  3.329647 0.8470188 NA  1.1538560  5.5054375   b

Results are averaged over the levels of: Rater
Confidence level used: 0.95
Conf-level adjustment: sidak method for 5 estimates
P value adjustment: tukey method for comparing a family of 5 estimates
significance level used: alpha = 0.05

### Groups sharing a letter in .group are not significantly different

### Remember to ignore “lsmean”, “SE”, “LCL”, and “UCL” for CLMM

Optional analyses

Post-hoc test: pairwise paired ordinal tests for multiple comparisons

A post-hoc analysis can be accomplished with the pairwiseOrdinalPairedTest or pairwiseOrdinalPairedMatrix functions.  These functions extract the p-values from paired cumulative link models for each pair of treatments.

Usually you would want to first order your treatment variable (Speaker) by median or other location statistic for the letters in the compact letter display to be in their proper order.

Note that for this function, the data must be ordered by the blocking variable so that the first observation where Instructor = Bob is paired to the first observation where Instructor = Linda, and so on.

Here the p-values for the pairwise comparisons are adjusted with the fdr method.  See ?p.adjust for options and details on p-value adjustment methods.

Here the threshold = "equidistant" option is used in order to avoid errors.  This option does not need to be used routinely.

### Order groups by median

Data\$Instructor = factor(Data\$Instructor,
levels = c("Linda Belcher", "Louise Belcher",
"Tina Belcher", "Bob Belcher",
"Gene Belcher"))

### Pairwise ordinal regression tests

library(rcompanion)

PT = pairwiseOrdinalPairedTest(Likert.f ~ Instructor | Rater,
data      = Data,
threshold = "equidistant",
method    = "fdr")
# Adjusts p-values for multiple comparisons;

PT

1  Linda Belcher - Louise Belcher = 0    0.3565 4.456e-01
2    Linda Belcher - Tina Belcher = 0    0.5593 5.593e-01
3     Linda Belcher - Bob Belcher = 0   0.00231 3.850e-03
4    Linda Belcher - Gene Belcher = 0 1.485e-05 7.425e-05
5   Louise Belcher - Tina Belcher = 0    0.2347 3.353e-01
6    Louise Belcher - Bob Belcher = 0  0.001071 2.142e-03
7   Louise Belcher - Gene Belcher = 0 1.247e-07 1.247e-06
8      Tina Belcher - Bob Belcher = 0  0.001013 2.142e-03
9     Tina Belcher - Gene Belcher = 0 0.0001283 4.277e-04
10     Bob Belcher - Gene Belcher = 0    0.5152 5.593e-01

### Compact letter display

library(rcompanion)

data = PT,
threshold  = 0.05)

Group Letter MonoLetter
1  LindaBelcher      a         a
2 LouiseBelcher      a         a
3   TinaBelcher      a         a
4    BobBelcher      b          b
5   GeneBelcher      b          b

Groups sharing a letter are not significantly different (alpha = 0.05).

Matrix output and compact letter display

### Order groups by median

Data\$Instructor = factor(Data\$Instructor,
levels = c("Linda Belcher", "Louise Belcher",
"Tina Belcher", "Bob Belcher",
"Gene Belcher"))

### Pairwise ordinal regression tests

library(rcompanion)

PM = pairwiseOrdinalPairedMatrix(Likert.f ~ Instructor | Rater,
data      = Data,
threshold = "equidistant",
method    = "fdr")
# Adjusts p-values for multiple comparisons;

PM

Linda Belcher Louise Belcher Tina Belcher Bob Belcher Gene Belcher
Linda Belcher      1.000e+00      4.456e-01    0.5593000    0.003850    7.425e-05
Louise Belcher     4.456e-01      1.000e+00    0.3353000    0.002142    1.247e-06
Tina Belcher       5.593e-01      3.353e-01    1.0000000    0.002142    4.277e-04
Bob Belcher        3.850e-03      2.142e-03    0.0021420    1.000000    5.593e-01
Gene Belcher       7.425e-05      1.247e-06    0.0004277    0.559300    1.000e+00

library(multcompView)