[banner]

Summary and Analysis of Extension Program Evaluation in R

Salvatore S. Mangiafico

 

Advertisement

Aligned Ranks Transformation ANOVA

 

Advertisement

Introduction

 

Aligned ranks transformation ANOVA (ART anova) is a nonparametric approach that allows for multiple independent variables, interactions, and repeated measures. 

 

My understanding is that, since the aligning process requires subtracting values, the dependent variable needs to be interval in nature.  That is, strictly ordinal data would be treated as numeric in the process.

 

The package ARTool makes using this approach in R relatively easy.

 

A few notes on using ARTool:

 

•  All independent variables must be nominal

 

•  All interactions of fixed independent variables need to be included in the model

 

•  Post-hoc comparisons can be conducted for main effects

 

•  For interactions effects, post-hoc comparisons can be conducted for two-way models.  For more complex models, difference-in-difference post-hoc comparisons can be conducted.

 

•  For fixed-effects models, eta-squared can be calculated as an effect size

 

Packages used in this chapter

 

The packages used in this chapter include:

•  ARTool

•  emmeans

•  rcompanion

•  ggplot2

•  psych

 

The following commands will install these packages if they are not already installed:

 

if(!require(ARTool)){install.packages("ARTool")}
if(!require(emmeans)){install.packages("emmeans")}
if(!require(rcompanion)){install.packages("rcompanion ")}
if(!require(ggplot2)){install.packages("ggplot2")}
if(!require(psych)){install.packages("psych")}



Aligned Ranks Transformation ANOVA examples

 

Midichlorians example

This example reproduces the data used in the Scheirer–Ray–Hare Test chapter.  Note that the aligned ranks anova finds a significant interaction, where the Scheirer–Ray–Hare test failed to detect this.  Also note that the results are similar to those from a standard anova in the Estimated Marginal Means for Multiple Comparisons.

 

Note that code for producing the plot is found at the end of the chapter.

 




### Assemble the data

Location = c(rep("Olympia" , 6), rep("Ventura", 6),
             rep("Northampton", 6), rep("Burlington", 6))

Tribe  = c(rep(c("Jedi", "Sith"), 12))

Midichlorians = c(10,  4, 12,  5, 15,  4, 15,  9, 15, 11, 18, 12,
                   8, 13,  8, 15, 10, 17, 22, 22, 20, 22, 20, 25)

Data = data.frame(Tribe, Location, Midichlorians)

str(Data)


### Aligned ranks anova

library(ARTool)

model = art(Midichlorians ~ Tribe + Location + Tribe:Location,
                data = Data)

### Check the success of the procedure

model


Aligned Rank Transform of Factorial Model

Call:
art(formula = Midichlorians ~ Tribe + Location + Tribe:Location,
    data = Data)

Column sums of aligned responses (should all be ~0):
         Tribe       Location Tribe:Location
             0              0              0


### Conduct ANOVA

anova(model)


Analysis of Variance of Aligned Rank Transformed Data

Table Type: Anova Table (Type III tests)
Model: No Repeated Measures (lm)
Response: art(Midichlorians)

                 Df Df.res F value     Pr(>F)   
1 Tribe           1     16  3.0606   0.099364   .
2 Location        3     16 34.6201 3.1598e-07 ***
3 Tribe:Location  3     16 29.9354 8.4929e-07 ***


Post-hoc comparisons for main effects

It is my understanding that post-hoc comparisons for main effects can be handled by the emmeans package in the usual way, except that the artlm function must be used to first fit a model that can be passed to emmeans.

 

Estimate values in the emmeans output should be ignored.


model.lm = artlm(model, "Location")

library(emmeans)

marginal = emmeans(model.lm,
                   ~ Location)

pairs(marginal,
      adjust = "tukey")


contrast                 estimate   SE df t.ratio p.value
 Burlington - Northampton    10.83 1.78 16  6.075  0.0001
 Burlington - Olympia        17.83 1.78 16 10.000  <.0001
 Burlington - Ventura         7.33 1.78 16  4.112  0.0041
 Northampton - Olympia        7.00 1.78 16  3.925  0.0060
 Northampton - Ventura       -3.50 1.78 16 -1.963  0.2426
 Olympia - Ventura          -10.50 1.78 16 -5.888  0.0001

Results are averaged over the levels of: Tribe
P value adjustment: tukey method for comparing a family of 4 estimates


cld(marginal,
    alpha=0.05,
    Letters=letters,
    adjust="tukey")


Location    emmean   SE df lower.CL upper.CL .group
 Olympia       3.67 1.26 16    0.131      7.2  a   
 Northampton  10.67 1.26 16    7.131     14.2   b  
 Ventura      14.17 1.26 16   10.631     17.7   b  
 Burlington   21.50 1.26 16   17.964     25.0    c 

Results are averaged over the levels of: Tribe
Confidence level used: 0.95
Conf-level adjustment: sidak method for 4 estimates
P value adjustment: tukey method for comparing a family of 4 estimates
significance level used: alpha = 0.05


Post-hoc comparisons for interactions in a two-way model

It is my understanding that for a two-way interaction, the following code can be used for post-hoc comparisons, but that this approach may not be generalizable to other designs.  Note that the cld function cannot be used here.

 

Estimate values in the emmeans output should be ignored.


model.int = artlm(model, "Tribe:Location")

marginal = emmeans(model.int, ~ Tribe:Location)

contrast(marginal, method="pairwise", adjust="none")

   ### For Tukey-adjusted p-values, use adjust="tukey"



### Here, results truncated to comparisons within each Location


contrast                            estimate   SE df t.ratio p.value
 Jedi,Burlington - Sith,Burlington     -6.500 2.68 16 -2.428  0.0273
 Jedi,Northampton - Sith,Northampton  -16.833 2.68 16 -6.288  <.0001
 Jedi,Olympia - Sith,Olympia           15.000 2.68 16  5.603  <.0001
 Jedi,Ventura - Sith,Ventura            9.667 2.68 16  3.611  0.0023



Post-hoc comparisons for other interactions

It is my understanding that for interactions in general, the following code for difference-in-difference post-hoc comparisons can be used.  These sorts of comparisons may not be as useful as other types of comparisons.

 

Estimate values in the emmeans output should be ignored.


model.diff = artlm(model, "Tribe:Location")

marginal = emmeans(model.diff, ~Tribe:Location)

contrast(marginal, method="pairwise", interaction=TRUE)


 Tribe_pairwise Location_pairwise        estimate   SE df t.ratio p.value
 Jedi - Sith    Burlington - Northampton    10.33 3.79 16  2.729  0.0148
 Jedi - Sith    Burlington - Olympia       -21.50 3.79 16 -5.679  <.0001
 Jedi - Sith    Burlington - Ventura       -16.17 3.79 16 -4.270  0.0006
 Jedi - Sith    Northampton - Olympia      -31.83 3.79 16 -8.408  <.0001
 Jedi - Sith    Northampton - Ventura      -26.50 3.79 16 -7.000  <.0001
 Jedi - Sith    Olympia - Ventura            5.33 3.79 16  1.409  0.1781


Here, the comparison with Olympia and Ventura doesn’t indicate that the values for the two cities are similar, but rather that the difference between Jedi and Sith within each of the two cities is similar.  This is consistent with the plot.

 

Partial eta-squared

Partial eta-squared can be calculated as an effect size statistic for aligned ranks transformation anova.

 

Interpretation of eta-squared

Interpretation of effect sizes necessarily varies by discipline and the expectations of the experiment, but for behavioral studies, the guidelines proposed by Cohen (1988) are sometimes followed.  They should not be considered universal.

 

 

Small

 

Medium

Large

eta-squared

0.01 – < 0.06

0.06 – < 0.14

≥ 0.14

____________________________

Source: Cohen (1988).

 

 


Result = anova(model)

Result$part.eta.sq = with(Result, `Sum Sq`/(`Sum Sq` + `Sum Sq.res`))

Result


Analysis of Variance of Aligned Rank Transformed Data

                 Df Df.res F value     Pr(>F) part.eta.sq.   
1 Tribe           1     16  3.0606   0.099364      0.16057   .
2 Location        3     16 34.6201 3.1598e-07      0.86651 ***
3 Tribe:Location  3     16 29.9354 8.4929e-07      0.84878 ***


One-way example

The following example addresses the data from the Kruskal–Wallis Test chapter. Results are relatively similar to results from the Kruskal–Wallis and Dunn tests, and to those from ordinal regression.  Here, the p-value for the global test by ART anova is lower than that from the Kruskal–Wallis test.

 

Input =("
 Speaker  Likert
 Pooh      3
 Pooh      5
 Pooh      4
 Pooh      4
 Pooh      4
 Pooh      4
 Pooh      4
 Pooh      4
 Pooh      5
 Pooh      5
 Piglet    2
 Piglet    4
 Piglet    2
 Piglet    2
 Piglet    1
 Piglet    2
 Piglet    3
 Piglet    2
 Piglet    2
 Piglet    3
 Tigger    4
 Tigger    4
 Tigger    4
 Tigger    4
 Tigger    5
 Tigger    3
 Tigger    5
 Tigger    4
 Tigger    4
 Tigger    3
")

Data = read.table(textConnection(Input),header=TRUE)


### Order levels of the factor; otherwise R will alphabetize them


Data$Speaker = factor(Data$Speaker,
                      levels=unique(Data$Speaker))


### Create a new variable which is the likert scores as an ordered factor


Data$Likert.f = factor(Data$Likert,
                       ordered = TRUE)


###  Check the data frame

library(psych)

headTail(Data)

str(Data)

summary(Data)


### Remove unnecessary objects

rm(Input)


### Aligned ranks anova

library(ARTool)

model = art(Likert ~ Speaker,
                data = Data)

anova(model)

 

Analysis of Variance of Aligned Rank Transformed Data

Table Type: Anova Table (Type III tests)
Model: No Repeated Measures (lm)
Response: art(Likert)

          Df Df.res F value     Pr(>F)   
1 Speaker  2     27  18.702 8.0005e-06 ***



### Post-hoc comparisons

model.lm = artlm(model, "Speaker")

library(emmeans)

marginal = emmeans(model.lm,
                   ~ Speaker)

pairs(marginal,
      adjust = "tukey")


contrast        estimate   SE df t.ratio p.value
 Pooh - Piglet       14.1 2.51 27  5.619  <.0001
 Pooh - Tigger        1.8 2.51 27  0.717  0.7555
 Piglet - Tigger    -12.3 2.51 27 -4.901  0.0001

P value adjustment: tukey method for comparing a family of 3 estimates


cld(marginal,
    alpha=0.05,
    Letters=letters,
    adjust="tukey")


Speaker emmean   SE df lower.CL upper.CL .group
 Piglet     6.7 1.77 27     2.18     11.2  a   
 Tigger    19.0 1.77 27    14.48     23.5   b  
 Pooh      20.8 1.77 27    16.28     25.3   b  

Confidence level used: 0.95
Conf-level adjustment: sidak method for 3 estimates
P value adjustment: tukey method for comparing a family of 3 estimates
significance level used: alpha = 0.05


 

### Partial eta squared

Result = anova(model)

Result$part.eta.sq = with(Result, `Sum Sq`/(`Sum Sq` + `Sum Sq.res`))

Result


Analysis of Variance of Aligned Rank Transformed Data

          Df Df.res F value     Pr(>F) part.eta.sq   
1 Speaker  2     27  18.702 8.0005e-06     0.58077 ***


Repeated measures example

The following example addresses the data from the Friedman Test chapter. Results are relatively similar to results from the Friedman and Conover tests, and to those from ordinal regression.  Here, the p-value for the global test by ART anova is lower than that from the Friedman test.

 

Input =("
 Instructor        Rater  Likert
 'Bob Belcher'        a      4
 'Bob Belcher'        b      5
 'Bob Belcher'        c      4
 'Bob Belcher'        d      6
 'Bob Belcher'        e      6
 'Bob Belcher'        f      6
 'Bob Belcher'        g     10
 'Bob Belcher'        h      6
 'Linda Belcher'      a      8
 'Linda Belcher'      b      6
 'Linda Belcher'      c      8
 'Linda Belcher'      d      8
 'Linda Belcher'      e      8
 'Linda Belcher'      f      7
 'Linda Belcher'      g     10
 'Linda Belcher'      h      9
 'Tina Belcher'       a      7
 'Tina Belcher'       b      5
 'Tina Belcher'       c      7
 'Tina Belcher'       d      8
 'Tina Belcher'       e      8
 'Tina Belcher'       f      9
 'Tina Belcher'       g     10
 'Tina Belcher'       h      9
 'Gene Belcher'       a      6
 'Gene Belcher'       b      4
 'Gene Belcher'       c      5
 'Gene Belcher'       d      5
 'Gene Belcher'       e      6
 'Gene Belcher'       f      6
 'Gene Belcher'       g      5
 'Gene Belcher'       h      5
 'Louise Belcher'     a      8
 'Louise Belcher'     b      7
 'Louise Belcher'     c      8
 'Louise Belcher'     d      8
 'Louise Belcher'     e      9
 'Louise Belcher'     f      9
 'Louise Belcher'     g      8
 'Louise Belcher'     h     10            
")

Data = read.table(textConnection(Input),header=TRUE)

### Order levels of the factor; otherwise R will alphabetize them

Data$Instructor = factor(Data$Instructor,
                      levels=unique(Data$Instructor))

### Create a new variable which is the likert scores as an ordered factor

Data$Likert.f = factor(Data$Likert,
                          ordered=TRUE)


###  Check the data frame


library(psych)

headTail(Data)

str(Data)

summary(Data)


### Remove unnecessary objects

rm(Input)

### Aligned ranks anova

library(ARTool)

model = art(Likert ~ Instructor + (1|Rater),
                data = Data)

anova(model)


Analysis of Variance of Aligned Rank Transformed Data

Table Type: Analysis of Deviance Table (Type III Wald F tests with Kenward-Roger df)
Model: Mixed Effects (lmer)
Response: art(Likert)

                  F Df Df.res     Pr(>F)   
1 Instructor 16.052  4     28 6.0942e-07 ***


### Post-hoc comparisons

model.lm = artlm(model, "Instructor")

library(emmeans)

marginal = emmeans(model.lm,
                   ~ Speaker)

pairs(marginal,
      adjust = "tukey")


contrast                       estimate   SE df t.ratio p.value
 Bob Belcher - Linda Belcher     -13.562 3.23 28 -4.202  0.0021
 Bob Belcher - Tina Belcher      -12.750 3.23 28 -3.950  0.0040
 Bob Belcher - Gene Belcher        4.312 3.23 28  1.336  0.6717
 Bob Belcher - Louise Belcher    -16.125 3.23 28 -4.996  0.0003
 Linda Belcher - Tina Belcher      0.812 3.23 28  0.252  0.9991
 Linda Belcher - Gene Belcher     17.875 3.23 28  5.538  0.0001
 Linda Belcher - Louise Belcher   -2.562 3.23 28 -0.794  0.9302
 Tina Belcher - Gene Belcher      17.062 3.23 28  5.287  0.0001
 Tina Belcher - Louise Belcher    -3.375 3.23 28 -1.046  0.8319
 Gene Belcher - Louise Belcher   -20.438 3.23 28 -6.332  <.0001

P value adjustment: tukey method for comparing a family of 5 estimates


cld(marginal,
    alpha=0.05,
    Letters=letters,
    adjust="tukey")


Instructor     emmean   SE   df lower.CL upper.CL .group
 Gene Belcher     8.56 2.98 20.7    0.135     17.0  a   
 Bob Belcher     12.88 2.98 20.7    4.447     21.3  a   
 Tina Belcher    25.62 2.98 20.7   17.197     34.1   b  
 Linda Belcher   26.44 2.98 20.7   18.010     34.9   b  
 Louise Belcher  29.00 2.98 20.7   20.572     37.4   b  

Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Conf-level adjustment: sidak method for 5 estimates
P value adjustment: tukey method for comparing a family of 5 estimates
significance level used: alpha = 0.05


Optional:  Plot of medians and confidence intervals for midichlorians data


library(rcompanion)

Sum = groupwiseMedian(Midichlorians ~ Tribe + Location,
                      data=Data,
                      bca=FALSE, percentile=TRUE)

Sum


  Tribe    Location n Median Conf.level Percentile.lower Percentile.upper
1  Jedi  Burlington 3     20       0.95               20               22
2  Jedi Northampton 3      8       0.95                8               10
3  Jedi     Olympia 3     12       0.95               10               15
4  Jedi     Ventura 3     15       0.95               15               18
5  Sith  Burlington 3     22       0.95               22               25
6  Sith Northampton 3     15       0.95               13               17
7  Sith     Olympia 3      4       0.95                4                5
8  Sith     Ventura 3     11       0.95                9               12


### Order the levels for printing

Sum$Location = factor(Sum$Location,
                       levels=c("Olympia", "Ventura", "Northampton", "Burlington"))

Sum$Tribe = factor(Sum$Tribe,
                       levels=c("Jedi", "Sith"))


### Plot

library(ggplot2)

pd = position_dodge(0.4)    ### How much to jitter the points on the plot

png(filename = "Rplot01.png",
    width  = 5,
    height = 5,
    units  = "in",
    res    = 300)

ggplot(Sum,
       aes(x     = Location,
           y     = Median,
           color = Tribe)) +

    geom_point(shape  = 15,
               size   = 4,
             position = pd) +

    geom_errorbar(aes(ymin  =  Percentile.lower,
                      ymax  =  Percentile.upper),
                      width =  0.2,
                      size  =  0.7,
                      position = pd) +

    theme_bw() +
    theme(axis.title   = element_text(face = "bold"),
          axis.text    = element_text(face = "bold"),
          plot.caption = element_text(hjust = 0)) +

    ylab("Median midichlorian count") +
     ggtitle ("Midichlorian counts for Jedi and Sith",
            subtitle = "In four U.S. cities") +
 
            labs(caption  = paste0("\nMidichlorian counts for two tribes across ",
                                   "four locations. Boxes indicate \n",
                                   "the median. ",
                                   "Error bars indicate the 95% confidence ",
                                   "interval ",
                                    "of the median."),
                            hjust=0.5) +
 
  scale_color_manual(values = c("blue", "red"))

dev.off()

 

References

 

Cohen, J. 1988. Statistical Power Analysis for the Behavioral Sciences, 2nd Edition. Routledge.

 

Kay, M. 2019. Contrast tests with ART. cran.r-project.org/web/packages/ARTool/vignettes/art-contrasts.html.

 

Kay, M. 2019. Effect Sizes with ART. https://cran.r-project.org/web/packages/ARTool/vignettes/art-effect-size.html.

 

Kay, M. 2019. Package ‘ARTool’. cran.r-project.org/web/packages/ARTool/ARTool.pdf.

 

Wobbrock, J. O., Findlater, L., Gergle, D., & Higgins, J. J. 2011. The aligned rank transform for nonparametric factorial analyses using only anova procedures. In Conference on Human Factors in Computing Systems (pp. 143–146). faculty.washington.edu/wobbrock/pubs/chi-11.06.pdf.

 

Wobbrock, J. O., Findlater, L., Gergle, D., Higgins, J. J., & Kay, M. 2018. ARTool: Align-and-rank data for a nonparametric ANOVA. University of Washington. depts.washington.edu/madlab/proj/art/index.html.