R Handbook: Aligned Ranks Transformation ANOVA

## Summary and Analysis of Extension Program Evaluation in R

Salvatore S. Mangiafico

# Aligned Ranks Transformation ANOVA

### Introduction

Aligned ranks transformation ANOVA (ART anova) is a nonparametric approach that allows for multiple independent variables, interactions, and repeated measures.

My understanding is that, since the aligning process requires subtracting values, the dependent variable needs to be interval in nature.  That is, strictly ordinal data would be treated as numeric in the process.

The package ARTool makes using this approach in R relatively easy.

A few notes on using ARTool:

•  All independent variables must be nominal

•  All interactions of fixed independent variables need to be included in the model

•  Post-hoc comparisons can be conducted

•  For fixed-effects models, eta-squared can be calculated as an effect size

### Packages used in this chapter

The packages used in this chapter include:

•  ARTool

•  emmeans

•  multcomp

•  rcompanion

•  ggplot2

•  psych

The following commands will install these packages if they are not already installed:

if(!require(ARTool)){install.packages("ARTool")}
if(!require(emmeans)){install.packages("emmeans")}
if(!require(multcomp)){install.packages("multcomp")}
if(!require(rcompanion)){install.packages("rcompanion")}
if(!require(ggplot2)){install.packages("ggplot2")}
if(!require(psych)){install.packages("psych")}

### Aligned Ranks Transformation ANOVA examples

#### Midichlorians example

This example reproduces the data used in the Scheirer–Ray–Hare Test chapter.  Note that the aligned ranks anova finds a significant interaction, whereas the Scheirer–Ray–Hare test failed to detect it.  Also note that the results are similar to those from a standard anova in the Estimated Marginal Means for Multiple Comparisons chapter.

The code for producing the plot is found at the end of the chapter.

### Assemble the data

Location = as.factor(c(rep("Olympia" , 6), rep("Ventura", 6),
rep("Northampton", 6), rep("Burlington", 6)))

Tribe  = as.factor(c(rep(c("Jedi", "Sith"), 12)))

Midichlorians = c(10,  4, 12,  5, 15,  4, 15,  9, 15, 11, 18, 12,
8, 13,  8, 15, 10, 17, 22, 22, 20, 22, 20, 25)

Data = data.frame(Tribe, Location, Midichlorians)

str(Data)

### Aligned ranks anova

library(ARTool)

model = art(Midichlorians ~ Tribe + Location + Tribe:Location,
data = Data)

### Check the success of the procedure

model

Aligned Rank Transform of Factorial Model

Call:
art(formula = Midichlorians ~ Tribe + Location + Tribe:Location,
data = Data)

Column sums of aligned responses (should all be ~0):
Tribe       Location Tribe:Location
0              0              0
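
The column sums reported above follow from how each response is aligned. As a rough sketch of the idea (not ARTool's internal code), the two-way alignment described by Wobbrock et al. (2011) can be written in base R: strip the cell mean from each observation, add back the estimate of the single effect of interest, then rank. The object names below (mean.TL, Aligned, Ranked) are illustrative only, and ARTool may handle rounding and ties somewhat differently.

```r
### Rebuild the midichlorians data so this block runs on its own
Location = as.factor(c(rep("Olympia", 6), rep("Ventura", 6),
                       rep("Northampton", 6), rep("Burlington", 6)))
Tribe = as.factor(c(rep(c("Jedi", "Sith"), 12)))
Midichlorians = c(10,  4, 12,  5, 15,  4, 15,  9, 15, 11, 18, 12,
                   8, 13,  8, 15, 10, 17, 22, 22, 20, 22, 20, 25)

### Means needed for aligning, one value per observation
grand   = mean(Midichlorians)
mean.T  = ave(Midichlorians, Tribe)
mean.L  = ave(Midichlorians, Location)
mean.TL = ave(Midichlorians, Tribe, Location)     ### Cell means

### Remove the cell mean, then add back only the effect of interest
resid.cell = Midichlorians - mean.TL
Aligned = data.frame(
  Tribe       = resid.cell + (mean.T - grand),
  Location    = resid.cell + (mean.L - grand),
  Interaction = resid.cell + (mean.TL - mean.T - mean.L + grand))

### Column sums should all be approximately zero
round(colSums(Aligned), 10)

### Rank each aligned response, using average ranks for ties
Ranked = as.data.frame(lapply(Aligned, rank))

### Each effect is tested with an ordinary anova on its own ranks,
###   reading only the row for that effect
anova(lm(Ranked$Interaction ~ Tribe * Location))
```

Because each aligned column keeps only one effect plus residual, only the matching row of each anova table is meaningful.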

### Conduct ANOVA

anova(model)

Analysis of Variance of Aligned Rank Transformed Data

Table Type: Anova Table (Type III tests)
Model: No Repeated Measures (lm)
Response: art(Midichlorians)

Df Df.res F value     Pr(>F)
1 Tribe           1     16  3.0606   0.099364   .
2 Location        3     16 34.6201 3.1598e-07 ***
3 Tribe:Location  3     16 29.9354 8.4929e-07 ***

##### Post-hoc comparisons for main effects

marginal = art.con(model, "Location")

marginal

contrast                 estimate   SE df t.ratio p.value
Burlington - Northampton    10.83 1.78 16  6.075  0.0001
Burlington - Olympia        17.83 1.78 16 10.000  <.0001
Burlington - Ventura         7.33 1.78 16  4.112  0.0041
Northampton - Olympia        7.00 1.78 16  3.925  0.0060
Northampton - Ventura       -3.50 1.78 16 -1.963  0.2426
Olympia - Ventura          -10.50 1.78 16 -5.888  0.0001

Results are averaged over the levels of: Tribe
P value adjustment: tukey method for comparing a family of 4 estimates

Sum = as.data.frame(marginal)

library(rcompanion)

cldList(p.value ~ contrast, data=Sum)

Group Letter MonoLetter
1  Burlington      a        a
2 Northampton      b         b
3     Olympia      c          c
4     Ventura      b         b

##### Post-hoc comparisons for interactions in a two-way model

Estimate values in the emmeans output should be ignored.

marginal = art.con(model, "Tribe:Location", adjust="none")

marginal

### Here, results truncated to comparisons within each Location

contrast                            estimate   SE df t.ratio p.value
Jedi,Burlington - Sith,Burlington      -2.33 1.71 16  -1.365  0.1913
Jedi,Northampton - Sith,Northampton    -9.00 1.71 16  -5.264  0.0001
Jedi,Olympia - Sith,Olympia             8.83 1.71 16   5.166  0.0001
Jedi,Ventura - Sith,Ventura             7.17 1.71 16   4.191  0.0007
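
The p-values in this truncated table are consistent with unadjusted t tests. If an adjustment for just these four within-Location comparisons is wanted, one option is to recompute the p-values from the t ratios and pass them to base R's p.adjust. This is a sketch of one reasonable choice (the Holm method), not a step taken from the ARTool documentation.

```r
### t ratios and residual df copied from the table above
t.ratio = c(Burlington = -1.365, Northampton = -5.264,
            Olympia    =  5.166, Ventura     =  4.191)
df.res  = 16

p.raw  = 2 * pt(-abs(t.ratio), df = df.res)     ### Two-sided, unadjusted
p.holm = p.adjust(p.raw, method = "holm")       ### Adjusted for 4 tests

round(data.frame(t.ratio, p.raw, p.holm), 4)
```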

##### Partial eta-squared

Partial eta-squared can be calculated as an effect size statistic for aligned ranks transformation anova.

###### Interpretation of eta-squared

Interpretation of effect sizes necessarily varies by discipline and the expectations of the experiment, but for behavioral studies, the guidelines proposed by Cohen (1988) are sometimes followed.  They should not be considered universal.

Statistic      Small           Medium          Large
eta-squared    0.01 – < 0.06   0.06 – < 0.14   ≥ 0.14

____________________________

Source: Cohen (1988).
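
If these cutoffs are to be applied to several effects at once, a small helper can do the labeling. The function name interpretEtaSq and the label "below small" are hypothetical conveniences for this sketch, not part of any package; the breakpoints are those in the table above.

```r
### Hypothetical helper: label eta-squared values by Cohen's cutoffs
interpretEtaSq = function(eta.sq) {
  cut(eta.sq,
      breaks = c(-Inf, 0.01, 0.06, 0.14, Inf),
      labels = c("below small", "small", "medium", "large"),
      right  = FALSE)     ### Intervals are closed on the left: 0.06 is medium
}

interpretEtaSq(c(0.005, 0.01, 0.06, 0.14, 0.16057))
```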

Result = anova(model)

Result\$part.eta.sq = with(Result, `Sum Sq`/(`Sum Sq` + `Sum Sq.res`))

Result

Analysis of Variance of Aligned Rank Transformed Data

Df Df.res F value     Pr(>F) part.eta.sq.
1 Tribe           1     16  3.0606   0.099364      0.16057   .
2 Location        3     16 34.6201 3.1598e-07      0.86651 ***
3 Tribe:Location  3     16 29.9354 8.4929e-07      0.84878 ***

Alternatively, partial eta-squared can be calculated from the F value and degrees of freedom.

Result = anova(model)

Result\$part.eta.sq = with(Result, `F value` * `Df` / (`F value` * `Df` + `Df.res`))

Result

Analysis of Variance of Aligned Rank Transformed Data

Df Df.res F value     Pr(>F) part.eta.sq.
1 Tribe           1     16  3.0606   0.099364      0.16057   .
2 Location        3     16 34.6201 3.1598e-07      0.86651 ***
3 Tribe:Location  3     16 29.9354 8.4929e-07      0.84878 ***
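
The two formulas agree because F is the ratio of the effect mean square to the residual mean square, so the effect sum of squares divided by the residual sum of squares equals F × Df / Df.res. As a quick arithmetic check, the sketch below plugs the F values and degrees of freedom from the table into that formula; it should reproduce the part.eta.sq column to the precision shown.

```r
### F values and degrees of freedom copied from the anova table above
F.value = c(Tribe = 3.0606, Location = 34.6201, "Tribe:Location" = 29.9354)
Df      = c(1, 3, 3)
Df.res  = 16

part.eta.sq = F.value * Df / (F.value * Df + Df.res)

round(part.eta.sq, 5)
```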

##### Efron’s pseudo r-squared, RMSE

Efron’s pseudo r-squared is based on the actual values of the dependent variable and the values predicted by the model.

At the time of writing, the ARTool object doesn’t contain the actual values of the dependent variable, so these values have to be called from the original data.

Also, at the time of writing, the residuals in the ARTool object don’t reflect the random effects in the model.  The effect is that Efron’s pseudo r-squared for an ARTool object will be equal to the r-squared for a similar linear model ignoring any random effects in the ARTool model.

library(rcompanion)

efronRSquared(actual   = Data\$Midichlorians,
residual = model\$residuals)

EfronRSquared
0.948

model.lm = lm(Midichlorians ~ Tribe + Location + Tribe:Location,
data = Data)

summary(model.lm)\$r.squared

 0.9484945

The efronRSquared function can produce other useful statistics, like mean absolute percent error (as a fraction), root mean square error, and coefficient of variation as a percentage.

library(rcompanion)

efronRSquared(actual     = Data\$Midichlorians,
residual   = model\$residuals,
statistic = "MAPE")

efronRSquared(actual     = Data\$Midichlorians,
residual   = model\$residuals,
statistic = "RMSE")

efronRSquared(actual     = Data\$Midichlorians,
residual   = model\$residuals,
statistic = "CV")

MAPE
0.0908

RMSE
1.34

CV
9.71
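
These statistics can be checked by hand. The sketch below assumes the usual definitions: Efron's pseudo r-squared as one minus the residual sum of squares over the total sum of squares, MAPE as the mean of |residual / actual|, RMSE as the square root of the mean squared residual, and CV as RMSE divided by the mean of the actual values, times 100. These definitions are my reading of the statistics rather than something taken from the rcompanion source. The residuals come from an ordinary linear model, which, as noted above, gives the same pseudo r-squared as the ARTool residuals.

```r
### Rebuild the midichlorians data so this block runs on its own
Location = factor(rep(c("Olympia", "Ventura", "Northampton", "Burlington"),
                      each = 6))
Tribe    = factor(rep(c("Jedi", "Sith"), 12))
Midichlorians = c(10,  4, 12,  5, 15,  4, 15,  9, 15, 11, 18, 12,
                   8, 13,  8, 15, 10, 17, 22, 22, 20, 22, 20, 25)

model.lm = lm(Midichlorians ~ Tribe * Location)

res    = residuals(model.lm)
actual = Midichlorians

efron = 1 - sum(res^2) / sum((actual - mean(actual))^2)
mape  = mean(abs(res / actual))          ### As a fraction, not a percentage
rmse  = sqrt(mean(res^2))
cv    = 100 * rmse / mean(actual)        ### As a percentage

signif(c(Efron = efron, MAPE = mape, RMSE = rmse, CV = cv), 3)
```

Under these definitions the results should agree with the values reported above.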

#### One-way example

The following example addresses the data from the Kruskal–Wallis Test chapter. The results are relatively similar to results from the Kruskal–Wallis and Dunn tests, and to those from ordinal regression.  Here, the p-value for the global test by ART anova is lower than that from the Kruskal–Wallis test.

Data = read.table(header=TRUE, stringsAsFactors=TRUE, text="
Speaker  Likert
Pooh      3
Pooh      5
Pooh      4
Pooh      4
Pooh      4
Pooh      4
Pooh      4
Pooh      4
Pooh      5
Pooh      5
Piglet    2
Piglet    4
Piglet    2
Piglet    2
Piglet    1
Piglet    2
Piglet    3
Piglet    2
Piglet    2
Piglet    3
Tigger    4
Tigger    4
Tigger    4
Tigger    4
Tigger    5
Tigger    3
Tigger    5
Tigger    4
Tigger    4
Tigger    3
")

### Order levels of the factor; otherwise R will alphabetize them

Data\$Speaker = factor(Data\$Speaker,
levels=unique(Data\$Speaker))

###  Check the data frame

library(psych)

str(Data)

summary(Data)

### Aligned ranks anova

library(ARTool)

model = art(Likert ~ Speaker,
data = Data)

anova(model)

Analysis of Variance of Aligned Rank Transformed Data

Table Type: Anova Table (Type III tests)
Model: No Repeated Measures (lm)
Response: art(Likert)

Df Df.res F value     Pr(>F)
1 Speaker  2     27  18.702 8.0005e-06 ***
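
With a single factor, aligning only shifts every observation by the same constant (the grand mean), so the aligned ranks are simply the ranks of the raw responses. As a cross-check under that reasoning, an ordinary anova on the ranked Likert scores in base R should give the same F value as the ART anova above. The object names Rank and rank.table are illustrative.

```r
### Rebuild the data so this block runs on its own
Speaker = factor(rep(c("Pooh", "Piglet", "Tigger"), each = 10),
                 levels = c("Pooh", "Piglet", "Tigger"))
Likert  = c(3, 5, 4, 4, 4, 4, 4, 4, 5, 5,
            2, 4, 2, 2, 1, 2, 3, 2, 2, 3,
            4, 4, 4, 4, 5, 3, 5, 4, 4, 3)

Rank = rank(Likert)                 ### Average ranks for ties

rank.table = anova(lm(Rank ~ Speaker))

rank.table

### Mean rank for each speaker
tapply(Rank, Speaker, mean)
```

Differences between these mean ranks are what the estimate column of the post-hoc output reports.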

### Post-hoc comparisons

model.lm = artlm(model, "Speaker")

library(emmeans)

marginal = emmeans(model.lm,
~ Speaker)

pairs(marginal, adjust = "tukey")

contrast        estimate   SE df t.ratio p.value
Pooh - Piglet       14.1 2.51 27  5.619  <.0001
Pooh - Tigger        1.8 2.51 27  0.717  0.7555
Piglet - Tigger    -12.3 2.51 27 -4.901  0.0001

P value adjustment: tukey method for comparing a family of 3 estimates

### cldList needs the pairwise contrasts, not the marginal means

Sum = as.data.frame(pairs(marginal, adjust = "tukey"))

library(rcompanion)

cldList(p.value ~ contrast, data=Sum)

Group Letter MonoLetter
1   Pooh      a         a
2 Piglet      b          b
3 Tigger      a         a

##### Partial eta-squared

Result = anova(model)

Result\$part.eta.sq = with(Result, `Sum Sq`/(`Sum Sq` + `Sum Sq.res`))

Result

Analysis of Variance of Aligned Rank Transformed Data

Df Df.res F value     Pr(>F) part.eta.sq
1 Speaker  2     27  18.702 8.0005e-06     0.58077 ***

##### Efron’s pseudo r-squared

library(rcompanion)

efronRSquared(actual   = Data\$Likert,
residual = model\$residuals)

EfronRSquared

0.614

#### Repeated measures example

The following example addresses the data from the Friedman Test chapter. Results are relatively similar to results from the Friedman and Conover tests, and to those from ordinal regression.  Here, the p-value for the global test by ART anova is lower than that from the Friedman test.

Data = read.table(header=TRUE, stringsAsFactors=TRUE, text="
Instructor        Rater  Likert
'Bob Belcher'        a      4
'Bob Belcher'        b      5
'Bob Belcher'        c      4
'Bob Belcher'        d      6
'Bob Belcher'        e      6
'Bob Belcher'        f      6
'Bob Belcher'        g     10
'Bob Belcher'        h      6
'Linda Belcher'      a      8
'Linda Belcher'      b      6
'Linda Belcher'      c      8
'Linda Belcher'      d      8
'Linda Belcher'      e      8
'Linda Belcher'      f      7
'Linda Belcher'      g     10
'Linda Belcher'      h      9
'Tina Belcher'       a      7
'Tina Belcher'       b      5
'Tina Belcher'       c      7
'Tina Belcher'       d      8
'Tina Belcher'       e      8
'Tina Belcher'       f      9
'Tina Belcher'       g     10
'Tina Belcher'       h      9
'Gene Belcher'       a      6
'Gene Belcher'       b      4
'Gene Belcher'       c      5
'Gene Belcher'       d      5
'Gene Belcher'       e      6
'Gene Belcher'       f      6
'Gene Belcher'       g      5
'Gene Belcher'       h      5
'Louise Belcher'     a      8
'Louise Belcher'     b      7
'Louise Belcher'     c      8
'Louise Belcher'     d      8
'Louise Belcher'     e      9
'Louise Belcher'     f      9
'Louise Belcher'     g      8
'Louise Belcher'     h     10
")

### Order levels of the factor; otherwise R will alphabetize them

Data\$Instructor = factor(Data\$Instructor,
levels=unique(Data\$Instructor))

###  Check the data frame

library(psych)

str(Data)

summary(Data)

### Aligned ranks anova

library(ARTool)

model = art(Likert ~ Instructor + (1|Rater),
data = Data)

anova(model)

Analysis of Variance of Aligned Rank Transformed Data

Table Type: Analysis of Deviance Table (Type III Wald F tests with Kenward-Roger df)
Model: Mixed Effects (lmer)
Response: art(Likert)

F Df Df.res     Pr(>F)
1 Instructor 16.052  4     28 6.0942e-07 ***
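
For this balanced design, in which every rater scores every instructor, the same F test can be cross-checked in base R by ranking the responses and treating Rater as a blocking factor in an ordinary linear model. This is only a sanity check under the balanced-data assumption; ARTool itself fits the mixed model with lmer, and the two approaches need not agree for unbalanced data. The object names Rank and block.table are illustrative.

```r
### Rebuild the data so this block runs on its own
Instructor = factor(rep(c("Bob Belcher", "Linda Belcher", "Tina Belcher",
                          "Gene Belcher", "Louise Belcher"), each = 8))
Rater  = factor(rep(letters[1:8], 5))
Likert = c(4, 5, 4, 6, 6, 6, 10,  6,
           8, 6, 8, 8, 8, 7, 10,  9,
           7, 5, 7, 8, 8, 9, 10,  9,
           6, 4, 5, 5, 6, 6,  5,  5,
           8, 7, 8, 8, 9, 9,  8, 10)

Rank = rank(Likert)     ### With one fixed factor, aligned ranks equal raw ranks

### Rater entered as a block; read only the Instructor row
block.table = anova(lm(Rank ~ Rater + Instructor))

block.table["Instructor", ]
```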

### Post-hoc comparisons

marginal = art.con(model, "Instructor")

marginal

contrast                       estimate   SE df t.ratio p.value
Bob Belcher - Linda Belcher     -13.562 3.23 28  -4.202  0.0021
Bob Belcher - Tina Belcher      -12.750 3.23 28  -3.950  0.0040
Bob Belcher - Gene Belcher        4.312 3.23 28   1.336  0.6717
Bob Belcher - Louise Belcher    -16.125 3.23 28  -4.996  0.0003
Linda Belcher - Tina Belcher      0.812 3.23 28   0.252  0.9991
Linda Belcher - Gene Belcher     17.875 3.23 28   5.538  0.0001
Linda Belcher - Louise Belcher   -2.562 3.23 28  -0.794  0.9302
Tina Belcher - Gene Belcher      17.062 3.23 28   5.287  0.0001
Tina Belcher - Louise Belcher    -3.375 3.23 28  -1.046  0.8319
Gene Belcher - Louise Belcher   -20.438 3.23 28  -6.332  <.0001

Degrees-of-freedom method: kenward-roger

P value adjustment: tukey method for comparing a family of 5 estimates

Sum = as.data.frame(marginal)

library(rcompanion)

cldList(p.value ~ contrast, data=Sum)

Group Letter MonoLetter
1    BobBelcher      a         a
2  LindaBelcher      b          b
3   TinaBelcher      b          b
4   GeneBelcher      a         a
5 LouiseBelcher      b          b

##### Partial eta-squared

For mixed effects models, the partial eta-squared can be calculated from the F values and the degrees of freedom.

Result = anova(model)

Result\$part.eta.sq = with(Result, `F` * `Df` / (`F` * `Df` + `Df.res`))

Result

Analysis of Variance of Aligned Rank Transformed Data

Table Type: Analysis of Deviance Table (Type III Wald F tests with Kenward-Roger df)
Model: Mixed Effects (lmer)
Response: art(Likert)

F Df Df.res     Pr(>F) part.eta.sq
1 Instructor 16.052  4     28 6.0942e-07     0.69633 ***

##### Efron’s pseudo r-squared

At the time of writing, the residuals in the ARTool object don’t reflect the random effects in the model.  The effect is that Efron’s pseudo r-squared for an ARTool object will be equal to the r-squared for a similar linear model ignoring any random effects in the ARTool model.

library(rcompanion)

efronRSquared(actual   = Data\$Likert,
residual = model\$residuals)

EfronRSquared
0.51

model.lm = lm(Likert ~ Instructor,
data = Data)

summary(model.lm)\$r.squared

 0.5101182

### Optional:  Plot of medians and confidence intervals for midichlorians data

library(rcompanion)

Sum = groupwiseMedian(Midichlorians ~ Tribe + Location,
data=Data,
bca=FALSE, percentile=TRUE)

Sum

Tribe    Location n Median Conf.level Percentile.lower Percentile.upper
1  Jedi  Burlington 3     20       0.95               20               22
2  Jedi Northampton 3      8       0.95                8               10
3  Jedi     Olympia 3     12       0.95               10               15
4  Jedi     Ventura 3     15       0.95               15               18
5  Sith  Burlington 3     22       0.95               22               25
6  Sith Northampton 3     15       0.95               13               17
7  Sith     Olympia 3      4       0.95                4                5
8  Sith     Ventura 3     11       0.95                9               12
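
The percentile intervals above come from bootstrapping the median within each group. As a rough illustration of the idea in base R (groupwiseMedian has its own implementation and options, which this does not attempt to reproduce), the sketch below resamples the three Jedi observations from Burlington and takes the 2.5th and 97.5th percentiles of the resampled medians. The seed and the number of resamples are arbitrary choices.

```r
set.seed(1234)                       ### Arbitrary seed, for reproducibility

x = c(22, 20, 20)                    ### Jedi, Burlington

### Resample with replacement and take the median each time
boot.medians = replicate(5000, median(sample(x, replace = TRUE)))

median(x)

quantile(boot.medians, probs = c(0.025, 0.975))
```

With only three observations per group, the resampled medians can take very few distinct values, so intervals of this kind are coarse.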

### Order the levels for printing

Sum\$Location = factor(Sum\$Location,
levels=c("Olympia", "Ventura", "Northampton", "Burlington"))

Sum\$Tribe = factor(Sum\$Tribe,
levels=c("Jedi", "Sith"))

### Plot

library(ggplot2)

pd = position_dodge(0.4)    ### How far to dodge (offset) the points for each Tribe

png(filename = "Rplot01.png",
width  = 5,
height = 5,
units  = "in",
res    = 300)

ggplot(Sum,
aes(x     = Location,
y     = Median,
color = Tribe)) +

geom_point(shape  = 15,
size   = 4,
position = pd) +

geom_errorbar(aes(ymin  =  Percentile.lower,
ymax  =  Percentile.upper),
width =  0.2,
size  =  0.7,
position = pd) +

theme_bw() +
theme(axis.title   = element_text(face = "bold"),
axis.text    = element_text(face = "bold"),
plot.caption = element_text(hjust = 0)) +

ylab("Median midichlorian count") +
ggtitle ("Midichlorian counts for Jedi and Sith",
subtitle = "In four U.S. cities") +

labs(caption  = paste0("\nMidichlorian counts for two tribes across ",
"four locations. Boxes indicate \n",
"the median. ",
"Error bars indicate the 95% confidence ",
"interval ",
"of the median."),
hjust=0.5) +

scale_color_manual(values = c("blue", "red"))

dev.off()

### References

Cohen, J. 1988. Statistical Power Analysis for the Behavioral Sciences, 2nd Edition. Routledge.

Elkin, L.A., Kay, M., Higgins, J. and Wobbrock, J.O. (2021). An aligned rank transform procedure for multifactor contrast tests. Proceedings of the ACM Symposium on User Interface Software and Technology (UIST '21). New York: ACM Press, pp. 754–768. [DOI]. faculty.washington.edu/wobbrock/pubs/uist-21.pdf.

Kay, M. 2019. Contrast tests with ART. cran.r-project.org/web/packages/ARTool/vignettes/art-contrasts.html.

Kay, M. 2019. Effect Sizes with ART. cran.r-project.org/web/packages/ARTool/vignettes/art-effect-size.html.

Kay, M. 2019. Package ‘ARTool’. cran.r-project.org/web/packages/ARTool/ARTool.pdf.

Wobbrock, J. O., Findlater, L., Gergle, D., & Higgins, J. J. (2011). The aligned rank transform for nonparametric factorial analyses using only anova procedures. In Conference on Human Factors in Computing Systems (pp. 143–146). faculty.washington.edu/wobbrock/pubs/chi-11.06.pdf.

Wobbrock, J. O., Findlater, L., Gergle, D., Higgins, J. J., & Kay, M. (2018). ARTool: Align-and-rank data for a nonparametric ANOVA. University of Washington. Retrieved from depts.washington.edu/madlab/proj/art/index.html.