Introduction
Aligned ranks transformation ANOVA (ART anova) is a nonparametric approach that allows for multiple independent variables, interactions, and repeated measures.
My understanding is that, since the aligning process requires subtracting values, the dependent variable needs to be interval in nature. That is, strictly ordinal data would be treated as numeric in the process.
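To make the subtraction step concrete, the following is a minimal base-R sketch of aligning and ranking for the interaction effect in a two-way design, following the general description in Wobbrock and others (2011). It uses the midichlorians data from later in this chapter. The object names (cellMean, intEffect, alignedInt, and so on) are invented for illustration, and the internal computations in ARTool may differ in detail.

```r
### Midichlorians data, as in the example later in this chapter
Location = factor(c(rep("Olympia", 6), rep("Ventura", 6),
                    rep("Northampton", 6), rep("Burlington", 6)))
Tribe    = factor(rep(c("Jedi", "Sith"), 12))
Y        = c(10, 4, 12, 5, 15, 4, 15, 9, 15, 11, 18, 12,
              8, 13, 8, 15, 10, 17, 22, 22, 20, 22, 20, 25)

### Means needed for alignment
grandMean = mean(Y)
tribeMean = ave(Y, Tribe)             # mean for each Tribe level
locMean   = ave(Y, Location)          # mean for each Location level
cellMean  = ave(Y, Tribe, Location)   # mean for each Tribe x Location cell

### Step 1: residuals strip out all estimated effects
residual = Y - cellMean

### Step 2: add back only the estimated effect of interest
###   (here, the Tribe:Location interaction)
intEffect  = cellMean - tribeMean - locMean + grandMean
alignedInt = residual + intEffect

### Step 3: rank the aligned values, with ties receiving average ranks
rankedInt = rank(alignedInt, ties.method = "average")

### The aligned responses should sum to approximately zero
round(sum(alignedInt), 10)
```

The ranked values would then be analyzed with an ordinary factorial anova, keeping only the test for the effect that was aligned for. The means and subtractions in steps 1 and 2 are where the response is treated as numeric.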
The package ARTool makes using this approach in R relatively easy.
A few notes on using ARTool:
• All independent variables must be nominal
• All interactions of fixed independent variables need to be included in the model
• Post-hoc comparisons can be conducted
• For fixed-effects models, eta-squared can be calculated as an effect size
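As a small illustration of the requirement that all interactions be included, base R can list the terms implied by a formula. This sketch uses only terms() from base R. The three-factor formula with A, B, and C is a hypothetical placeholder, not a model from this chapter.

```r
### Expand a formula written with * to see every term it implies
f2 = Midichlorians ~ Tribe * Location
attr(terms(f2), "term.labels")

### A hypothetical three-factor model (Y, A, B, C are placeholder names)
f3 = Y ~ A * B * C
attr(terms(f3), "term.labels")
```

Writing the fixed effects with * rather than listing terms by hand is one way to avoid accidentally leaving out an interaction.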
Packages used in this chapter
The packages used in this chapter include:
• ARTool
• emmeans
• multcomp
• rcompanion
• ggplot2
• psych
The following commands will install these packages if they are not already installed:
if(!require(ARTool)){install.packages("ARTool")}
if(!require(emmeans)){install.packages("emmeans")}
if(!require(multcomp)){install.packages("multcomp")}
if(!require(rcompanion)){install.packages("rcompanion")}
if(!require(ggplot2)){install.packages("ggplot2")}
if(!require(psych)){install.packages("psych")}
Aligned Ranks Transformation ANOVA examples
Midichlorians example
This example reproduces the data used in the Scheirer–Ray–Hare Test chapter. Note that the aligned ranks anova finds a significant interaction, whereas the Scheirer–Ray–Hare test failed to detect it. Also note that the results are similar to those from a standard anova in the Estimated Marginal Means for Multiple Comparisons chapter.
The code for producing the plot is found at the end of the chapter.
### Assemble the data
Location = as.factor(c(rep("Olympia" , 6), rep("Ventura", 6),
rep("Northampton", 6), rep("Burlington", 6)))
Tribe = as.factor(c(rep(c("Jedi", "Sith"), 12)))
Midichlorians = c(10, 4, 12, 5, 15, 4, 15, 9, 15, 11, 18, 12,
8, 13, 8, 15, 10, 17, 22, 22, 20, 22, 20, 25)
Data = data.frame(Tribe, Location, Midichlorians)
str(Data)
### Aligned ranks anova
library(ARTool)
model = art(Midichlorians ~ Tribe + Location + Tribe:Location,
data = Data)
### Check the success of the procedure
model
Aligned Rank Transform of Factorial Model
Call:
art(formula = Midichlorians ~ Tribe + Location + Tribe:Location,
data = Data)
Column sums of aligned responses (should all be ~0):
Tribe Location Tribe:Location
0 0 0
### Conduct ANOVA
anova(model)
Analysis of Variance of Aligned Rank Transformed Data
Table Type: Anova Table (Type III tests)
Model: No Repeated Measures (lm)
Response: art(Midichlorians)
Df Df.res F value Pr(>F)
1 Tribe 1 16 3.0606 0.099364 .
2 Location 3 16 34.6201 3.1598e-07 ***
3 Tribe:Location 3 16 29.9354 8.4929e-07 ***
Post-hoc comparisons for main effects
marginal = art.con(model, "Location")
marginal
contrast estimate SE df t.ratio p.value
Burlington - Northampton 10.83 1.78 16 6.075 0.0001
Burlington - Olympia 17.83 1.78 16 10.000 <.0001
Burlington - Ventura 7.33 1.78 16 4.112 0.0041
Northampton - Olympia 7.00 1.78 16 3.925 0.0060
Northampton - Ventura -3.50 1.78 16 -1.963 0.2426
Olympia - Ventura -10.50 1.78 16 -5.888 0.0001
Results are averaged over the levels of: Tribe
P value adjustment: tukey method for comparing a family of 4 estimates
Sum = as.data.frame(marginal)
library(rcompanion)
cldList(p.value ~ contrast, data=Sum)
Group Letter MonoLetter
1 Burlington a a
2 Northampton b b
3 Olympia c c
4 Ventura b b
Post-hoc comparisons for interactions in a two-way model
Estimate values in the emmeans output should be ignored.
marginal = art.con(model, "Tribe:Location", adjust="none")
marginal
### For Tukey-adjusted p-values, use adjust="tukey"
### Here, results truncated to comparisons within each Location
contrast estimate SE df t.ratio p.value
Jedi,Burlington - Sith,Burlington -2.33 1.71 16 -1.365 0.1913
Jedi,Northampton - Sith,Northampton -9.00 1.71 16 -5.264 0.0001
Jedi,Olympia - Sith,Olympia 8.83 1.71 16 5.166 0.0001
Jedi,Ventura - Sith,Ventura 7.17 1.71 16 4.191 0.0007
Partial eta-squared
Partial eta-squared can be calculated as an effect size statistic for aligned ranks transformation anova.
Interpretation of eta-squared
Interpretation of effect sizes necessarily varies by discipline and the expectations of the experiment, but for behavioral studies, the guidelines proposed by Cohen (1988) are sometimes followed. They should not be considered universal.
              Small           Medium          Large
eta-squared   0.01 – < 0.06   0.06 – < 0.14   ≥ 0.14
____________________________
Source: Cohen (1988).
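If it is helpful to attach these labels to a set of results, the following sketch uses cut() from base R with the cutoffs in the table. The function name interpretEta is invented here, and the label "negligible" for values below 0.01 is an assumption, since the table does not name that range.

```r
### Label eta-squared values using the Cohen (1988) cutoffs in the table.
###   interpretEta is a hypothetical helper; "negligible" is an assumed label.
interpretEta = function(eta) {
  cut(eta,
      breaks = c(-Inf, 0.01, 0.06, 0.14, Inf),
      labels = c("negligible", "small", "medium", "large"),
      right  = FALSE)     # intervals are closed on the left: [0.01, 0.06), etc.
}

### Example with the partial eta-squared values from the midichlorians model
interpretEta(c(0.16057, 0.86651, 0.84878))
```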
Result = anova(model)
Result$part.eta.sq = with(Result, `Sum Sq`/(`Sum Sq` + `Sum Sq.res`))
Result
Analysis of Variance of Aligned Rank Transformed Data
Df Df.res F value Pr(>F) part.eta.sq
1 Tribe 1 16 3.0606 0.099364 0.16057 .
2 Location 3 16 34.6201 3.1598e-07 0.86651 ***
3 Tribe:Location 3 16 29.9354 8.4929e-07 0.84878 ***
Alternatively, partial eta-squared can be calculated from the F value and degrees of freedom.
Result = anova(model)
Result$part.eta.sq = with(Result, `F value` * `Df` / (`F value` * `Df` +
`Df.res`))
Result
Analysis of Variance of Aligned Rank Transformed Data
Df Df.res F value Pr(>F) part.eta.sq
1 Tribe 1 16 3.0606 0.099364 0.16057 .
2 Location 3 16 34.6201 3.1598e-07 0.86651 ***
3 Tribe:Location 3 16 29.9354 8.4929e-07 0.84878 ***
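The two calculations agree because F = (Sum Sq / Df) / (Sum Sq.res / Df.res), so that F * Df / (F * Df + Df.res) reduces to Sum Sq / (Sum Sq + Sum Sq.res). The following sketch checks this numerically using an ordinary lm() fit in base R. Note that it is fit to the untransformed midichlorians values, so the numbers it produces are not the ART results shown above. The names fromSS and fromF are invented for illustration.

```r
### Midichlorians data, as in the example above
Location = factor(c(rep("Olympia", 6), rep("Ventura", 6),
                    rep("Northampton", 6), rep("Burlington", 6)))
Tribe    = factor(rep(c("Jedi", "Sith"), 12))
Y        = c(10, 4, 12, 5, 15, 4, 15, 9, 15, 11, 18, 12,
              8, 13, 8, 15, 10, 17, 22, 22, 20, 22, 20, 25)

### Ordinary anova table on the raw data (not aligned or ranked)
fit     = lm(Y ~ Tribe * Location)
tab     = as.data.frame(anova(fit))
effects = tab[rownames(tab) != "Residuals", ]
resid   = tab["Residuals", ]

### Partial eta-squared from sums of squares
fromSS = effects$`Sum Sq` / (effects$`Sum Sq` + resid$`Sum Sq`)

### Partial eta-squared from F values and degrees of freedom
fromF  = effects$`F value` * effects$Df /
         (effects$`F value` * effects$Df + resid$Df)

all.equal(fromSS, fromF)
```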
Efron’s pseudo r-squared, RMSE
Efron’s pseudo r-squared is based on the actual values of the dependent variable and the values predicted by the model.
At the time of writing, the ARTool object doesn’t contain the actual values of the dependent variable, so these values have to be called from the original data.
Also, at the time of writing, the residuals in the ARTool object don’t reflect the random effects in the model. The effect is that Efron’s pseudo r-squared for an ARTool object will be equal to the r-squared for a similar linear model ignoring any random effects in the ARTool model.
library(rcompanion)
efronRSquared(actual = Data$Midichlorians,
residual = model$residuals)
EfronRSquared
0.948
model.lm = lm(Midichlorians ~ Tribe + Location + Tribe:Location,
data = Data)
summary(model.lm)$r.squared
[1] 0.9484945
The efronRSquared function can produce other useful statistics, like mean absolute percent error (as a fraction), root mean square error, and coefficient of variation as a percentage.
library(rcompanion)
efronRSquared(actual = Data$Midichlorians,
residual = model$residuals,
statistic = "MAPE")
efronRSquared(actual = Data$Midichlorians,
residual = model$residuals,
statistic = "RMSE")
efronRSquared(actual = Data$Midichlorians,
residual = model$residuals,
statistic = "CV")
MAPE
0.0908
RMSE
1.34
CV
9.71
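The statistics above can also be computed by hand from the actual values and the residuals. The sketch below uses the residuals from an ordinary lm() fit, which, as noted above, is what the r-squared for the ARTool object is expected to match for this fixed-effects model. The formulas are standard definitions and are an assumption about what efronRSquared computes internally. The results should agree with the values shown above after rounding.

```r
### Midichlorians data, as in the example above
Location = factor(c(rep("Olympia", 6), rep("Ventura", 6),
                    rep("Northampton", 6), rep("Burlington", 6)))
Tribe    = factor(rep(c("Jedi", "Sith"), 12))
Y        = c(10, 4, 12, 5, 15, 4, 15, 9, 15, 11, 18, 12,
              8, 13, 8, 15, 10, 17, 22, 22, 20, 22, 20, 25)

fit    = lm(Y ~ Tribe * Location)
actual = Y
res    = residuals(fit)

### Efron's pseudo r-squared: 1 - SS(residual) / SS(total)
efron = 1 - sum(res^2) / sum((actual - mean(actual))^2)

### Root mean square error
rmse = sqrt(mean(res^2))

### Mean absolute percent error, as a fraction
mape = mean(abs(res / actual))

### Coefficient of variation, as a percentage
cv = 100 * rmse / mean(actual)

signif(c(Efron = efron, RMSE = rmse, MAPE = mape, CV = cv), 3)
```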
One-way example
The following example addresses the data from the Kruskal–Wallis Test chapter. The results are relatively similar to results from the Kruskal–Wallis and Dunn tests, and to those from ordinal regression. Here, the p-value for the global test by ART anova is lower than that from the Kruskal–Wallis test.
Data = read.table(header=TRUE, stringsAsFactors=TRUE, text="
Speaker Likert
Pooh 3
Pooh 5
Pooh 4
Pooh 4
Pooh 4
Pooh 4
Pooh 4
Pooh 4
Pooh 5
Pooh 5
Piglet 2
Piglet 4
Piglet 2
Piglet 2
Piglet 1
Piglet 2
Piglet 3
Piglet 2
Piglet 2
Piglet 3
Tigger 4
Tigger 4
Tigger 4
Tigger 4
Tigger 5
Tigger 3
Tigger 5
Tigger 4
Tigger 4
Tigger 3
")
### Order levels of the factor; otherwise R will alphabetize them
Data$Speaker = factor(Data$Speaker,
levels=unique(Data$Speaker))
### Check the data frame
library(psych)
headTail(Data)
str(Data)
summary(Data)
### Aligned ranks anova
library(ARTool)
model = art(Likert ~ Speaker,
data = Data)
anova(model)
Analysis of Variance of Aligned Rank Transformed Data
Table Type: Anova Table (Type III tests)
Model: No Repeated Measures (lm)
Response: art(Likert)
Df Df.res F value Pr(>F)
1 Speaker 2 27 18.702 8.0005e-06 ***
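For readers who want to make the comparison with the Kruskal–Wallis test directly, the following sketch runs kruskal.test() from base R on the same values. The data are re-entered as vectors so that the block runs on its own. The object name kw is invented for illustration.

```r
### Same Likert data as above, entered as vectors
Speaker = factor(rep(c("Pooh", "Piglet", "Tigger"), each = 10),
                 levels = c("Pooh", "Piglet", "Tigger"))
Likert  = c(3, 5, 4, 4, 4, 4, 4, 4, 5, 5,
            2, 4, 2, 2, 1, 2, 3, 2, 2, 3,
            4, 4, 4, 4, 5, 3, 5, 4, 4, 3)

### Kruskal-Wallis test for comparison with the ART anova
kw = kruskal.test(Likert ~ Speaker)

kw$p.value
```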
### Post-hoc comparisons
model.lm = artlm(model, "Speaker")
library(emmeans)
marginal = emmeans(model.lm,
~ Speaker)
pairs(marginal,
adjust = "tukey")
contrast estimate SE df t.ratio p.value
Pooh - Piglet 14.1 2.51 27 5.619 <.0001
Pooh - Tigger 1.8 2.51 27 0.717 0.7555
Piglet - Tigger -12.3 2.51 27 -4.901 0.0001
P value adjustment: tukey method for comparing a family of 3 estimates
### cldList needs the pairwise contrasts, not the marginal means
Sum = as.data.frame(pairs(marginal, adjust = "tukey"))
library(rcompanion)
cldList(p.value ~ contrast, data=Sum)
Group Letter MonoLetter
1 Pooh a a
2 Piglet b b
3 Tigger a a
Partial eta-squared
Result = anova(model)
Result$part.eta.sq = with(Result, `Sum Sq`/(`Sum Sq` + `Sum Sq.res`))
Result
Analysis of Variance of Aligned Rank Transformed Data
Df Df.res F value Pr(>F) part.eta.sq
1 Speaker 2 27 18.702 8.0005e-06 0.58077 ***
Efron’s pseudo r-squared
library(rcompanion)
efronRSquared(actual = Data$Likert,
residual = model$residuals)
EfronRSquared
0.614
Repeated measures example
The following example addresses the data from the Friedman Test chapter. Results are relatively similar to results from the Friedman and Conover tests, and to those from ordinal regression. Here, the p-value for the global test by ART anova is lower than that from the Friedman test.
Data = read.table(header=TRUE, stringsAsFactors=TRUE, text="
Instructor Rater Likert
'Bob Belcher' a 4
'Bob Belcher' b 5
'Bob Belcher' c 4
'Bob Belcher' d 6
'Bob Belcher' e 6
'Bob Belcher' f 6
'Bob Belcher' g 10
'Bob Belcher' h 6
'Linda Belcher' a 8
'Linda Belcher' b 6
'Linda Belcher' c 8
'Linda Belcher' d 8
'Linda Belcher' e 8
'Linda Belcher' f 7
'Linda Belcher' g 10
'Linda Belcher' h 9
'Tina Belcher' a 7
'Tina Belcher' b 5
'Tina Belcher' c 7
'Tina Belcher' d 8
'Tina Belcher' e 8
'Tina Belcher' f 9
'Tina Belcher' g 10
'Tina Belcher' h 9
'Gene Belcher' a 6
'Gene Belcher' b 4
'Gene Belcher' c 5
'Gene Belcher' d 5
'Gene Belcher' e 6
'Gene Belcher' f 6
'Gene Belcher' g 5
'Gene Belcher' h 5
'Louise Belcher' a 8
'Louise Belcher' b 7
'Louise Belcher' c 8
'Louise Belcher' d 8
'Louise Belcher' e 9
'Louise Belcher' f 9
'Louise Belcher' g 8
'Louise Belcher' h 10
")
### Order levels of the factor; otherwise R will alphabetize them
Data$Instructor = factor(Data$Instructor,
levels=unique(Data$Instructor))
### Check the data frame
library(psych)
headTail(Data)
str(Data)
summary(Data)
### Aligned ranks anova
library(ARTool)
model = art(Likert ~ Instructor + (1|Rater),
data = Data)
anova(model)
Analysis of Variance of Aligned Rank Transformed Data
Table Type: Analysis of Deviance Table (Type III Wald F tests with Kenward-Roger df)
Model: Mixed Effects (lmer)
Response: art(Likert)
F Df Df.res Pr(>F)
1 Instructor 16.052 4 28 6.0942e-07 ***
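For a direct comparison with the Friedman test, the following sketch runs friedman.test() from base R on the same ratings. The data are re-entered as a matrix with raters (blocks) in rows and instructors (groups) in columns, which is the layout friedman.test() expects for matrix input. The object name fr is invented for illustration.

```r
### Same ratings as above: rows are raters a-h, columns are instructors
Likert = cbind(
  "Bob Belcher"    = c(4, 5, 4, 6, 6, 6, 10, 6),
  "Linda Belcher"  = c(8, 6, 8, 8, 8, 7, 10, 9),
  "Tina Belcher"   = c(7, 5, 7, 8, 8, 9, 10, 9),
  "Gene Belcher"   = c(6, 4, 5, 5, 6, 6, 5, 5),
  "Louise Belcher" = c(8, 7, 8, 8, 9, 9, 8, 10))
rownames(Likert) = letters[1:8]

### Friedman test for comparison with the ART anova
fr = friedman.test(Likert)

fr$p.value
```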
### Post-hoc comparisons
marginal = art.con(model, "Instructor")
marginal
contrast estimate SE df t.ratio p.value
Bob Belcher - Linda Belcher -13.562 3.23 28 -4.202 0.0021
Bob Belcher - Tina Belcher -12.750 3.23 28 -3.950 0.0040
Bob Belcher - Gene Belcher 4.312 3.23 28 1.336 0.6717
Bob Belcher - Louise Belcher -16.125 3.23 28 -4.996 0.0003
Linda Belcher - Tina Belcher 0.812 3.23 28 0.252 0.9991
Linda Belcher - Gene Belcher 17.875 3.23 28 5.538 0.0001
Linda Belcher - Louise Belcher -2.562 3.23 28 -0.794 0.9302
Tina Belcher - Gene Belcher 17.062 3.23 28 5.287 0.0001
Tina Belcher - Louise Belcher -3.375 3.23 28 -1.046 0.8319
Gene Belcher - Louise Belcher -20.438 3.23 28 -6.332 <.0001
Degrees-of-freedom method: kenward-roger
P value adjustment: tukey method for comparing a family of 5 estimates
Sum = as.data.frame(marginal)
library(rcompanion)
cldList(p.value ~ contrast, data=Sum)
Group Letter MonoLetter
1 BobBelcher a a
2 LindaBelcher b b
3 TinaBelcher b b
4 GeneBelcher a a
5 LouiseBelcher b b
Partial eta-squared
For mixed effects models, the partial eta-squared can be calculated from the F values and the degrees of freedom.
Result = anova(model)
Result$part.eta.sq = with(Result, `F` * `Df` / (`F` * `Df` + `Df.res`))
Result
Analysis of Variance of Aligned Rank Transformed Data
Table Type: Analysis of Deviance Table (Type III Wald F tests with Kenward-Roger df)
Model: Mixed Effects (lmer)
Response: art(Likert)
F Df Df.res Pr(>F) part.eta.sq
1 Instructor 16.052 4 28 6.0942e-07 0.69633 ***
Efron’s pseudo r-squared
At the time of writing, the residuals in the ARTool object don’t reflect the random effects in the model. The effect is that Efron’s pseudo r-squared for an ARTool object will be equal to the r-squared for a similar linear model ignoring any random effects in the ARTool model.
library(rcompanion)
efronRSquared(actual = Data$Likert,
residual = model$residuals)
EfronRSquared
0.51
model.lm = lm(Likert ~ Instructor,
data = Data)
summary(model.lm)$r.squared
[1] 0.5101182
Optional: Plot of medians and confidence intervals for midichlorians data
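### Note: Data must be the midichlorians data frame from the first example.
###   Re-create it if it has been replaced by data from a later example.
library(rcompanion)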
Sum = groupwiseMedian(Midichlorians ~ Tribe + Location,
data=Data,
bca=FALSE, percentile=TRUE)
Sum
Tribe Location n Median Conf.level Percentile.lower Percentile.upper
1 Jedi Burlington 3 20 0.95 20 22
2 Jedi Northampton 3 8 0.95 8 10
3 Jedi Olympia 3 12 0.95 10 15
4 Jedi Ventura 3 15 0.95 15 18
5 Sith Burlington 3 22 0.95 22 25
6 Sith Northampton 3 15 0.95 13 17
7 Sith Olympia 3 4 0.95 4 5
8 Sith Ventura 3 11 0.95 9 12
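To show where a percentile interval like those above comes from, the following is a minimal base-R sketch of a percentile bootstrap for the median of one group (Jedi in Olympia). It is not necessarily the same algorithm that groupwiseMedian uses. The seed, the number of resamples B, and the object names are arbitrary choices for illustration.

```r
set.seed(1)             # arbitrary seed so the sketch is repeatable

x = c(10, 12, 15)       # Midichlorians for Jedi in Olympia
B = 5000                # number of bootstrap resamples (arbitrary)

### Resample with replacement and record the median each time
bootMedians = replicate(B, median(sample(x, replace = TRUE)))

### Percentile interval: the middle 95% of the bootstrapped medians
ci = quantile(bootMedians, probs = c(0.025, 0.975))

ci
```

With only three observations per group, the bootstrapped medians can take only the three observed values, so the interval is necessarily coarse.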
### Order the levels for printing
Sum$Location = factor(Sum$Location,
levels=c("Olympia", "Ventura",
"Northampton", "Burlington"))
Sum$Tribe = factor(Sum$Tribe,
levels=c("Jedi", "Sith"))
### Plot
library(ggplot2)
pd = position_dodge(0.4) ### How much to dodge the points on the plot
png(filename = "Rplot01.png",
width = 5,
height = 5,
units = "in",
res = 300)
ggplot(Sum,
aes(x = Location,
y = Median,
color = Tribe)) +
geom_point(shape = 15,
size = 4,
position = pd) +
geom_errorbar(aes(ymin = Percentile.lower,
ymax = Percentile.upper),
width = 0.2,
linewidth = 0.7, ### Use size = 0.7 with ggplot2 versions before 3.4.0
position = pd) +
theme_bw() +
theme(axis.title = element_text(face = "bold"),
axis.text = element_text(face = "bold"),
plot.caption = element_text(hjust = 0)) +
ylab("Median midichlorian count") +
ggtitle ("Midichlorian counts for Jedi and Sith",
subtitle = "In four U.S. cities") +
labs(caption = paste0("\nMidichlorian counts for two tribes across ",
"four locations. Boxes indicate \n",
"the median. ",
"Error bars indicate the 95% confidence ",
"interval ",
"of the median.")) +
scale_color_manual(values = c("blue", "red"))
dev.off()
References
Cohen, J. 1988. Statistical Power Analysis for the Behavioral Sciences, 2nd Edition. Routledge.
Elkin, L.A., Kay, M., Higgins, J., and Wobbrock, J.O. 2021. An aligned rank transform procedure for multifactor contrast tests. Proceedings of the ACM Symposium on User Interface Software and Technology (UIST '21). New York: ACM Press, pp. 754–768. faculty.washington.edu/wobbrock/pubs/uist-21.pdf.
Kay, M. 2019. Contrast tests with ART. cran.r-project.org/web/packages/ARTool/vignettes/art-contrasts.html.
Kay, M. 2019. Effect Sizes with ART. cran.r-project.org/web/packages/ARTool/vignettes/art-effect-size.html.
Kay, M. 2019. Package ‘ARTool’. cran.r-project.org/web/packages/ARTool/ARTool.pdf.
Wobbrock, J.O., Findlater, L., Gergle, D., and Higgins, J.J. 2011. The aligned rank transform for nonparametric factorial analyses using only anova procedures. In Conference on Human Factors in Computing Systems, pp. 143–146. faculty.washington.edu/wobbrock/pubs/chi-11.06.pdf.
Wobbrock, J.O., Findlater, L., Gergle, D., Higgins, J.J., and Kay, M. 2018. ARTool: Align-and-rank data for a nonparametric ANOVA. University of Washington. Retrieved from depts.washington.edu/madlab/proj/art/index.html.