[banner]

An R Companion for the Handbook of Biological Statistics

Salvatore S. Mangiafico

Two-way Anova with Robust Estimation

A two-way anova using robust estimators can be performed with the WRS2 package.  Options for estimators are M-estimators, trimmed means, and medians.  This type of analysis is resistant to deviations from the assumptions of the traditional ordinary-least-squares anova, and are robust to outliers.  However, it may not be appropriate for data that deviate too widely from parametric assumptions.

 

The main analysis using M-estimators for a two-way anova is conducted with the pbad2way function in the WRS2 package.   Post-hoc tests can be performed with the mcp2a function in the WRS2 package or with my custom functions pairwiseRobustTest and pairwiseRobustMatrix, which rely on the pb2gen function in WRS2.

 

My custom function groupwise.huber uses the HuberM function in the DescTools package  to determine the Huber M-estimators across groups in a data frame.

 

For more information on robust tests available in the WRS2 package, see:

 

help(package="WRS2")

 

Consult the chapter on Two-way Anova for general consideration about conducting analysis of variance.

 

Packages used in this chapter

The following commands will install these packages if they are not already installed:


if(!require(rcompanion)){install.packages("rcompanion")}
if(!require(ggplot2)){install.packages("ggplot2")}
if(!require(WRS2)){install.packages("WRS2")}
if(!require(multcompView)){install.packages("multcompView")}
if(!require(psych)){install.packages("psych")}


Example


### --------------------------------------------------------------
### Two-way anova with robust estimators, hypothetical data
###   Using WRS2 package
### --------------------------------------------------------------

Input = ("
 Factor.A  Factor.B  Response
 l         x          0.9
 l         y          1.4
 l         x          1.3
 l         y          2.0
 l         x          1.6
 l         y          2.6
 m         x          2.4
 m         y          3.6
 m         x          2.8
 m         y          3.7
 m         x          3.2
 m         y          3.0
 n         x          1.6
 n         y          1.2
 n         x          2.0
 n         y          1.9
 n         x          2.7
 n         y          0.9
")

Data = read.table(textConnection(Input),header=TRUE)

 

Produce Huber M-estimators and confidence intervals by group


library(rcompanion)

Sum = groupwiseHuber(data = Data,
                     group = c("Factor.A", "Factor.B"),
                     var = "Response",
                     conf.level=0.95,
                     conf.type="wald")

Sum

  Factor.A Factor.B n  M.Huber  lower.ci upper.ci

1        l        x 3 1.266667 0.9421910 1.591142

2        l        y 3 2.000000 1.4456385 2.554362

3        m        x 3 2.800000 2.4304256 3.169574

4        m        y 3 3.538805 3.2630383 3.814572

5        n        x 3 2.100000 1.5855743 2.614426

6        n        y 3 1.333333 0.8592063 1.807460

 

 

Interaction plot using summary statistics


library(ggplot2)

pd = position_dodge(.2)

ggplot(Sum, aes(x=Factor.A,
                y=M.Huber,
                color=Factor.B)) +
    geom_errorbar(aes(ymin=lower.ci,
                      ymax=upper.ci),
                   width=.2, size=0.7, position=pd) +
    geom_point(shape=15, size=4, position=pd) +
    theme_bw() +
    theme(
          axis.title.y = element_text(vjust= 1.8),
          axis.title.x = element_text(vjust= -0.5),
          axis.title = element_text(face = "bold")) +
    scale_color_manual(values = c("black", "blue"))

 

 

 

Two-way analysis of variance for M-estimators

The est = "mom" option uses a modified M-estimator for the analysis.  To analyze using medians, use the est= "median" option in the pbad2way function in the WRS2 package.  To analyze using trimmed means, use the t2way function in the WRS2 package.


library(WRS2)

pbad2way(Response ~ Factor.A + Factor.B + Factor.A:Factor.B,
         data = Data,
         est = "mom",    # modified M-estimator
         nboot = 5000)   # number of bootstrap samples
                         # a higher number will take longer to compute

 

pbad2way(formula = Response ~ Factor.A + Factor.B + Factor.A:Factor.B,

    data = Data, est = "mom", nboot = 3000)

 

                  p.value

Factor.A           0.0000

Factor.B           0.3403

Factor.A:Factor.B  0.0460

 

 

Produce post-hoc tests for main effects with mcp2a

 

post = mcp2a(Response ~ Factor.A + Factor.B + Factor.A:Factor.B,
             data = Data,
             est = "mom",    # M-estimator
             nboot = 5000)   # number of bootstrap samples

post$contrasts

post

 

                                             Factor.A1:  Factor.A2:  Factor.A3:

    Factor.A1 Factor.A2 Factor.A3 Factor.B1  Factor.B1   Factor.B1   Factor.B1

l_x         1         1         0         1          1           1           0

l_y         1         1         0        -1         -1          -1           0

m_x        -1         0         1         1         -1           0           1

m_y        -1         0         1        -1          1           0          -1

n_x         0        -1        -1         1          0          -1          -1

n_y         0        -1        -1        -1          0           1           1

 

 

                          V1 ci.lower ci.upper p-value

Factor.A1           -3.18333 -4.20000 -1.60000 0.00000

Factor.A2           -0.16667 -1.70000  1.36667 0.40233

Factor.A3            3.01667  1.40000  4.05000 0.00000

Factor.B1           -0.81667 -2.28333  1.00000 0.22233

Factor.A1:Factor.B1  0.11667 -1.50000  1.16667 0.48033

Factor.A2:Factor.B1 -1.50000 -3.10000  0.00000 0.01767

Factor.A3:Factor.B1 -1.61667 -2.80000  0.00000 0.01433

 

### The Factor.A1 contrast compares l to m; since it is significant,

###   l is significantly different than m.

### The Factor.A2 contrast compares l to n; since it is not significant,

###   l is not significantly different than n.

 

 

Produce post-hoc tests for main effects with pairwiseRobustTest or pairwiseRobustMatrix

 

Table and compact letter display with pairwiseRobustTest

 


### Order groups by median

Data$Factor.A = factor(Data$Factor.A,
                     levels = c("n", "l", "m"))


### Pairwise robust test

library(rcompanion)

PT = pairwiseRobustTest(
     Response ~ Factor.A,
     data = Data,
     est="mom",
     nboot=5000,
     method="fdr")      # adjust p-values; see ?p.adjust for options

PT


  Comparison Statistic p.value p.adjust
1  n - l = 0   0.08333  0.7204   0.7204
2  n - m = 0      -1.4  0.0014   0.0021
3  l - m = 0    -1.483   6e-04   0.0018

### p-values may differ


### Produce compact letter display

library(rcompanion)

cldList(comparison = PT$Comparison,
        p.value    = PT$p.adjust,
        threshold  = 0.05)


  Group Letter MonoLetter
1     n      a         a
2     l      a         a
3     m      b          b


Compact letter display output with pairwiseRobustMatrix

 

### Order groups by median

Data$Factor = factor(Data$Factor,
                     levels = c("n", "l", "m"))


### Pairwise robust tests

library(rcompanion)

PM = pairwiseRobustMatrix(
     Response ~ Factor.A,
     data = Data,
     est="mom",
     nboot=5000,
     method="fdr")      # adjust p-values; see ?p.adjust for options

PM$Adjusted

 

       n      l     m
n 1.0000 0.7128 6e-04
l 0.7128 1.0000 0e+00
m 0.0006 0.0000 1e+00

### p-values may differ

 


library(multcompView)

multcompLetters(PM$Adjusted,
                compare="<",
                threshold=0.05,
                Letters=letters,
                reversed = FALSE)


  n   l   m
"a" "a" "b"


Produce post-hoc tests for interaction effect


### Create a factor which is the interaction of Factor.A and Factor.B


Data$Factor.int = interaction (Data$Factor.A, Data$Factor.B)


### Order groups by median

Data$Factor.int = factor(Data$Factor.int,
                         levels = c("m.y", "m.x", "n.x", "l.y", "n.y", "l.x"))


### Check data frame


library(psych)

headTail(Data)


    Factor.A Factor.B Response Factor.int
8          m        y      3.6        m.y
10         m        y      3.7        m.y
12         m        y        3        m.y
7          m        x      2.4        m.x
...     <NA>     <NA>      ...       <NA>
18         n        y      0.9        n.y
1          l        x      0.9        l.x
3          l        x      1.3        l.x
5          l        x      1.6        l.x


Table and compact letter display with pairwiseRobustTest


library(rcompanion)

PT = pairwiseRobustTest(
     Response ~ Factor.int,
     data = Data,
     est="mom",
     nboot=5000,
     method="fdr")      # adjust p-values; see ?p.adjust for options

PT


      Comparison Statistic p.value p.adjust
1  m.y - m.x = 0     -0.85  0.1348   0.1615
2  m.y - n.x = 0     -1.55       0   0.0000
3  m.y - l.y = 0     -1.65       0   0.0000
4  m.y - n.y = 0    -2.317       0   0.0000
5  m.y - l.x = 0    -2.383       0   0.0000
6  m.x - n.x = 0      -0.7  0.1312   0.1615
7  m.x - l.y = 0       0.8  0.1228   0.1615
8  m.x - n.y = 0     1.467       0   0.0000
9  m.x - l.x = 0    -1.533       0   0.0000
10 n.x - l.y = 0       0.1  0.7798   0.8355
11 n.x - n.y = 0    0.7667  0.1344   0.1615
12 n.x - l.x = 0    0.8333  0.0664   0.1423
13 l.y - n.y = 0   -0.6667    0.14   0.1615
14 l.y - l.x = 0   -0.7333  0.1296   0.1615
15 n.y - l.x = 0  -0.06667   0.944   0.9440

### p-values may differ


### Produce compact letter display

library(rcompanion)

cldList(comparison = PT$Comparison,
        p.value    = PT$p.adjust,
        threshold  = 0.05)


  Group Letter MonoLetter
1   m.y      a        a 
2   m.x     ab        ab
3   n.x     bc         bc
4   l.y     bc         bc
5   n.y      c          c
6   l.x      c          c


Compact letter display output with pairwiseRobustMatrix


### Order groups by median

Data$Factor.int = factor(Data$Factor.int,
                         levels = c("m.y", "m.x", "n.x", "l.y", "n.y", "l.x"))


### Pairwise robust tests

library(rcompanion)

PM = pairwiseRobustMatrix(
     Response ~ Factor.int,
     data = Data,
     est="mom",
     nboot=5000,
     method="fdr")      # adjust p-values; see ?p.adjust for options

PM

 

$Unadjusted
    m.y    m.x   n.x    l.y    n.y    l.x
m.y  NA 0.1312 0.000 0.0000 0.0000 0.0000
m.x  NA     NA 0.126 0.1320 0.0000 0.0000
n.x  NA     NA    NA 0.7638 0.1328 0.0680
l.y  NA     NA    NA     NA 0.1304 0.1408
n.y  NA     NA    NA     NA     NA 0.9318
l.x  NA     NA    NA     NA     NA     NA

$Method
[1] "fdr"

$Adjusted
       m.y    m.x    n.x    l.y    n.y    l.x
m.y 1.0000 0.1625 0.0000 0.0000 0.0000 0.0000
m.x 0.1625 1.0000 0.1625 0.1625 0.0000 0.0000
n.x 0.0000 0.1625 1.0000 0.8184 0.1625 0.1457
l.y 0.0000 0.1625 0.8184 1.0000 0.1625 0.1625
n.y 0.0000 0.0000 0.1625 0.1625 1.0000 0.9318
l.x 0.0000 0.0000 0.1457 0.1625 0.9318 1.0000

### p-values may differ


library(multcompView)

multcompLetters(PM$Adjusted,
                compare="<",
                threshold=0.05,
                Letters=letters,
                reversed = FALSE)

 

m.y  m.x  n.x  l.y  n.y  l.x
 "a" "ab" "bc" "bc"  "c"  "c"

### Note, means are not ordered from largest to smallest


#     #     #