 ## An R Companion for the Handbook of Biological Statistics

Salvatore S. Mangiafico

# Type I, II, and III Sums of Squares

An in-depth discussion of Type I, II, and III sums of squares is beyond the scope of this book, but readers should at least be aware of them.  They come into play in analysis of variance (anova) tables, when calculating sums of squares, F-values, and p-values.

Perhaps the most salient point for beginners is that SAS tends to use Type III by default, whereas R uses Type I with the anova function.  In R, Type II and Type III tests are accessed through the Anova function in the car package, as well as through some other functions for other types of analyses.  However, for Type III tests to be correct, the way R codes factors has to be changed from its default with the options(contrasts =… ) function.  Changing this will not affect Type I or Type II tests.

options(contrasts = c("contr.sum", "contr.poly"))

### needed for type III tests

### Default is: options(contrasts = c("contr.treatment", "contr.poly"))
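To see what this option changes, the two coding schemes can be printed side by side for a factor with three levels. This is only a small sketch in base R; the variable names (treatment, sum_to_zero, old) are made up for illustration and are not part of the original example.

```r
# Compare the two coding schemes for a three-level factor
treatment   = contr.treatment(3)   # R's default: level 1 is the reference level
sum_to_zero = contr.sum(3)         # the coding needed for Type III tests

print(treatment)
print(sum_to_zero)

# Each sum-to-zero column adds up to 0; the treatment columns do not
colSums(sum_to_zero)
colSums(treatment)

# Set the option while keeping the previous value so it can be restored later
old = options(contrasts = c("contr.sum", "contr.poly"))
getOption("contrasts")
options(old)                       # put back whatever was set before
```

Saving the return value of options() is one way to avoid leaving the changed contrasts in place for later analyses in the same session.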

Type I sums of squares are “sequential.”  In essence, the factors are tested in the order they are listed in the model.  Type III are “partial.”  In essence, every term in the model is tested in light of every other term in the model.  That means that main effects are tested in light of interaction terms as well as in light of other main effects.  Type II are similar to Type III, except that they preserve the principle of marginality.  This means that main effects are tested in light of one another, but not in light of the interaction term.
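For a model with two factors, A and B, and their interaction, these descriptions can be written out as a rough sketch. The notation SS(X | Y) is an assumption introduced here, not something used elsewhere in this chapter: it stands for the reduction in the residual sum of squares when X is added to a model that already contains Y. Type I is shown for the order A, B, A:B.

```latex
% SS(X | Y): reduction in residual sum of squares from adding X
% to a model that already contains Y (notation assumed for this sketch)
\[
\begin{aligned}
\text{Type I:}   &\quad SS(A),\;\; SS(B \mid A),\;\; SS(AB \mid A, B) \\
\text{Type II:}  &\quad SS(A \mid B),\;\; SS(B \mid A),\;\; SS(AB \mid A, B) \\
\text{Type III:} &\quad SS(A \mid B, AB),\;\; SS(B \mid A, AB),\;\; SS(AB \mid A, B)
\end{aligned}
\]
```

Written this way, the interaction term is tested identically under all three types, and the types differ only in what each main effect is adjusted for.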

When data are balanced and the design is simple, Types I, II, and III will give the same results.  But readers should be aware that results can differ for unbalanced data or more complex designs.  The example code at the end of this section illustrates this.
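The balanced case can be checked with a short sketch that uses only base R. The data here are made up for illustration. The use of drop1 with the full scope as a stand-in for Type III sums of squares, and the nested-model comparison standing in for the Type II test of A, are assumptions of this sketch; they are not the Anova function from the car package used elsewhere in this chapter.

```r
# Balanced 2 x 2 design with 3 observations per cell (made-up data)
old = options(contrasts = c("contr.sum", "contr.poly"))

Data = data.frame(
  A        = factor(rep(c("a", "b"), each = 6)),
  B        = factor(rep(rep(c("x", "y"), each = 3), times = 2)),
  response = c(14, 15, 16, 30, 31, 35, 50, 51, 53, 30, 32, 34)
)

model = lm(response ~ A + B + A:B, data = Data)

# Type I: sequential sums of squares
type1 = anova(model)
type1_ss = type1[c("A", "B", "A:B"), "Sum Sq"]

# Type III stand-in: drop each term in turn from the full model
type3 = drop1(model, scope = . ~ ., test = "F")
type3_ss = type3[c("A", "B", "A:B"), "Sum of Sq"]

# Type II stand-in for A: A adjusted for B, ignoring the interaction
type2_A = anova(lm(response ~ B, data = Data),
                lm(response ~ A + B, data = Data))[2, "Sum of Sq"]

all.equal(type1_ss, type3_ss)     # compare Type I and Type III
all.equal(type1_ss[1], type2_A)   # compare Type I and Type II for A

options(old)                      # restore the previous contrasts
```

Removing a few rows from Data to unbalance the design and re-running the comparisons is a quick way to see the sums of squares diverge.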

There are disagreements as to which type should be used routinely in analysis of variance.  In reality, the user should understand what hypothesis she wants to test, and then choose the appropriate tests.  As general advice, I would recommend not using Type I except in cases where you intend to have the effects assessed sequentially.  Beyond that, probably a majority of those in the R community recommend Type II tests, while SAS users are more likely to consider Type III tests.

Some experimental designs will call for using a specified type of sum of squares, for example when you see “/ SS1” or “HTYPE=1” in SAS code.

A couple of online resources may provide some more clarity:

Falk Scholer. ANOVA (and R). goanna.cs.rmit.edu.au/~fscholer/anova.php.

Daniel Wollschläger. Sum of Squares Type I, II, III: the underlying hypotheses, model comparisons, and their calculation in R. www.uni-kiel.de/psychologie/dwoll/r/ssTypes.php.

As a final note, readers should not confuse these sums of squares with “Type I error”, which refers to rejecting a null hypothesis when it is actually true (a false positive), and “Type II error”, which is failing to reject a null hypothesis when it is actually false (a false negative).

### --------------------------------------------------------------
### Example of different results for Type I, II, III SS
### --------------------------------------------------------------

options(contrasts = c("contr.sum", "contr.poly"))

### needed for type III tests

A        = c("a", "a", "a", "a", "b", "b", "b", "b", "b", "b", "b", "b")
B        = c("x", "y", "x", "y", "x", "y", "x", "y", "x", "x", "x", "x")
C        = c("l", "l", "m", "m", "l", "l", "m", "m", "l", "l", "l", "l")
response = c( 14,  30,  15,  35,  50,  51,  30,  32,  51,  55,  53,  55)

model = lm(response ~ A + B + C + A:B + A:C + B:C)

anova(model)              # Type I tests

library(car)

Anova(model, type="II")   # Type II tests

Anova(model, type="III")  # Type III tests

#     #     #

Effects and p-values from a hypothetical linear model.  While in this example the p-values are relatively similar, the B effect would not be significant with Type I sum of squares at the alpha = 0.05 level, while it would be with Type II or Type III tests.

| Effect | Type I p-value | Type II p-value | Type III p-value |
|--------|----------------|-----------------|------------------|
| A      | < 0.0001       | < 0.0001        | < 0.0001         |
| B      | 0.09           | 0.002           | 0.001            |
| C      | 0.0002         | 0.0004          | 0.001            |
| A:B    | 0.0004         | 0.001           | 0.001            |
| A:C    | 0.0003         | 0.0003          | 0.0003           |
| B:C    | 0.2            | 0.2             | 0.2              |