Choosing a statistical test can be a daunting task for those starting out in the analysis of experiments. This chapter provides a table of tests and models covered in this book, as well as some general advice for approaching the analysis of your data.
Plan your experimental design before you collect data
It is important to have an experimental design planned out before you start collecting data, and to have some idea of how you plan to analyze the data. One of the most common mistakes people make in doing research is collecting a bunch of data without having thought through what questions they are trying to answer, what specific hypotheses they want to test, and what statistical tests they can use to test these hypotheses.
What is the hypothesis?
The most important consideration in choosing a statistical test is determining what hypothesis you want to test or, more generally, what question you are trying to answer.
Often people have a notion about the purpose of the research they are conducting, but haven’t formulated a specific hypothesis. It is possible to begin with exploratory data analysis, to see what interesting secrets the data have to reveal. But ultimately, choosing a statistical test relies on having a specific hypothesis in mind to test.
For example, we may know that our goal is to determine if one curriculum works better than another. But then we must be more specific in our hypothesis. Perhaps we wish to compare the mean scores that students get on an exam across the different curricula. Then a specific null hypothesis is: There is no difference among the mean student scores across curricula.
In this example, we identified the dependent variable as Student scores, and the independent variable as Curriculum.
Of course, we might make things more complicated. For example, if the curricula were used in different classrooms, we might want to include Classroom as an independent blocking variable.
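As a rough illustration of turning this null hypothesis into a computation, the sketch below compares mean scores for two curricula with a simple permutation test, leaving out the Classroom blocking variable. The scores are invented purely for illustration, the function name permutation_p_value is hypothetical, and a permutation test is only one of several tests that could suit this design.

```python
import random
from statistics import mean

# Hypothetical exam scores, invented purely for illustration.
scores = {
    "Curriculum A": [72, 85, 78, 90, 66, 81],
    "Curriculum B": [88, 79, 93, 85, 91, 77],
}


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=1):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the curriculum labels are exchangeable,
        # so shuffling the pooled scores simulates "no difference".
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Dependent variable: student score. Independent variable: curriculum.
group_means = {name: mean(values) for name, values in scores.items()}
p_value = permutation_p_value(scores["Curriculum A"], scores["Curriculum B"])

print(group_means)
print(f"p = {p_value:.3f}")
```

A small p-value would be evidence against the null hypothesis of no difference in mean scores. With real data, the assumptions of whichever test is chosen should still be checked first.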
What number and type of variables do you have?
To a large extent, the appropriate statistical test for your data will depend upon the number and types of variables you wish to include in the analysis.
Consider the type of dependent variable you wish to include.
• If it is of interval/ratio type, you can consider parametric statistics or nonparametric statistics.
• However, if it is an ordinal variable, you would look toward nonparametric and ordinal regression models.
• Nominal variables arranged in contingency tables can be analyzed with chi-square and similar tests. Nominal dependent variables can be related to independent variables with logistic regression.
• Count data dependent variables can be related to independent variables with Poisson regression and related models.
The number and type of independent variables must also be taken into account, as must whether there are paired observations or random blocking variables.
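The guidance above on dependent variable types can be written down as a small lookup. This is only a sketch: the names CANDIDATE_APPROACHES and candidate_approaches are hypothetical, and the mapping simply transcribes the list above without accounting for independent variables, pairing, or blocking.

```python
# Hypothetical lookup that transcribes the guidance above:
# dependent-variable type -> families of methods to consider.
CANDIDATE_APPROACHES = {
    "interval/ratio": ["parametric tests", "nonparametric tests"],
    "ordinal": ["nonparametric tests", "ordinal regression models"],
    "nominal": [
        "chi-square and similar contingency-table tests",
        "logistic regression",
    ],
    "count": ["Poisson regression and related models"],
}


def candidate_approaches(dv_type):
    """Return the method families to consider for a dependent-variable type."""
    key = dv_type.strip().lower()
    if key not in CANDIDATE_APPROACHES:
        known = ", ".join(sorted(CANDIDATE_APPROACHES))
        raise ValueError(f"Unknown variable type {dv_type!r}; expected one of: {known}")
    return CANDIDATE_APPROACHES[key]


for dv in ("interval/ratio", "ordinal", "nominal", "count"):
    print(dv, "->", "; ".join(candidate_approaches(dv)))
```

A lookup like this narrows the choice only to a family of methods. Selecting a specific test still depends on the number and type of independent variables and on the design of the study.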
The table below lists the tests in this book according to the number and types of their variables. In the table, DV stands for dependent variable and IV for independent variable.
Note that each test has its own set of assumptions for appropriate data, which should be assessed before proceeding with the analysis.
Also note that the tests in this book cover cases with a single dependent variable only. Other statistical tests, grouped under the umbrella of multivariate statistics, can analyze multiple dependent variables simultaneously. These include multivariate analysis of variance (MANOVA), canonical correlation, and discriminant function analysis.
The “References” and “Optional readings” sections of this chapter include a few other guides to choosing statistical tests.
Test | DV type, or variable type when there is no DV | DV observations | IV type | Number of IVs | Levels in IV | Test type |
One-sample Wilcoxon | Ordinal or interval/ratio | Independent | Single default value | N/A | N/A | Nonparametric |
Sign test for one-sample | Ordinal or interval/ratio | Independent | Single default value | N/A | N/A | Nonparametric |
Two-sample Mann–Whitney | Ordinal or interval/ratio | Independent | Nominal | 1 | 2 | Nonparametric |
Mood’s median test for two-sample | Ordinal or interval/ratio | Independent | Nominal | 1 | 2 | Nonparametric |
Two-sample paired rank-sum | Ordinal or interval/ratio | Paired | Nominal | 1, or 2 when one is blocking | 2 | Nonparametric |
Sign test for two-sample paired | Ordinal or interval/ratio | Paired | Nominal | 1, or 2 when one is blocking | 2 | Nonparametric |
Kruskal–Wallis | Ordinal or interval/ratio | Independent | Nominal | 1 | 2 or more | Nonparametric |
Mood’s median | Ordinal or interval/ratio | Independent | Nominal | 1 | 2 or more | Nonparametric |
Friedman | Ordinal or interval/ratio | Independent blocked, or paired | Nominal | 2 when one is blocking, in unreplicated complete block design | 2 or more | Nonparametric |
Quade | Ordinal or interval/ratio | Independent blocked, or paired | Nominal | 2 when one is blocking, in unreplicated complete block design | 2 or more | Nonparametric |
One-way Permutation Test of Independence | Ordinal or interval/ratio | Independent | Nominal | 1 | 2 or more | Permutation |
One-way Permutation Test of Symmetry | Ordinal or interval/ratio | Independent blocked, or paired | Nominal | 2 when one is blocking | 2 or more | Permutation |
Two-sample CLM | Ordinal | Independent | Nominal | 1 | 2 | Ordinal regression |
Two-sample paired CLMM | Ordinal | Paired | Nominal | 2 when one is blocking | 2 | Ordinal regression |
One-way ordinal ANOVA CLM | Ordinal | Independent | Nominal | 1 | 2 or more | Ordinal regression |
One-way repeated ordinal ANOVA CLMM | Ordinal | Independent | Nominal | 2 when one is blocking | 2 or more | Ordinal regression |
Two-way ordinal ANOVA CLM | Ordinal | Independent | Nominal | 2 | 2 or more | Ordinal regression |
Two-way repeated ordinal ANOVA CLMM | Ordinal | Independent | Nominal | 3 when one is blocking | 2 or more | Ordinal regression |
Goodness-of-fit tests for nominal variables: binomial test; multinomial test; G-test goodness-of-fit; chi-square test goodness-of-fit | Nominal | Independent | Expected counts | N/A | Overall: vector of counts and expected proportions | Nominal |
Association tests for nominal variables: Fisher exact test of association; G-test of association; chi-square test of association | Nominal | Independent | Nominal | N/A | Overall: 2-way contingency table | Nominal |
Tests for paired nominal data: McNemar; McNemar–Bowker | Nominal | Paired | Nominal | N/A | Overall: 2-way marginal contingency table | Nominal |
Cochran–Mantel–Haenszel | Nominal | Independent | Nominal | N/A | Overall: 3-way contingency table | Nominal |
Cochran’s Q | Nominal (2 levels only) | Paired | Nominal | 2 when one is blocking | 2 or more | Nominal |
Linear-by-linear | Ordered nominal (ordinal) | Independent | Ordered nominal (ordinal) | N/A | Overall: 2-way or 3-way contingency table | Nominal |
Cochran–Armitage (extended) | Ordered nominal (ordinal) | Independent | Nominal | N/A | Overall: 2-way or 3-way contingency table | Nominal |
Log-linear model (multiway frequency analysis) | Nominal | Independent | Nominal | N/A | Overall: contingency table with 2 or more dimensions | Generalized linear model |
Logistic regression (standard) | Nominal with 2 levels | Independent | Interval/ratio or nominal | 1 or more | 2 or more | Generalized linear model |
Multinomial logistic regression | Nominal with 2 or more levels | Independent | Interval/ratio or nominal | 1 or more | 2 or more | Generalized linear model |
Mixed-effects logistic regression | Nominal with 2 levels | Independent or paired | Interval/ratio or nominal | 1 or more when one is blocking or random | 2 or more | Generalized linear model |
One-sample t-test | Interval/ratio | Independent | Single default value | N/A | N/A | Parametric |
Two-sample t-test | Interval/ratio | Independent | Nominal | 1 | 2 | Parametric |
Paired t-test | Interval/ratio | Paired | Nominal | 1, or 2 when one is blocking | 2 | Parametric |
One-way ANOVA | Interval/ratio | Independent | Nominal | 1 | 2 or more | Parametric |
One-way ANOVA with blocks | Interval/ratio | Independent | Nominal | 2 when one is blocking | 2 or more | Parametric |
One-way ANOVA with random blocks | Interval/ratio | Independent | Nominal | 2 when one is blocking | 2 or more | Parametric |
Two-way ANOVA | Interval/ratio | Independent | Nominal | 2 | 2 or more | Parametric |
Repeated measures ANOVA | Interval/ratio | Paired across time | Nominal | 2 or more when one is time effect | 2 or more | Parametric |
Multiple correlation | Interval/ratio or ordinal, depending on type selected | Independent | Interval/ratio or ordinal, depending on type selected | 1 or more | Overall: multiple vectors of interval/ratio or ordinal data | Parametric or nonparametric depending on type selected |
Pearson correlation | Interval/ratio | Independent | Interval/ratio | 1 | Overall: two vectors of interval/ratio data | Parametric |
Kendall correlation | Interval/ratio or ordinal | Independent | Interval/ratio or ordinal | 1 | Overall: two vectors of interval/ratio or ordinal data | Nonparametric |
Spearman correlation | Interval/ratio or ordinal | Independent | Interval/ratio or ordinal | 1 | Overall: two vectors of interval/ratio or ordinal data | Nonparametric |
Linear regression | Interval/ratio | Independent | Interval/ratio | 1 | N/A | Parametric |
Polynomial regression | Interval/ratio | Independent | Interval/ratio | 2 or more that are polynomial terms | N/A | Parametric |
Nonlinear regression and curvilinear regression | Interval/ratio | Independent | Interval/ratio | 1 | N/A | Parametric |
Multiple regression | Interval/ratio | Independent | Interval/ratio | 2 or more | N/A | Parametric |
Robust linear regression | Interval/ratio | Independent | Interval/ratio | 1 | N/A | Robust parametric |
Kendall–Theil regression | Interval/ratio | Independent | Interval/ratio | 1 | N/A | Nonparametric |
Linear plateau and quadratic plateau models | Interval/ratio | Independent | Interval/ratio | 1 | N/A | Parametric |
Cate–Nelson analysis | Interval/ratio | Independent | Interval/ratio | 1 | N/A | Mostly nonparametric |
Hermite and Poisson regression: Hermite regression; Poisson regression; negative binomial regression; zero-inflated regression | Count | Independent | Interval/ratio or nominal | 1 or more | 2 or more | Generalized linear model |
Beta regression | Proportion or percentage | Independent | Interval/ratio or nominal | 1 or more | 2 or more | Generalized linear model |
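One way to use the table is to filter it by the properties of your data. The sketch below does this for six rows transcribed from the table, all with a single nominal independent variable. The names TableRow, levels_match, and find_tests are hypothetical, and the matching rules are simplifying assumptions, not a complete decision procedure.

```python
from typing import NamedTuple


class TableRow(NamedTuple):
    name: str
    dv_type: str    # type of the dependent variable
    design: str     # "Independent" or "Paired" observations
    levels: str     # levels in the independent variable
    test_type: str


# Six rows transcribed from the table; each has one nominal independent variable.
ROWS = [
    TableRow("Two-sample Mann–Whitney", "Ordinal or interval/ratio", "Independent", "2", "Nonparametric"),
    TableRow("Two-sample paired rank-sum", "Ordinal or interval/ratio", "Paired", "2", "Nonparametric"),
    TableRow("Kruskal–Wallis", "Ordinal or interval/ratio", "Independent", "2 or more", "Nonparametric"),
    TableRow("Two-sample t-test", "Interval/ratio", "Independent", "2", "Parametric"),
    TableRow("Paired t-test", "Interval/ratio", "Paired", "2", "Parametric"),
    TableRow("One-way ANOVA", "Interval/ratio", "Independent", "2 or more", "Parametric"),
]


def levels_match(spec, n_levels):
    """Check a level count against a table entry such as "2" or "2 or more"."""
    if spec == "2":
        return n_levels == 2
    if spec == "2 or more":
        return n_levels >= 2
    return False


def find_tests(dv_type, design, n_levels):
    """Return names of the transcribed tests compatible with the description."""
    return [
        row.name
        for row in ROWS
        if dv_type.lower() in row.dv_type.lower()
        and row.design.lower() == design.lower()
        and levels_match(row.levels, n_levels)
    ]


# Candidate tests for an ordinal DV, independent observations, three groups.
print(find_tests("ordinal", "Independent", 3))
```

The result is a short list of candidates and not a final answer. Each candidate test's assumptions still need to be assessed before proceeding with the analysis.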
References
[IDRE] Institute for Digital Research and Education. 2015. What statistical analysis should I use? UCLA. www.ats.ucla.edu/stat/stata/whatstat/.
McDonald, J.H. 2014. “Choosing a statistical test” in Handbook of Biological Statistics. www.biostathandbook.com/testchoice.html.
Optional readings
[Video] “Choosing which statistical test to use” from Statistics Learning Center (Dr. Nic). 2014. www.youtube.com/watch?v=rulIUAN0U3w.