
Summary and Analysis of Extension Program Evaluation in R

Salvatore S. Mangiafico

Introduction to Traditional Nonparametric Tests

Nonparametric test assumptions

 

The traditional nonparametric tests presented in this book are primarily rank-based tests that use the ranks of data instead of their numeric values.
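
As a minimal sketch of what using "the ranks of data" means, the base R function rank() replaces each value with its position in the sorted data. By default, tied values share the average of the ranks they occupy. The scores vector below is made up for illustration.

```r
# Hypothetical data: one extreme value (40) among small scores
scores <- c(3, 5, 5, 2, 40)

# rank() assigns 1 to the smallest value; ties get the average rank by default
ranks <- rank(scores)
print(ranks)
# 2.0 3.5 3.5 1.0 5.0
```

The value 40 simply receives the highest rank, so how far it lies from the other values has no further influence on a rank-based test.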

 

Nonparametric tests do not assume that the underlying data have any specific distribution.  However, it is important to understand the assumptions of each specific test before using it.

 

Advantages of nonparametric tests

 

•  The tests presented in this section are widely used, so your audience is likely to be familiar with them.

•  They are appropriate for interval/ratio or ordinal dependent variables.

 

•  Their nonparametric nature makes them appropriate for interval/ratio data that don’t meet the assumptions of parametric analyses. These include data that are skewed, non-normal, contain outliers, or possibly are censored. Censored data are data for which values beyond an upper or lower limit are not recorded exactly, for example, when ages under 5 are reported only as “under 5”.

 

Interpretation of nonparametric tests

 

In general, these tests determine if there is a systematic difference between groups.  This may be due to a difference in location (e.g. median) or in the shape or spread of the distribution of the data.  It is therefore appropriate to report significant results as, e.g., “There is a significant difference between Likert scores from the pre-test and the post-test.”  Or, “The significant Mann–Whitney test indicates that Likert scores from the two classes come from different populations.”

 

For the Mann–Whitney, Kruskal–Wallis, and Friedman tests, if the distributions of the groups have the same shape and spread, then it can be assumed that the difference between groups is a difference in medians.  Otherwise, the difference is a difference in distributions.

 

In practice, you should look at the distribution of each group in these tests, with histograms or box plots, so that your conclusions accurately reflect the data.  You don't want to imply that differences between two treatments are differences in medians when they are really differences in the shape or spread of the distributions.  On the other hand, if it really does look like the difference is a difference in location, you want to be clear about this.
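
One way to look at the groups before interpreting a test is sketched below with base R graphics. The data frame dat and its column names are hypothetical, chosen so that the two groups share a median but differ clearly in spread.

```r
# Hypothetical Likert scores (1-5) for two groups; not data from this book
dat <- data.frame(
  group = rep(c("A", "B"), each = 10),
  score = c(3, 3, 3, 3, 3, 2, 4, 3, 3, 3,   # group A: tightly clustered
            1, 1, 1, 2, 3, 3, 5, 5, 5, 5)   # group B: spread to the extremes
)

# Side-by-side box plots to compare location and spread
bp <- boxplot(score ~ group, data = dat, ylab = "Likert score")

# One histogram per group, drawn with the same bins
op <- par(mfrow = c(1, 2))
for (g in unique(dat$group)) {
  hist(dat$score[dat$group == g], breaks = 0:5 + 0.5,
       main = paste("Group", g), xlab = "Likert score")
}
par(op)

# Group medians, to read alongside the plots
group_medians <- tapply(dat$score, dat$group, median)
print(group_medians)
```

With data like these, a significant test result would be better described as a difference in distributions than as a difference in medians.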

 

As a point of interest, Mangiafico (2015) and McDonald (2014) in the “References” section provide an example of a significant Kruskal–Wallis test where the groups have identical medians.
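
The same idea can be sketched with made-up data. This is not the example from the references above, only an illustration under the assumption of two small groups with many tied values. Both groups have the same median, yet one group's remaining values lie below it and the other's lie above it.

```r
# Made-up data: both groups have a median of 10, but A's other values lie
# below 10 and B's lie above it
A <- c(1, 2, 3, 4, 10, 10, 10, 10, 10)
B <- c(10, 10, 10, 10, 10, 16, 17, 18, 19)

medians <- c(A = median(A), B = median(B))
print(medians)

# kruskal.test() accepts a list of numeric vectors, one per group
kw <- kruskal.test(list(A, B))
print(kw)
```

Here the test detects that values in B tend to rank higher than values in A, even though the medians are equal.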

 

Using traditional nonparametric tests with ordinal data

 

Some authors caution against using traditional nonparametric tests with ordinal dependent variables, since many of them were developed for use with continuous (interval/ratio) data.  Other authors argue that, since these tests rank-transform data before analysis and have adjustments for tied ranks, they are appropriate for ordinal data.  Some authors have further concerns about situations where there are likely to be many ties in ranks, such as Likert data.
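
To see how common ties are in Likert-type data, the sketch below tabulates hypothetical responses from two classes and then runs a Mann–Whitney test with base R's wilcox.test(). The vectors class1 and class2 are invented for illustration. Setting exact = FALSE asks for the normal approximation, whose variance includes an adjustment for tied ranks.

```r
# Hypothetical Likert responses (1-5) from two classes
class1 <- c(3, 4, 4, 5, 5, 5, 4, 3, 5, 4)
class2 <- c(2, 3, 3, 1, 4, 3, 2, 3, 2, 4)

# Count how many responses share each value across both classes
tie_counts <- table(c(class1, class2))
print(tie_counts)

# exact = FALSE requests the normal approximation, which adjusts for ties
mw <- wilcox.test(class1, class2, exact = FALSE)
print(mw)
```

With only five possible response values, every observation here is tied with at least one other, which is the situation that concerns some authors.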

Using permutation tests and ordinal regression for ordinal data

 

The following sections of this book, Ordinal Tests with Cumulative Link Models and Permutation Tests, describe different approaches to handling ordinal data that may be better than the traditional nonparametric tests described in this section, at least in some cases.
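
The logic of a permutation test can be sketched in base R, as below. This is an illustrative Monte Carlo version only, and the later chapter may rely on dedicated packages instead. The data, the mean_diff helper, and the choice of a difference in means as the test statistic (which treats the Likert scores as numeric) are all assumptions made for this example.

```r
set.seed(1)  # for reproducible shuffles

# Hypothetical Likert responses (1-5) from two classes
score <- c(3, 4, 4, 5, 5, 5, 4, 3, 5, 4,
           2, 3, 3, 1, 4, 3, 2, 3, 2, 4)
group <- rep(c("class1", "class2"), each = 10)

# Test statistic: difference in mean score between the two classes
mean_diff <- function(s, g) mean(s[g == "class1"]) - mean(s[g == "class2"])
observed <- mean_diff(score, group)

# Shuffle the group labels many times to build the null distribution
n_perm <- 5000
perm_diffs <- replicate(n_perm, mean_diff(score, sample(group)))

# Two-sided p-value: share of shuffles at least as extreme as observed
# (the +1 counts the observed arrangement itself)
p_value <- (sum(abs(perm_diffs) >= abs(observed)) + 1) / (n_perm + 1)
print(c(observed = observed, p_value = p_value))
```

Because the null distribution is built from the data themselves, no particular distribution has to be assumed for the scores.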

 

Using traditional nonparametric tests with interval/ratio data

 

These nonparametric tests are commonly used for interval/ratio data when the data fail to meet the assumptions of parametric analysis. 

 

Some authors discourage using common nonparametric tests for interval/ratio data.

 

•  One issue is the interpretation of the results mentioned above.  That is, often results are incorrectly interpreted as a difference in medians when they are really describing a difference in distributions.

 

•  Another problem is the lack of flexibility in the designs these tests can handle.  For example, there is no common equivalent for a parametric two-way analysis of variance.

 

•  Finally, these tests may lack power relative to their parametric equivalents.
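
The power comparison in the last point can be explored by simulation. The sketch below assumes one specific scenario (two normal populations with equal spread, 15 observations per group, and a shift of 0.8 standard deviations) and estimates how often a t-test and a Mann–Whitney test each reject at the 0.05 level. All settings are arbitrary choices for illustration.

```r
set.seed(42)  # for reproducible simulation

# Assumed scenario: two normal populations, means 0 and 0.8, sd 1, n = 15 each
n_sims <- 1000
n <- 15
shift <- 0.8

# For each simulated data set, record whether each test rejects at alpha = 0.05
# (t.test() uses the Welch version by default)
rejections <- replicate(n_sims, {
  x <- rnorm(n, mean = 0)
  y <- rnorm(n, mean = shift)
  c(t_test   = t.test(x, y)$p.value < 0.05,
    wilcoxon = wilcox.test(x, y)$p.value < 0.05)
})

# Estimated power: proportion of simulations in which each test rejects
power_est <- rowMeans(rejections)
print(power_est)
```

When the data really are normal, the t-test is expected to have a modest edge. Replacing rnorm() with a skewed or heavy-tailed generator is a simple way to see how the comparison changes when parametric assumptions fail.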

 

Given these considerations and the fact that parametric statistics are often relatively robust to minor deviations in their assumptions, some authors argue that it is often better to stick with parametric analyses for interval/ratio data if it’s possible to make them work.

 

References

 

“Kruskal–Wallis Test” in Mangiafico, S.S. 2015. An R Companion for the Handbook of Biological Statistics, version 1.09. rcompanion.org/rcompanion/d_06.html.

 

“Kruskal–Wallis Test” in McDonald, J.H. 2014. Handbook of Biological Statistics. www.biostathandbook.com/kruskalwallis.html.