### When to use this test

The two-sample signed-rank test for paired data, often called the Wilcoxon signed-rank test, is used to compare values for two groups where each observation in one group is paired with one observation in the other group. The distribution of differences in the paired samples should be symmetric in shape.

The test is useful for comparing scores on a pre-test with scores on a post-test, or scores or ratings from two speakers, two different presentations, or two audiences, when there is a reason to pair observations, such as each pair of ratings coming from the same rater.

A discussion of paired data can be found in the *Independent
and Paired Values* chapter of this book.

The test is performed with the *wilcox.test* function
with the *paired=TRUE* option.

##### Appropriate data

• Two-sample paired data. That is, one-way data with two groups only, where the observations are paired between groups.

• Dependent variable is ordinal, interval, or ratio

• Independent variable is a factor with two levels. That is, two groups

• The distribution of differences in paired samples is symmetric

##### Hypotheses

• Null hypothesis: The distribution of the differences in paired values is symmetric around zero.

• Alternative hypothesis (two-sided): The distribution of the differences in paired values is not symmetric around zero.

##### Interpretation

Significant results can be reported as e.g. “Values for group A were significantly different from those of group B.”

##### Other notes and alternative tests

If the distribution of differences between paired samples is not symmetrical, the two-sample sign test for paired data can be used. This test is described in the next chapter.

For tests with ordinal dependent variables, cumulative link models or permutation tests, which are described later in this book, may be preferable.

### Packages used in this chapter

The packages used in this chapter include:

• psych

• BSDA

The following commands will install these packages if they are not already installed:

if(!require(psych)){install.packages("psych")}

if(!require(BSDA)){install.packages("BSDA")}

### Two-sample paired signed-rank test example

For this example, imagine we want to compare scores for Pooh between Time 1 and Time 2. Here, we’ve recorded the identity of the student raters, and Pooh’s score for each rater. This allows us to focus on the changes for each rater between Time 1 and Time 2. This makes for a more powerful test than would the Mann–Whitney U test in cases like this where one rater might tend to rate high and another rater might tend to rate low, but there is an overall trend in how raters change their scores between Time 1 and Time 2.

Note in this example we needed to record the identity of the
student rater so that a rater’s score from Time 1 can be paired with their
score from Time 2. If we cannot pair data in this way—for example, if we did
not record the identity of the raters—the data would have to be treated as unpaired,
independent samples, for example like those in the *Two-sample Mann–Whitney U
Test* chapter.

Also note that the data is arranged in long form. In this
form, for this test, the data must be ordered so that the first observation where *Time*
= 1 is paired to the first observation where *Time* = 2, and so on.

Input =("

Speaker Time Student Likert

Pooh 1 a 1

Pooh 1 b 4

Pooh 1 c 3

Pooh 1 d 3

Pooh 1 e 3

Pooh 1 f 3

Pooh 1 g 4

Pooh 1 h 3

Pooh 1 i 3

Pooh 1 j 3

Pooh 2 a 4

Pooh 2 b 5

Pooh 2 c 4

Pooh 2 d 5

Pooh 2 e 4

Pooh 2 f 5

Pooh 2 g 3

Pooh 2 h 4

Pooh 2 i 3

Pooh 2 j 4

")

Data = read.table(textConnection(Input),header=TRUE)

### Check the data frame

library(psych)

headTail(Data)

str(Data)

summary(Data)

### Remove unnecessary objects

rm(Input)
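Because the test pairs observations by position, it can help to confirm the ordering before running it. The following is a minimal sketch, not part of the original analysis. It rebuilds the example data with *data.frame* so that the block runs on its own, sorts by *Time* and then *Student*, and checks that the rater labels line up across the two times. The object names *Raters.1*, *Raters.2*, and *Pairs.match* are introduced here for illustration only.

```r
### Rebuild the example data so this block runs on its own
Data = data.frame(
   Speaker = "Pooh",
   Time    = rep(c(1, 2), each = 10),
   Student = rep(letters[1:10], times = 2),
   Likert  = c(1, 4, 3, 3, 3, 3, 4, 3, 3, 3,
               4, 5, 4, 5, 4, 5, 3, 4, 3, 4))

### Sort by Time, then by Student, so pairs occupy matching positions
Data = Data[order(Data$Time, Data$Student), ]

### Compare rater labels across Time 1 and Time 2 (illustrative names)
Raters.1 = Data$Student[Data$Time == 1]
Raters.2 = Data$Student[Data$Time == 2]

Pairs.match = length(Raters.1) == length(Raters.2) &&
              all(Raters.1 == Raters.2)

Pairs.match   # TRUE when the rater labels line up position by position
```

If *Pairs.match* is FALSE, the long-form data should be reordered or checked for missing raters before the paired test is run.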

#### Plot the paired data

##### Scatter plot with one-to-one line

Paired data can be visualized with a scatter plot of the paired cases. In the plot below, points that fall above and to the left of the blue line indicate cases for which the value for Time 2 was greater than for Time 1.

Note that the points in the plot are jittered slightly so that points which would fall directly on top of one another can be seen.

First, two new variables, *Time.1* and *Time.2*,
are created by extracting the values of *Likert* for observations with the
*Time* variable equal to 1 or 2, respectively, and then the plot is
produced.

Time.1 = Data$Likert[Data$Time==1]

Time.2 = Data$Likert[Data$Time==2]

plot(Time.1, jitter(Time.2), # jitter offsets points so you can see them all

pch = 16, # shape of points

cex = 1.0, # size of points

xlim=c(1, 5.5), # limits of x axis

ylim=c(1, 5.5), # limits of y axis

xlab="Time 1",

ylab="Time 2"

)

abline(0,1, col="blue", lwd=2) # line with intercept of 0 and slope of 1

##### Bar plot of differences

Paired data can also be visualized with a bar chart of differences. In the plot below, bars with a value greater than zero indicate cases for which values for Time 2 are greater than for Time 1.

New variables are first created for *Time.1*, *Time.2*,
and their *Difference*, and then the plot is produced.

Time.1 = Data$Likert[Data$Time==1]

Time.2 = Data$Likert[Data$Time==2]

Difference = Time.2 - Time.1

barplot(Difference, # variable to plot

col="dark gray", # color of bars

xlab="Observation", # x-axis label

ylab="Difference (Time 2 – Time 1)") # y-axis label

##### Bar plot of the distribution of differences

A bar plot of the counts of each difference value can be used to see if the differences in paired observations are symmetrical. It is an assumption of the two-sample paired signed-rank test that the distribution of the differences is symmetrical. For this example, the distribution is relatively symmetrical, suggesting the signed-rank test is appropriate.

Here, new variables are created: *Time.1*, *Time.2*,
*Difference*, and *Diff.f*, which has the same values as *Difference*
but as a factor variable. The *xtabs* function is used to create a count
of values of *Diff.f*. The *barplot* function then uses these
counts.

Time.1 = Data$Likert[Data$Time==1]

Time.2 = Data$Likert[Data$Time==2]

Difference = Time.2 - Time.1

Diff.f = factor(Difference)

XT = xtabs(~ Diff.f)

barplot(XT,

col="dark gray",

xlab="Difference in Likert",

ylab="Frequency")
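As a rough numeric complement to the plot, the differences can be tabulated directly. This sketch re-enters the Likert scores from the example so that it is self-contained. Comparing the number of differences on either side of the median is only an informal check assumed here for illustration, not a formal test of symmetry, and the names *Counts*, *Median.diff*, *Below*, and *Above* are not from the original example.

```r
### Likert scores copied from the example data, in rater order a to j
Time.1 = c(1, 4, 3, 3, 3, 3, 4, 3, 3, 3)
Time.2 = c(4, 5, 4, 5, 4, 5, 3, 4, 3, 4)

Difference = Time.2 - Time.1

### Counts of each difference value (the same counts the bar plot shows)
Counts = table(Difference)
Counts

### Center of the differences
Median.diff = median(Difference)
Median.diff

### Informal check: how many differences lie below and above the median
Below = sum(Difference < Median.diff)
Above = sum(Difference > Median.diff)
c(Below = Below, Above = Above)
```

For these data the median difference is 1, with 2 differences below it and 3 above it, which agrees with the impression of rough symmetry from the plot.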

#### Two-sample paired signed-rank test

Note that if data are in long format, the data must be
ordered so that the first observation of Time 1 is paired to the first
observation of Time 2, and so on, because the *wilcox.test* function will
take the observations in order.

This example uses the formula notation indicating that *Likert*
is the dependent variable and *Time* is the independent variable. The *data=*
option indicates the data frame that contains the variables, and *paired=TRUE*
indicates that the test for paired data should be used. For the meaning of
other options, see *?wilcox.test*.

wilcox.test(Likert ~ Time,

data = Data,

paired = TRUE,

conf.int = TRUE,

conf.level = 0.95)

Wilcoxon signed rank test with continuity correction

V = 3.5, p-value = 0.02355

alternative hypothesis: true location shift is not equal to 0

### Note the p-value given in the above results

95 percent confidence interval:

-2.000051e+00 -1.458002e-05

### Confidence interval of median or location of differences

### You may get a "cannot compute exact p-value with ties" warning.

### You can ignore this or use the exact=FALSE option.
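The test can also be run by passing the two vectors directly, and the reported *V* can be reproduced by hand. The sketch below re-enters the scores so that it runs on its own. Two caveats apply. First, newer versions of R may not accept *paired=TRUE* with the formula interface, in which case the vector form shown here is one alternative. Second, the hand calculation follows the usual definition of the signed-rank statistic (drop zero differences, rank the absolute differences, sum the ranks of the positive differences), which is assumed here to match what *wilcox.test* does internally. The names *Result*, *Diff*, *Ranks*, and *V* are introduced for illustration.

```r
### Likert scores copied from the example data, in rater order a to j
Time.1 = c(1, 4, 3, 3, 3, 3, 4, 3, 3, 3)
Time.2 = c(4, 5, 4, 5, 4, 5, 3, 4, 3, 4)

### Vector interface; exact=FALSE requests the normal approximation,
###   which avoids the messages about ties and zeroes
Result = wilcox.test(Time.1, Time.2,
                     paired = TRUE,
                     exact  = FALSE)
Result

### Hand calculation of V
Diff  = Time.1 - Time.2        # first vector minus second vector
Diff  = Diff[Diff != 0]        # zero differences are dropped
Ranks = rank(abs(Diff))        # tied values receive average ranks
V     = sum(Ranks[Diff > 0])   # sum of ranks for positive differences
V
```

Both approaches give V = 3.5 for these data, matching the output above. Because the differences are taken as Time 1 minus Time 2, the small value of V and the negative confidence interval both reflect higher scores at Time 2.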

### Exercises K

1. Considering Pooh’s data for Time 1 and Time 2,

a. What do the plots suggest about the relative value of the
scores for Time 1 and Time 2? Do they suggest that scores increased,
decreased, or stayed the same between Time 1 and Time 2?

b. Is the distribution of the differences between paired
samples relatively symmetrical? Is the two-sample paired signed-rank test
appropriate in this case?

c. Does the two-sample paired signed-rank test indicate that
there is a significant difference between Time 1 and Time 2?

d. Practically speaking, what do you conclude? If significant, is the difference between Time 1 and Time 2 of practical importance?

2. Lois Griffin gave proficiency scores to her students in her course on piano
playing for adults. She gave a score for each student for their left hand
playing and right hand playing. She wants to know if students in her class are
more proficient in the right hand, left hand, or if there is no difference in
hands.

Instructor Student Hand Score

'Lois Griffin' a left 8

'Lois Griffin' a right 9

'Lois Griffin' b left 6

'Lois Griffin' b right 5

'Lois Griffin' c left 7

'Lois Griffin' c right 9

'Lois Griffin' d left 6

'Lois Griffin' d right 7

'Lois Griffin' e left 7

'Lois Griffin' e right 7

'Lois Griffin' f left 9

'Lois Griffin' f right 9

'Lois Griffin' g left 4

'Lois Griffin' g right 6

'Lois Griffin' h left 5

'Lois Griffin' h right 8

'Lois Griffin' i left 5

'Lois Griffin' i right 6

'Lois Griffin' j left 7

'Lois Griffin' j right 8

For each of the following, answer the question, and **show
the output from the analyses you used to answer the question**.

a. Is the distribution of the differences between paired
samples relatively symmetrical? Is the two-sample paired signed-rank test
appropriate in this case?

b. Does the two-sample paired signed-rank test indicate that
there is a difference between hands? If so, which hand received higher scores?

c. What can you conclude about the results of the plots, summary statistics, and statistical test?

d. Practically speaking, what do you conclude? If significant, is the difference between hands of practical importance?

e. What if Lois wanted to change the design of the experiment
so that she could determine if each student were more proficient in one hand or
the other? That is, is student *a* more proficient in left hand or
right? Is student *b* more proficient in left hand or right? How should
she change what data she’s collecting to determine this?