
A Python Companion to Extension Program Evaluation

Salvatore S. Mangiafico

Two-sample Paired Wilcoxon Signed-rank Test

Two-sample paired Wilcoxon signed rank test in SAEPER

 

For a discussion of this test, see the corresponding chapter in Summary and Analysis of Extension Program Evaluation in R (rcompanion.org/handbook/F_06.html).

 

Importing packages in this chapter

 

The following commands import the packages used in this chapter and assign them common aliases.  You may need to install these libraries first.

 

import io

 

import os

 

import numpy as np

 

import scipy.stats as stats

 

import pandas as pd

 

import matplotlib.pyplot as plt

 

import pingouin as pg

 

import seaborn as sns

 

 

Setting your working directory

 

You may wish to set your working directory for exported plots.

 

os.chdir("C:/Users/Sal Mangiafico/Desktop")

 

print(os.getcwd())

 

 

Example of paired Wilcoxon signed rank test

 

Data = pd.read_table(sep="\\s+", filepath_or_buffer=io.StringIO("""

 Speaker  Time  Student  Likert

 Pooh      1     a        1

 Pooh      1     b        4

 Pooh      1     c        3

 Pooh      1     d        3

 Pooh      1     e        3

 Pooh      1     f        3

 Pooh      1     g        4

 Pooh      1     h        3

 Pooh      1     i        3

 Pooh      1     j        3

 Pooh      2     a        4

 Pooh      2     b        5

 Pooh      2     c        4

 Pooh      2     d        5

 Pooh      2     e        4

 Pooh      2     f        5

 Pooh      2     g        3

 Pooh      2     h        4

 Pooh      2     i        3

 Pooh      2     j        4
"""))

 

### Convert Speaker, Student, and Time to category type

 

Data['Speaker']  = Data['Speaker'].astype('category')

 

Data['Student']  = Data['Student'].astype('category')

 

Data['Time']  = Data['Time'].astype('category')

 

 

print(Data.info())

 

#   Column   Non-Null Count  Dtype  

---  ------   --------------  -----  

 0   Speaker  20 non-null     category

 1   Time     20 non-null     category

 2   Student  20 non-null     category

 3   Likert   20 non-null     int64

 

 

Number of observations per group

It is helpful to check the data to be sure there is one observation per student per time.

 

pd.crosstab(Data['Time'], Data['Student'])

 

Student  a  b  c  d  e  f  g  h  i  j

Time                                

1        1  1  1  1  1  1  1  1  1  1

2        1  1  1  1  1  1  1  1  1  1
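The same check can be done programmatically rather than by eye. The following is a minimal sketch on a hypothetical two-student miniature of the data; the names Example and AllPaired are illustrative and not part of the chapter's code.

```python
import pandas as pd

# Hypothetical miniature of the chapter's data: two students, two times
Example = pd.DataFrame({
    "Time":    [1, 1, 2, 2],
    "Student": ["a", "b", "a", "b"],
    "Likert":  [1, 4, 4, 5],
})

Counts = pd.crosstab(Example["Time"], Example["Student"])

# True only if every Time-by-Student cell holds exactly one observation
AllPaired = bool((Counts == 1).all().all())

print(AllPaired)
```

If AllPaired is False, some student is missing an observation or has a duplicate at one of the times, and the pairing should be fixed before continuing.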

 

 

Plot the paired data

 

For plotting and the subsequent analysis, we’ll extract arrays for Time 1 and Time 2.  It’s important at this point that data are ordered so that the first observation in Time 1 is paired with the first observation in Time 2, and so on.

 

Time1 = np.array(Data['Likert'][Data['Time']==1])

 

Time2 = np.array(Data['Likert'][Data['Time']==2])

 

Difference = Time2 - Time1
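If the rows might not already be in matching order, one alternative is to let pandas align the pairs by student. The following is a sketch on hypothetical, deliberately shuffled data; the names Long, Wide, T1, T2, and Diff are illustrative only.

```python
import pandas as pd

# Hypothetical long-format data, deliberately out of order
Long = pd.DataFrame({
    "Student": ["b", "a", "c", "c", "a", "b"],
    "Time":    [1, 1, 1, 2, 2, 2],
    "Likert":  [4, 1, 3, 4, 4, 5],
})

# One row per Student and one column per Time,
# so the pairs line up by construction
Wide = Long.pivot(index="Student", columns="Time", values="Likert")

T1 = Wide[1].to_numpy()
T2 = Wide[2].to_numpy()

Diff = T2 - T1

print(Wide)
print(Diff)
```

Because pivot places each student's two scores on the same row, the resulting arrays are paired correctly regardless of the row order in the original data frame.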

 

 

Scatter plot with one-to-one line

 

We’ll have to jitter one of the arrays so that all the points will be displayed on the plot.

 

Note that in the scatter plot, the points tend to be above and to the left of the one-to-one line, suggesting that Time 2 tends to have higher values than Time 1.

 

Time1Jitter = Time1 + np.random.normal(0, 0.3, Time2.shape)
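Because the jitter is random, the plot will look slightly different each time the code is run. If a reproducible plot is wanted, one option is a seeded NumPy random generator. This is a sketch only; the seed value is arbitrary, and the Time 1 scores are retyped here so the block runs on its own.

```python
import numpy as np

# Time 1 scores from the example data, retyped so this block runs on its own
Time1 = np.array([1, 4, 3, 3, 3, 3, 4, 3, 3, 3])

# A seeded generator produces the same jitter on every run
rng = np.random.default_rng(seed=12345)

Time1Jitter = Time1 + rng.normal(loc=0, scale=0.3, size=Time1.shape)

print(Time1Jitter)
```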

 

 

Simple seaborn call

 

sns.scatterplot(x=Time1Jitter, y=Time2, color='#000000')

plt.axline(xy1=(0,0), slope=1)

plt.show()

 

 

Formatting and export as file

 

sns.scatterplot(x=Time1Jitter, y=Time2, color='#000000')

 

plt.title('')

plt.xlabel("\nTime 1")

plt.ylabel('Time 2\n')

plt.xlim(-0.2, 5.2)

plt.ylim(-0.2, 5.2)

plt.axline(xy1=(0,0), slope=1)

plt.tight_layout()

 

plt.savefig('ScatterAndLine.png', format='png', dpi=300)

 

plt.show()

 

[Figure: scatter plot of Time 2 against jittered Time 1, with one-to-one line]

 

 

Bar plot of differences

 

Note that there are higher counts for differences greater than zero than for differences less than zero.  This suggests that the difference (Time 2 – Time 1) tends to be positive, that is, that Time 2 tends to have higher values than Time 1.

 

Simple seaborn call

 

sns.countplot(x=Difference)
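The counts shown in the bar plot can also be tabulated directly. The following sketch retypes the example scores so that it runs on its own; the names Values, Counts, Positive, Negative, and Zero are illustrative.

```python
import numpy as np

# Scores from the example data, in Student order a through j
Time1 = np.array([1, 4, 3, 3, 3, 3, 4, 3, 3, 3])
Time2 = np.array([4, 5, 4, 5, 4, 5, 3, 4, 3, 4])

Difference = Time2 - Time1

# Frequency of each distinct difference, as displayed in the bar plot
Values, Counts = np.unique(Difference, return_counts=True)

for v, c in zip(Values, Counts):
    print(v, c)

# Tally of the direction of the differences
Positive = int((Difference > 0).sum())
Negative = int((Difference < 0).sum())
Zero     = int((Difference == 0).sum())

print(Positive, Negative, Zero)
```

For these data, eight of the ten differences are positive, one is negative, and one is zero.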

 

 

Formatting and export as file

 

sns.set_theme(style='darkgrid')

 

plt.figure(figsize=(5, 3.75))

 

sns.countplot(x=Difference)

 

plt.title('')

plt.xlabel("\nDifference in score (Time 2 – Time 1)")

plt.ylabel('Frequency\n')

plt.tight_layout()

 

plt.savefig('BarPlotDifferences.png', format='png', dpi=300)

 

plt.show()

 

[Figure: bar plot of differences in score (Time 2 – Time 1)]

 

 

Paired-samples Wilcoxon signed-rank test

 

Using pingouin

 

pg.wilcoxon(x=Time1, y=Time2)

 

          W-val alternative     p-val       RBC  CLES

Wilcoxon    3.5   two-sided  0.023552 -0.844444  0.16

 

 

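Note that pingouin reports an effect size statistic, the rank biserial correlation coefficient (RBC), by default.  It is negative here because x is Time 1 and y is Time 2, and Time 1 scores tend to be lower than Time 2 scores.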
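To show where the W and RBC values come from, the following sketch computes them by hand. It assumes that zero differences are dropped, that tied absolute differences receive average ranks, and that RBC is the difference between the positive and negative rank sums divided by their total. These assumptions reproduce the values reported above, but they are not taken from pingouin's source code.

```python
import numpy as np
from scipy import stats

# Scores from the example data, in Student order a through j
Time1 = np.array([1, 4, 3, 3, 3, 3, 4, 3, 3, 3])
Time2 = np.array([4, 5, 4, 5, 4, 5, 3, 4, 3, 4])

# Differences as x - y, in the same direction as pg.wilcoxon(x=Time1, y=Time2)
d = Time1 - Time2

# Assumption: zero differences are dropped before ranking
d = d[d != 0]

# Rank the absolute differences; ties receive average ranks
Ranks = stats.rankdata(np.abs(d))

RPlus  = Ranks[d > 0].sum()    # sum of ranks for positive differences
RMinus = Ranks[d < 0].sum()    # sum of ranks for negative differences

W   = min(RPlus, RMinus)
RBC = (RPlus - RMinus) / (RPlus + RMinus)

print(W)
print(round(RBC, 6))
```

Only one student (g) scored higher at Time 1 than at Time 2, so nearly all of the rank sum falls on the negative side, which is why RBC is close to -1.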

 

Using scipy.stats

 

Note that the continuity correction is not used by default in scipy.stats, whereas it is used by default in pingouin.

 

stats.wilcoxon(Time1, Time2, correction=True)

 

WilcoxonResult(statistic=3.5, pvalue=0.020041916312799807)
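To see what the continuity correction does, the normal approximation for the test can be worked by hand. The following is a sketch under stated assumptions: zero differences are dropped, the variance is adjusted for ties, and the correction moves the statistic 0.5 toward its mean. It is meant to illustrate the calculation, not to reproduce the internals of either library.

```python
import numpy as np
from scipy import stats

# Scores from the example data, in Student order a through j
Time1 = np.array([1, 4, 3, 3, 3, 3, 4, 3, 3, 3])
Time2 = np.array([4, 5, 4, 5, 4, 5, 3, 4, 3, 4])

d = Time1 - Time2
d = d[d != 0]                  # assumption: zero differences are dropped
n = d.size

Ranks = stats.rankdata(np.abs(d))
W = min(Ranks[d > 0].sum(), Ranks[d < 0].sum())

# Mean and tie-adjusted variance of W under the null hypothesis
Mean = n * (n + 1) / 4
_, TieCounts = np.unique(np.abs(d), return_counts=True)
Var = n * (n + 1) * (2 * n + 1) / 24 - (TieCounts**3 - TieCounts).sum() / 48
SE = np.sqrt(Var)

# Without the correction, and with W moved 0.5 toward the mean
ZPlain = (W - Mean) / SE
ZCorr  = (W - Mean - 0.5 * np.sign(W - Mean)) / SE

PPlain = 2 * stats.norm.cdf(-abs(ZPlain))
PCorr  = 2 * stats.norm.cdf(-abs(ZCorr))

print(round(PPlain, 6))
print(round(PCorr, 6))
```

Under these assumptions, the corrected p-value is about 0.0236, which agrees with the pingouin result above, and the uncorrected p-value is about 0.0200. The correction always makes the p-value somewhat larger, that is, more conservative.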