Python Handbook: Mood’s Median Test for Two-sample Data

Two-sample Mood’s median test in SAEPER

For a discussion of this test, see the corresponding chapter in Summary and Analysis of Extension Program Evaluation in R (rcompanion.org/handbook/F_05.html).

Importing packages in this chapter

The following commands will import required packages used in this chapter from libraries and assign them common aliases. You may need install these libraries first.

import io

import os

import numpy as np

import scipy.stats as stats

import pandas as pd

import matplotlib.pyplot as plt

import pingouin as pg

import seaborn as sns

import math

Setting your working directory

You may wish to set your working directory for exported plots.

os.chdir("C:/Users/Sal Mangiafico/Desktop")

print(os.getcwd())

Example of Mood’s median test

Data = pd.read_table(sep="\\s+", filepath_or_buffer=io.StringIO("""
Speaker Likert
Pooh      3
Pooh      5
Pooh      4
Pooh      4
Pooh      4
Pooh      4
Pooh      4
Pooh      4
Pooh      5
Pooh      5
Piglet    2
Piglet    4
Piglet    2
Piglet    2
Piglet    1
Piglet    2
Piglet    3
Piglet    2
Piglet    2
Piglet    3
"""))

Summarize data treating Likert scores as numeric

Summary = Data.groupby('Speaker')['Likert'].describe()

print(Summary)

count mean std min 25% 50% 75% max

Speaker

Pooh 10.0 4.2 0.632456 3.0 4.0 4.0 4.75 5.0

Piglet 10.0 2.3 0.823273 1.0 2.0 2.0 2.75 4.0

### The 50% percentile is the median for each group

Mood’s median test

Using scipy.stats

Pooh = Data['Likert'][Data['Speaker']=='Pooh']

Piglet = Data['Likert'][Data['Speaker']=='Piglet']

stats.median_test(Piglet, Pooh)

MedianTestResult(statistic=9.8, pvalue=0.0017451186995289028, median=3.5, table=array([[1, 9], [9, 1]], dtype=int64))

The output above reports the test statistic and p-value. The median listed is the overall median of observations pooled across groups. The table is the contingency table that used for the chi-square test for Mood’s median test.

Effect size measurements

Difference in medians

One effect size statistic is the difference in median between groups. From the summary statistics above, the difference in medians is 2, or –2 if we use the convention of First Group – Second Group, assuming Piglet is the first group.

Median1 = 2

Median2 = 4

Median1 - Median2

-2

Rank biserial correlation coefficient and common language effect size statistic

The rank biserial correlation coefficient or the common language effect size statistic may also be an appropriate effect size statistic.

To perform this test, we first need to extract the data as arrays of numbers. Here we’ll treat Piglet as the first group to match the R output.

Pooh = Data['Likert'][Data['Speaker']=='Pooh']

Piglet = Data['Likert'][Data['Speaker']=='Piglet']

pg.mwu(Piglet, Pooh)

U-val alternative p-val RBC CLES

MWU 5.0 two-sided 0.000471 -0.9 0.05

### Note that the output includes the rank biserial correlation coefficient, RBC, and

### the common languange effect size statistic, CLES, as effect size statistics.

Mangiafico’s d

A statistic similar to Cohen’s d can be calculated by dividing the difference in medians by the median absolute deviation.

The following uses a different example.

A = np.array([1,2,2,2,2,3,4,5])

B = np.array([2,3,4,4,4,5,5,5])

madA = stats.median_abs_deviation(A, scale='normal')

round(madA, 3)

0.741

madB = stats.median_abs_deviation(B, scale='normal')

round(madB, 3)

1.483

madPooled = math.sqrt((madA ** 2 + madB ** 2) / 2)

round(madPooled, 3)

1.172

mangiaficoD = ( np.median(A) - np.median(B) ) / madPooled

round(mangiaficoD, 3)

-1.706

Manual calculation of Mood’s median test

medianOverall = Data['Likert'].median()

medianOverall

3.5

PoohHigher = sum(Data['Likert'][Data['Speaker']=='Pooh'] > medianOverall)

PoohHigher

PoohLower = sum(Data['Likert'][Data['Speaker']=='Pooh'] <= medianOverall)

PoohLower

PigletHigher = sum(Data['Likert'][Data['Speaker']=='Piglet'] > medianOverall)

PigletHigher

PigletLower = sum(Data['Likert'][Data['Speaker']=='Piglet'] <= medianOverall)

PigletLower

So the following the contingency table would be used by Mood’s median test.

LessThanMu GreaterThanEqualMu

Pooh 1 9

Piglet 9 1

Matrix = np.array([[1, 9], [9, 1]])

stats.chi2_contingency(Matrix)

Chi2ContingencyResult(statistic=9.8, pvalue=0.0017451186995289028, dof=1

### Note this is the same as the output from Pingouin above

A Python Companion to Extension Program Evaluation

Mood’s Median Test for Two-sample Data