[banner]

A Python Companion to Extension Program Evaluation

Salvatore S. Mangiafico

Mood’s Median Test for Two-sample Data

Two-sample Mood’s median test in SAEPER

 

For a discussion of this test, see the corresponding chapter in Summary and Analysis of Extension Program Evaluation in R (rcompanion.org/handbook/F_05.html).

 

Importing packages in this chapter

 

The following commands will import required packages used in this chapter from libraries and assign them common aliases.  You may need install these libraries first.

 

import io

 

import os

 

import numpy as np

 

import scipy.stats as stats

 

import pandas as pd

 

import matplotlib.pyplot as plt

 

import pingouin as pg

 

import seaborn as sns

 

import math

 

 

Setting your working directory

 

You may wish to set your working directory for exported plots.

 

os.chdir("C:/Users/Sal Mangiafico/Desktop")

 

print(os.getcwd())

 

 

Example of Mood’s median test

 

Data = pd.read_table(sep="\\s+", filepath_or_buffer=io.StringIO("""
 Speaker  Likert
 Pooh      3
 Pooh      5
 Pooh      4
 Pooh      4
 Pooh      4
 Pooh      4
 Pooh      4
 Pooh      4
 Pooh      5
 Pooh      5
 Piglet    2
 Piglet    4
 Piglet    2
 Piglet    2
 Piglet    1
 Piglet    2
 Piglet    3
 Piglet    2
 Piglet    2
 Piglet    3
"""))



Summarize data treating Likert scores as numeric


Summary = Data.groupby('Speaker')['Likert'].describe()

 

print(Summary)

 

         count  mean       std  min  25%  50%   75%  max

Speaker                                                

Pooh      10.0   4.2  0.632456  3.0  4.0  4.0  4.75  5.0

Piglet    10.0   2.3  0.823273  1.0  2.0  2.0  2.75  4.0

 

### The 50% percentile is the median for each group

 

 

Mood’s median test

 

Using scipy.stats

 

Pooh = Data['Likert'][Data['Speaker']=='Pooh']

 

Piglet = Data['Likert'][Data['Speaker']=='Piglet']

 

stats.median_test(Piglet, Pooh)

 

MedianTestResult(statistic=9.8, pvalue=0.0017451186995289028, median=3.5, table=array([[1, 9], [9, 1]], dtype=int64))

 

 

The output above reports the test statistic and p-value.  The median listed is the overall median of observations pooled across groups.  The table is the contingency table that used for the chi-square test for Mood’s median test.

 

Effect size measurements

 

Difference in medians

 

One effect size statistic is the difference in median between groups.  From the summary statistics above, the difference in medians is 2, or –2 if we use the convention of First Group – Second Group, assuming Piglet is the first group.

 

Median1 = 2

 

Median2 = 4

 

Median1 - Median2

 

-2

 

 

Rank biserial correlation coefficient and common language effect size statistic

 

The rank biserial correlation coefficient or the common language effect size statistic may also be an appropriate effect size statistic.

 

To perform this test, we first need to extract the data as arrays of numbers.  Here we’ll treat Piglet as the first group to match the R output.

 

Pooh = Data['Likert'][Data['Speaker']=='Pooh']

 

Piglet = Data['Likert'][Data['Speaker']=='Piglet']

 

pg.mwu(Piglet, Pooh)

 

     U-val alternative     p-val  RBC  CLES

MWU    5.0   two-sided  0.000471 -0.9  0.05

 

### Note that the output includes the rank biserial correlation coefficient, RBC, and

###   the common languange effect size statistic, CLES, as effect size statistics.

 

 

Mangiafico’s d

 

A statistic similar to Cohen’s d can be calculated by dividing the difference in medians by the median absolute deviation. 

 

The following uses a different example.

 

A     = np.array([1,2,2,2,2,3,4,5])

B     = np.array([2,3,4,4,4,5,5,5])

 

madA = stats.median_abs_deviation(A, scale='normal')

 

round(madA, 3)

 

0.741

 

 

madB = stats.median_abs_deviation(B, scale='normal')

 

round(madB, 3)

 

1.483

 

 

madPooled = math.sqrt((madA ** 2 + madB ** 2) / 2)

 

round(madPooled, 3)

 

1.172

 

 

mangiaficoD = ( np.median(A) - np.median(B) ) / madPooled

 

round(mangiaficoD, 3)

 

-1.706

 

 

Manual calculation of Mood’s median test

 

medianOverall = Data['Likert'].median()

 

medianOverall

 

3.5

 

 

PoohHigher = sum(Data['Likert'][Data['Speaker']=='Pooh'] > medianOverall)

 

PoohHigher

 

 9

 

 

PoohLower = sum(Data['Likert'][Data['Speaker']=='Pooh'] <= medianOverall)

 

PoohLower

 

 1

 

 

PigletHigher = sum(Data['Likert'][Data['Speaker']=='Piglet'] > medianOverall)

 

PigletHigher

 

  1

 

 

PigletLower = sum(Data['Likert'][Data['Speaker']=='Piglet'] <= medianOverall)

 

PigletLower

 

  9

 

 

So the following the contingency table would be used by Mood’s median test.

 

        LessThanMu  GreaterThanEqualMu

Pooh             1                   9

Piglet           9                   1

 

 

Matrix = np.array([[1, 9], [9, 1]])

 

stats.chi2_contingency(Matrix)

 

Chi2ContingencyResult(statistic=9.8, pvalue=0.0017451186995289028, dof=1

 

### Note this is the same as the output from Pingouin above