## Summary and Analysis of Extension Program Evaluation in R

Salvatore S. Mangiafico

# Paired t-test

The paired t-test is commonly used.  It compares the means of two populations of paired observations by testing whether the mean of the differences between pairs is statistically different from zero.

##### Appropriate data

•  Two-sample data.  That is, one measurement variable in two groups or samples

•  Dependent variable is interval/ratio, and is continuous

•  Independent variable is a factor with two levels.  That is, two groups

•  Data are paired.  That is, the measurement for each observation in one group can be paired logically or by subject to a measurement in the other group

•  The distribution of the difference of paired measurements is normally distributed

•  Moderate skewness is permissible if the data distribution is unimodal without outliers

##### Hypotheses

•  Null hypothesis:  The population mean of the differences between paired observations is equal to zero.

•  Alternative hypothesis (two-sided): The population mean of the differences between paired observations is not equal to zero.
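Because the hypotheses concern only the mean of the differences, a paired t-test amounts to a one-sample t-test on those differences. The following is a minimal sketch of that idea, using the scores from the example later in this chapter. The object names (tStat, OneSample, and so on) are chosen here for illustration only.

```r
# Scores from the Dumbland Extension example in this chapter,
# listed in the same student order (a through j) in both vectors
Before = c(65, 75, 86, 69, 60, 81, 88, 53, 75, 73)
After  = c(77, 98, 92, 77, 65, 77, 100, 73, 93, 75)

Difference = After - Before

n     = length(Difference)
SE    = sd(Difference) / sqrt(n)        # standard error of the mean difference
tStat = mean(Difference) / SE           # t statistic under a null mean of zero
p     = 2 * pt(-abs(tStat), df = n - 1) # two-sided p-value

c(t = tStat, df = n - 1, p = p)

# The same test expressed as a one-sample t-test on the differences
OneSample = t.test(Difference, mu = 0)

OneSample
```

The hand calculation and the one-sample test should agree with the paired t-test output shown later in the chapter.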

##### Interpretation

Reporting significant results as “Mean of variable Y for group A was different from that for group B” or “Variable Y increased from before to after” is acceptable.

##### Other notes and alternative tests

•  The nonparametric analogue for this test is the two-sample paired signed-rank test.

•  Power analysis for the paired t-test can be found at Mangiafico (2015) in the “References” section.
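As a rough illustration of the two notes above, the sketch below runs a signed-rank test on the scores from the example later in this chapter, and then a power calculation with base R's power.t.test. The effect size and standard deviation passed to power.t.test are hypothetical values chosen only to show the call. They are not taken from this chapter or from Mangiafico (2015).

```r
# Scores from the Dumbland Extension example in this chapter,
# listed in the same student order (a through j) in both vectors
Before = c(65, 75, 86, 69, 60, 81, 88, 53, 75, 73)
After  = c(77, 98, 92, 77, 65, 77, 100, 73, 93, 75)

# Nonparametric alternative: Wilcoxon signed-rank test on the pairs.
# exact = FALSE requests the normal approximation, because tied
# differences prevent calculation of an exact p-value.
Signed = wilcox.test(After, Before, paired = TRUE, exact = FALSE)

Signed

# Power analysis: number of pairs needed to detect a mean difference
# of 5 points when the standard deviation of the differences is 8.
# Both numbers are hypothetical and used only for illustration.
Power = power.t.test(delta     = 5,
                     sd        = 8,
                     sig.level = 0.05,
                     power     = 0.80,
                     type      = "paired")

Power
```

In practice, the difference and standard deviation used in a power analysis would come from pilot data or from a judgment about the smallest difference of practical interest.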

### Packages used in this chapter

The packages used in this chapter include:

•  psych

•  rcompanion

•  lsr

The following commands will install these packages if they are not already installed:

if(!require(psych)){install.packages("psych")}
if(!require(rcompanion)){install.packages("rcompanion")}
if(!require(lsr)){install.packages("lsr")}

### Paired t-test example

In the following example, Dumbland Extension had adult students fill out a financial literacy knowledge questionnaire both before and after completing a home financial management workshop.  Each student’s score before and after was paired by student.

Note in the following data that the students’ names are repeated, so that there is a before score for student a and an after score for student a.

Since the data is in long form, we’ll order by Time, then Student to be sure the first observation for Before is student a and the first observation for After is student a, and so on.

Input = ("
Time    Student  Score
Before  a         65
Before  b         75
Before  c         86
Before  d         69
Before  e         60
Before  f         81
Before  g         88
Before  h         53
Before  i         75
Before  j         73
After   a         77
After   b         98
After   c         92
After   d         77
After   e         65
After   f         77
After   g        100
After   h         73
After   i         93
After   j         75
")

Data = read.table(textConnection(Input),header=TRUE)

###  Order data by Time and Student

Data = Data[order(Data$Time, Data$Student),]

###  Check the data frame

library(psych)

headTail(Data)

str(Data)

summary(Data)

### Remove unnecessary objects

rm(Input)
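Ordering the rows works, but it relies on position to keep each student's scores together. One alternative, sketched here under the assumption that every student appears once per time period, is to match the rows explicitly on Student with merge. The object names BeforeData, AfterData, and Wide are illustrative and are not used elsewhere in this chapter.

```r
# Rebuild the example data so that this sketch runs on its own
Data = data.frame(
  Time    = rep(c("Before", "After"), each = 10),
  Student = rep(letters[1:10], times = 2),
  Score   = c(65, 75, 86, 69, 60, 81, 88, 53, 75, 73,
              77, 98, 92, 77, 65, 77, 100, 73, 93, 75))

# Split by time period, then match rows on Student rather than on position
BeforeData = Data[Data$Time == "Before", c("Student", "Score")]
AfterData  = Data[Data$Time == "After",  c("Student", "Score")]

Wide = merge(BeforeData, AfterData,
             by       = "Student",
             suffixes = c(".Before", ".After"))

# Each row now holds one student's pair of scores
Wide$Difference = Wide$Score.After - Wide$Score.Before

Wide
```

With the data in this wide form, the pairing no longer depends on how the long-form rows happen to be sorted.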

#### Check the number of paired observations

It is helpful to create a table of the counts of observations to be sure that there is one observation for each student for each time period.

xtabs(~ Student + Time,
      data = Data)

       Time
Student After Before
      a     1      1
      b     1      1
      c     1      1
      d     1      1
      e     1      1
      f     1      1
      g     1      1
      h     1      1
      i     1      1
      j     1      1
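For larger data sets, reading the table by eye becomes tedious. The same check can be made programmatically, as in the sketch below. The object names Counts, Balanced, and Problem are illustrative.

```r
# Rebuild the example data so that this sketch runs on its own
Data = data.frame(
  Time    = rep(c("Before", "After"), each = 10),
  Student = rep(letters[1:10], times = 2),
  Score   = c(65, 75, 86, 69, 60, 81, 88, 53, 75, 73,
              77, 98, 92, 77, 65, 77, 100, 73, 93, 75))

Counts = xtabs(~ Student + Time, data = Data)

# TRUE only if every student has exactly one Before and one After score
Balanced = all(Counts == 1)

Balanced

# Names of any students who break the pattern (none are expected here)
Problem = rownames(Counts)[apply(Counts != 1, 1, any)]

Problem
```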

#### Histogram of difference data

A histogram with a normal curve superimposed will be used to check whether the paired differences between the two populations are approximately normal in distribution.

First, two new variables, Before and After, are created by extracting the values of Score for observations with the Time variable equal to Before or After, respectively.

Note that for this code to make sense, the first observation for Before is student a and the first observation for After is student a, and so on.

Before = Data$Score[Data$Time=="Before"]

After  = Data$Score[Data$Time=="After"]

Difference = After - Before

x = Difference

library(rcompanion)

plotNormalHistogram(x,
                    xlab="Difference (After - Before)")
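As a possible complement to the histogram, base R offers a normal quantile plot and the Shapiro–Wilk test. The sketch below applies both to the differences. With only ten pairs, the test has little power to detect non-normality, so it is probably best read alongside the plots rather than in place of them.

```r
# Scores from the example, in the same student order (a through j)
Before = c(65, 75, 86, 69, 60, 81, 88, 53, 75, 73)
After  = c(77, 98, 92, 77, 65, 77, 100, 73, 93, 75)

Difference = After - Before

# Shapiro-Wilk test: a small p-value suggests a departure from normality
Shapiro = shapiro.test(Difference)

Shapiro

# Normal quantile plot: points near the line suggest approximate normality
qqnorm(Difference)
qqline(Difference, col = "blue")
```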

#### Plot the paired data

##### Scatter plot with one-to-one line

Paired data can be visualized with a scatter plot of the paired cases.  In the plot below, points that fall above and to the left of the blue line indicate cases for which the value for After was greater than for Before.

Note that the points in the plot are jittered slightly so that points that would fall directly on top of one another can be seen.

First, two new variables, Before and After, are created by extracting the values of Score for observations with the Time variable equal to Before or After, respectively.

Note that for this code to make sense, the first observation for Before is student a and the first observation for After is student a, and so on.

A variable Names is also created for point labels.

Before = Data$Score[Data$Time=="Before"]

After  = Data$Score[Data$Time=="After"]

Names  = Data$Student[Data$Time=="Before"]

plot(Before, jitter(After),    # jitter offsets points so you can see them all
     pch = 16,                 # shape of points
     cex = 1.0,                # size of points
     xlim=c(50, 110),          # limits of x-axis
     ylim=c(50, 110),          # limits of y-axis
     xlab="Before",            # label for x-axis
     ylab="After")             # label for y-axis

text(Before, After, labels=Names,  # Label location and text
     pos=3, cex=1.0)               # Label text position and size

abline(0, 1, col="blue", lwd=2)     # line with intercept of 0 and slope of 1

##### Bar plot of differences

Paired data can also be visualized with a bar chart of differences.  In the plot below, bars with a value greater than zero indicate cases for which the value for After was greater than for Before.

New variables are first created for Before, After, and their Difference.

Note that for this code to make sense, the first observation for Before is student a and the first observation for After is student a, and so on.

A variable Names is also created for bar labels.

Before = Data$Score[Data$Time=="Before"]

After  = Data$Score[Data$Time=="After"]

Difference = After - Before

Names = Data$Student[Data$Time=="Before"]

barplot(Difference,                             # variable to plot
        col="dark gray",                        # color of bars
        xlab="Observation",                     # x-axis label
        ylab="Difference (After - Before)",     # y-axis label
        names.arg=Names)                        # labels for bars

#### Paired t-test

Note that the output shows the p-value for the test, and the simple difference in the means for the two groups.

Note that for this test to be conducted correctly, the first observation for Before is student a and the first observation for After is student a, and so on.

t.test(Score ~ Time,
data   = Data,
paired = TRUE)

Paired t-test

t = 3.8084, df = 9, p-value = 0.004163
alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:
4.141247 16.258753

sample estimates:
mean of the differences
10.2
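Some recent releases of R may not accept paired = TRUE together with the formula interface of t.test. If the call above produces an error, one workaround is to pass the two score vectors directly, as sketched below. This assumes the rows have been ordered so that the two vectors list the students in the same order. The object name Result is illustrative.

```r
# Rebuild and order the example data so that this sketch runs on its own
Data = data.frame(
  Time    = rep(c("Before", "After"), each = 10),
  Student = rep(letters[1:10], times = 2),
  Score   = c(65, 75, 86, 69, 60, 81, 88, 53, 75, 73,
              77, 98, 92, 77, 65, 77, 100, 73, 93, 75))

Data = Data[order(Data$Time, Data$Student),]

Before = Data$Score[Data$Time=="Before"]
After  = Data$Score[Data$Time=="After"]

# After is listed first, so the estimate is the mean of (After - Before)
Result = t.test(After, Before, paired = TRUE)

Result

# Individual pieces of the result can be extracted by name
Result$estimate     # mean of the differences
Result$conf.int     # 95 percent confidence interval
Result$p.value      # two-sided p-value
```

Passing After first keeps the sign of the estimate the same as in the output above, where the mean of the differences is positive.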

### Effect size

Cohen’s d can be used as an effect size statistic for a paired t-test.  It is calculated as the difference between the means of each group, all divided by the standard deviation of the data.  The standard deviation used could be calculated from the differences between observations, or, for observations across two times, the observations in the “before” group.

It ranges from 0 to infinity, with 0 indicating no effect where the means are equal.  In some versions, Cohen’s d can be positive or negative depending on which mean is greater.

A Cohen’s d of 0.5 suggests that the means differ by one-half the standard deviation of the data.  A Cohen’s d of 1.0 suggests that the means differ by one standard deviation of the data.

#### Interpretation of Cohen’s d

Interpretation of effect sizes necessarily varies by discipline and the expectations of the experiment, but for behavioral studies, the guidelines proposed by Cohen (1988) are sometimes followed.  They should not be considered universal.

|           | Small       | Medium      | Large |
|-----------|-------------|-------------|-------|
| Cohen’s d | 0.2 – < 0.5 | 0.5 – < 0.8 | ≥ 0.8 |

____________________________

Source: Cohen (1988).
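The cutoffs in the table can be applied in code with a small helper function. The function below is a sketch written for this purpose and does not come from any package. The label “below small” for values under 0.2 is an assumption added here, since the table does not name that range.

```r
# Hypothetical helper: label the magnitude of Cohen's d using the cutoffs
# in the table above. The sign of d is ignored.
interpretD = function(d) {
  as.character(
    cut(abs(d),
        breaks = c(0, 0.2, 0.5, 0.8, Inf),
        labels = c("below small", "small", "medium", "large"),
        right  = FALSE))      # intervals are closed on the left, e.g. [0.5, 0.8)
}

interpretD(c(0.1, 0.35, 0.5, -1.204314))
```

These labels are only a convenience. As noted above, the guidelines should not be considered universal.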

#### Cohen’s d for paired t-test

##### Standard deviation calculated from differences in observations

Note that for this code to make sense, the first observation for Before is student a and the first observation for After is student a, and so on.

library(lsr)

cohensD(Score ~ Time,
data   = Data,
method = "paired")

[1] 1.204314

Or we can calculate the value manually.

Before = Data$Score[Data$Time=="Before"]

After  = Data$Score[Data$Time=="After"]

Difference = After - Before

mean(Before)

[1] 72.5

mean(After)

[1] 82.7

mean(Before) - mean(After)

[1] -10.2

( mean(Before) - mean(After) ) / sd(Difference)

[1] -1.204314

##### Standard deviation calculated from the “before” observations

Note that for this code to make sense, the “before” observations must correspond to the second level of the Time variable, corresponding to y.sd in the function.

library(lsr)

cohensD(Score ~ Time,
data   = Data,
method = "y.sd")

[1] 0.9174268

( mean(Before) - mean(After) ) / sd(Before)

[1] -0.9174268

### References

“Paired t–test” in Mangiafico, S.S. 2015. An R Companion for the Handbook of Biological Statistics, version 1.09. rcompanion.org/rcompanion/d_09.html.

“Paired t–test” in McDonald, J.H. 2014. Handbook of Biological Statistics. www.biostathandbook.com/pairedttest.html.

### Exercises Q

1. Consider the Dumbland Extension data.  For each answer, report how you know, when appropriate, by reporting the values of the statistic you are using or other information you used.

a.  What was the mean of the differences in score before and after the training?

b.  Was this an increase or a decrease?

c.  Is the data distribution for the paired differences reasonably normal?

d.  Was the mean score significantly different before and after the training?

e.  What do you conclude practically?  As appropriate, report the means before and after, the mean difference, the effect size and interpretation, whether the difference is large relative to the scoring system, anything notable on plots, and your practical conclusions.

2. Residential properties in Dougal County rarely need phosphorus for good turfgrass growth.  As part of an extension education program, Early and Rusty Cuyler asked homeowners to report their phosphorus fertilizer use, in pounds of P2O5 per acre, before the program and then one year later.

Date              Homeowner  P2O5
'2014-01-01'      a          0.81
'2014-01-01'      b          0.86
'2014-01-01'      c          0.79
'2014-01-01'      d          0.59
'2014-01-01'      e          0.71
'2014-01-01'      f          0.88
'2014-01-01'      g          0.63
'2014-01-01'      h          0.72
'2014-01-01'      i          0.76
'2014-01-01'      j          0.58
'2015-01-01'      a          0.67
'2015-01-01'      b          0.83
'2015-01-01'      c          0.81
'2015-01-01'      d          0.50
'2015-01-01'      e          0.71
'2015-01-01'      f          0.72
'2015-01-01'      g          0.67
'2015-01-01'      h          0.67
'2015-01-01'      i          0.48
'2015-01-01'      j          0.68
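One way to read these data into R is sketched below. It follows the same approach used for the example earlier in the chapter. The data frame name Turf is illustrative. The single quotes around the dates are removed by read.table, and the Date column is then converted to the Date class.

```r
Input = ("
Date              Homeowner  P2O5
'2014-01-01'      a          0.81
'2014-01-01'      b          0.86
'2014-01-01'      c          0.79
'2014-01-01'      d          0.59
'2014-01-01'      e          0.71
'2014-01-01'      f          0.88
'2014-01-01'      g          0.63
'2014-01-01'      h          0.72
'2014-01-01'      i          0.76
'2014-01-01'      j          0.58
'2015-01-01'      a          0.67
'2015-01-01'      b          0.83
'2015-01-01'      c          0.81
'2015-01-01'      d          0.50
'2015-01-01'      e          0.71
'2015-01-01'      f          0.72
'2015-01-01'      g          0.67
'2015-01-01'      h          0.67
'2015-01-01'      i          0.48
'2015-01-01'      j          0.68
")

Turf = read.table(textConnection(Input), header=TRUE)

# Convert the date text to the Date class
Turf$Date = as.Date(Turf$Date)

# Order by Date, then Homeowner, so that paired observations line up
Turf = Turf[order(Turf$Date, Turf$Homeowner),]

str(Turf)

# Confirm one observation per homeowner per date
xtabs(~ Homeowner + Date, data = Turf)
```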

For each of the following, answer the question, and show the output from the analyses you used to answer the question.

a.  What was the mean of the differences in P2O5 before and after the training?

b.  Is this an increase or a decrease?

c.  Is the data distribution for the paired differences reasonably normal?

d.  Was the mean P2O5 use significantly different before and after the training?

e.  What do you conclude practically?  As appropriate, report the means before and after, the mean difference, the effect size and interpretation, whether the difference is large relative to the values, anything notable on plots, and your practical conclusions.