An R Companion for the Handbook of Biological Statistics

Salvatore S. Mangiafico

Statistics of Dispersion

Measures of dispersion, such as the range, variance, and standard deviation, can be calculated with standard functions in base R and its native stats package; the coefficient of variation is computed from the standard deviation and the mean.  In addition, a function, here called summary.list, can be defined to output whichever statistics are of interest.

 

Introduction

See the Handbook for information on this topic.

 

Example

Statistics of dispersion example

 

### --------------------------------------------------------------
### Statistics of dispersion example, p. 111
### --------------------------------------------------------------

Input =("
Stream                     Fish
 Mill_Creek_1                76
 Mill_Creek_2               102
 North_Branch_Rock_Creek_1   12
 North_Branch_Rock_Creek_2   39
 Rock_Creek_1                55
 Rock_Creek_2                93
 Rock_Creek_3                98
 Rock_Creek_4                53
 Turkey_Branch              102
")

Data = read.table(textConnection(Input),header=TRUE)

 

 

Range

 

range(Data$ Fish, na.rm=TRUE)       

 

[1]  12 102     # Min and max

 

 

max(Data$ Fish, na.rm=TRUE) - min(Data$ Fish, na.rm=TRUE)

 

[1] 90
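As a compact alternative, the width of the range can also be obtained by applying diff() to the output of range(). The sketch below re-enters the fish counts as a plain vector so that it runs on its own; the names Fish and Fish.Range are introduced here for illustration only.

```r
# Fish counts from the example above, entered as a plain vector
Fish = c(76, 102, 12, 39, 55, 93, 98, 53, 102)

# range() returns c(min, max); diff() returns max - min
Fish.Range = diff(range(Fish, na.rm=TRUE))

Fish.Range
# [1] 90
```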

 

 

Sum of squares

Not included here.
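Although no code is given for it here, the sum of squares can be computed directly from its usual definition, the sum of squared deviations from the mean. The following is a minimal sketch under that assumption; the names Fish and SS are introduced here for illustration only.

```r
# Fish counts from the example above, entered as a plain vector
Fish = c(76, 102, 12, 39, 55, 93, 98, 53, 102)

# Sum of squared deviations from the mean, ignoring any NA values
SS = sum((Fish - mean(Fish, na.rm=TRUE))^2, na.rm=TRUE)

SS
# [1] 8236
```

Dividing this value by n - 1 = 8 gives 1029.5, the sample variance reported by var().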

 

Parametric variance

Not included here.
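Although no code is given for it here, the parametric (population) variance can be sketched by dividing the sum of squares by N rather than by N - 1, or equivalently by rescaling the output of var(). This assumes the usual definition of the parametric variance; the names Fish, N, and Par.Var are introduced here for illustration only.

```r
# Fish counts from the example above, entered as a plain vector
Fish = c(76, 102, 12, 39, 55, 93, 98, 53, 102)

# Number of non-missing observations
N = sum(!is.na(Fish))

# var() divides by N - 1; multiplying by (N - 1) / N gives division by N
Par.Var = var(Fish, na.rm=TRUE) * (N - 1) / N

Par.Var
# [1] 915.1111
```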

 

Sample variance

 

var(Data$ Fish, na.rm=TRUE)

 

[1] 1029.5

 

 

Standard deviation

 

sd(Data$ Fish, na.rm=TRUE)

 

[1] 32.08582

 

 

Coefficient of variation, as percent

 

sd(Data$ Fish, na.rm=TRUE)/
   mean(Data$ Fish, na.rm=TRUE)*100

 

[1] 45.83689

 

 

Custom function of desired measures of central tendency and dispersion

 

### Note NA’s removed in the following function

summary.list = function(x)list(
 N.with.NA.removed= length(x[!is.na(x)]),
 Count.of.NA= length(x[is.na(x)]),
 Mean=mean(x, na.rm=TRUE),
 Median=median(x, na.rm=TRUE),
 Max.Min=range(x, na.rm=TRUE),
 Range=max(x, na.rm=TRUE) - min(x, na.rm=TRUE),
 Variance=var(x, na.rm=TRUE),
 Std.Dev=sd(x, na.rm=TRUE),
 Coeff.Variation.Prcnt=sd(x, na.rm=TRUE)/mean(x, na.rm=TRUE)*100,
 Std.Error=sd(x, na.rm=TRUE)/sqrt(length(x[!is.na(x)])),
 Quantile=quantile(x, na.rm=TRUE)
)

summary.list(Data$ Fish)

 

$N.with.NA.removed

[1] 9

 

$Count.of.NA

[1] 0

 

$Mean

[1] 70

 

$Median

[1] 76

 

$Max.Min

[1]  12 102

 

$Range

[1] 90

 

$Variance

[1] 1029.5

 

$Std.Dev

[1] 32.08582

 

$Coeff.Variation.Prcnt

[1] 45.83689

 

$Std.Error

[1] 10.69527

 

$Quantile

  0%  25%  50%  75% 100%
  12   53   76   98  102

 

#     #     #
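To illustrate how a function of this kind handles missing values, the sketch below applies a shortened variant to a vector containing an NA. The function name dispersion.list and the vector Fish.NA are hypothetical, and the NA was inserted in place of the smallest count purely for illustration; it is not part of the original data.

```r
# Hypothetical shortened variant of the summary function, for illustration only
dispersion.list = function(x) list(
  N.with.NA.removed = sum(!is.na(x)),
  Count.of.NA       = sum(is.na(x)),
  Range             = diff(range(x, na.rm=TRUE)),
  Std.Dev           = sd(x, na.rm=TRUE),
  Std.Error         = sd(x, na.rm=TRUE) / sqrt(sum(!is.na(x)))
)

# Fish counts with the smallest value (12) replaced by NA (made up for this sketch)
Fish.NA = c(76, 102, NA, 39, 55, 93, 98, 53, 102)

Result = dispersion.list(Fish.NA)

Result$N.with.NA.removed   # 8 non-missing values
Result$Count.of.NA         # 1 missing value
Result$Range               # 102 - 39 = 63
```

Because every statistic is computed from the argument x with na.rm=TRUE, the results reflect only the non-missing values of whatever vector is passed in; without na.rm=TRUE, sd(Fish.NA) would return NA.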

 

 

How to calculate the statistics

Methods are described in the “Example” section above.