## An R Companion for the Handbook of Biological Statistics

Salvatore S. Mangiafico

# G–test of Independence

There are a few different options for performing G-tests of independence in R.  One is the G.test function in the package RVAideMemoire.  Another is the GTest function in the package DescTools.

### When to use it

#### G-test example with functions in DescTools and RVAideMemoire

### --------------------------------------------------------------
### Vaccination example, G-test of independence, pp. 68–69
### --------------------------------------------------------------

Input =("
Injection.area  No.severe  Severe
Thigh           4788       30
Arm             8916       76
")

Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))

Matriz

library(DescTools)

GTest(Matriz,
      correct="none")            # "none" "williams" "yates"

Log likelihood ratio (G-test) test of independence without correction

G = 2.1087, X-squared df = 1, p-value = 0.1465

library(RVAideMemoire)

G.test(Matriz)

G = 2.1087, df = 1, p-value = 0.1465   # Note values differ from
                                       #   the Handbook
                                       #   for this example

#     #     #

### Null hypothesis

### How the test works

See the Handbook for information on these topics.
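As a rough illustration of what both functions compute, the uncorrected G statistic can be reproduced in base R from the observed and expected counts. This is a minimal sketch, not the code either package uses; the object names `observed` and `expected` are chosen here for illustration, and the counts are those of the vaccination example.

```r
# Observed counts from the vaccination example
observed = matrix(c(4788, 30,
                    8916, 76),
                  nrow=2, byrow=TRUE,
                  dimnames=list(c("Thigh", "Arm"),
                                c("No.severe", "Severe")))

# Expected counts under independence: (row total * column total) / N
expected = outer(rowSums(observed), colSums(observed)) / sum(observed)

# G = 2 * sum(O * ln(O / E)); every cell is nonzero here,
#   so no special handling of 0 * log(0) is needed
G  = 2 * sum(observed * log(observed / expected))
df = (nrow(observed) - 1) * (ncol(observed) - 1)
p  = pchisq(G, df=df, lower.tail=FALSE)

round(c(G=G, df=df, p.value=p), 4)
```

The result should agree with the uncorrected output of GTest and G.test shown above.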

### Post-hoc tests

For the following example of post-hoc pairwise testing, we’ll use the pairwise.G.test function from the package RVAideMemoire to make the task easier.  Then we’ll use pairwise.table in the native stats package as an alternative.

#### Post-hoc pairwise G-tests with RVAideMemoire

### --------------------------------------------------------------
### Post-hoc example, G-test of independence, pp. 69–70
### --------------------------------------------------------------

Input =("
Supplement     No.cancer  Cancer
'Selenium'     8177       575
'Vitamin E'    8117       620
'Selenium+E'   8147       555
'Placebo'      8167       529
")

Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))

Matriz

library(RVAideMemoire)

G.test(Matriz)

G = 7.7325, df = 3, p-value = 0.05188

library(RVAideMemoire)

pairwise.G.test(Matriz,
                p.method = "none")           # Can adjust p-values;
                                             #   see ?p.adjust for options

           Selenium Vitamin E Selenium+E
Vitamin E  0.168    -         -
Selenium+E 0.606    0.058     -
Placebo    0.187    0.007     0.422

#### Post-hoc pairwise G-tests with pairwise.table

As is, this function works on a matrix with two columns, and compares rows.

### --------------------------------------------------------------
### Post-hoc example, G-test of independence, pp. 69–70
### --------------------------------------------------------------

Input =("
Supplement      No.cancer  Cancer
'Selenium'     8177       575
'Vitamin E'    8117       620
'Selenium+E'   8147       555
'Placebo'      8167       529
")

Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))

Matriz

library(DescTools)

GTest(Matriz,
      correct="none")

Log likelihood ratio (G-test) test of independence without correction

G = 7.7325, X-squared df = 3, p-value = 0.05188

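FUN = function(i,j){
      # G-test on the 2 x 2 table formed by rows i and j of Matriz
      GTest(matrix(c(Matriz[i,1], Matriz[i,2],
                     Matriz[j,1], Matriz[j,2]),
             nrow=2,
             byrow=TRUE),
             correct="none")$p.value   # "none" "williams" "yates"
     }

pairwise.table(FUN,
               rownames(Matriz),
               p.adjust.method="none")      # Can adjust p-values;
                                            #   see ?p.adjust for options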

            Selenium   Vitamin E Selenium+E
Vitamin E  0.1677388          NA         NA
Selenium+E 0.6060951 0.058385135         NA
Placebo    0.1866826 0.007004601  0.4215013

#     #     #
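Both pairwise examples above leave the p-values unadjusted. As a hedged sketch of what an adjustment would look like, the six unadjusted p-values from the pairwise.table output can be passed to p.adjust in the native stats package. The comparison labels and the choice of the "holm" and "fdr" methods are illustrative assumptions here, not a recommendation from the Handbook.

```r
# Unadjusted p-values copied from the pairwise.table output above
p.raw = c("Selenium vs Vitamin E"   = 0.1677388,
          "Selenium vs Selenium+E"  = 0.6060951,
          "Selenium vs Placebo"     = 0.1866826,
          "Vitamin E vs Selenium+E" = 0.058385135,
          "Vitamin E vs Placebo"    = 0.007004601,
          "Selenium+E vs Placebo"   = 0.4215013)

# Adjust for the six comparisons; see ?p.adjust for available methods
p.holm = p.adjust(p.raw, method="holm")
p.fdr  = p.adjust(p.raw, method="fdr")

round(cbind(raw=p.raw, holm=p.holm, fdr=p.fdr), 4)
```

The same adjustment can be requested directly through the p.method argument of pairwise.G.test or the p.adjust.method argument of pairwise.table.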

### Assumptions

See the Handbook for information on this topic.

### Examples

#### G-tests with DescTools and RVAideMemoire

### --------------------------------------------------------------
### Helmet example, G-test of independence, p. 72
### --------------------------------------------------------------

Input =("
Helmet    372          4715
No.helmet 267          1391
")

Matriz = as.matrix(read.table(textConnection(Input),
                   header=FALSE,       # this input has no header row
                   row.names=1))

Matriz

library(DescTools)

GTest(Matriz,
      correct="none")            # "none" "williams" "yates"

Log likelihood ratio (G-test) test of independence without correction

G = 101.54, X-squared df = 1, p-value < 2.2e-16

library(RVAideMemoire)

G.test(Matriz)

G = 101.5437, df = 1, p-value < 2.2e-16

#     #     #
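The correct argument of GTest accepts "williams" as well as "none". As a rough sketch of what Williams' correction does, the uncorrected G can be divided by a factor q computed from the row and column totals. The formula for q below is the commonly cited one for R x C tables and is stated here as an assumption; it has not been checked against the DescTools source, so treat the result as illustrative.

```r
# Observed counts from the helmet example
observed = matrix(c(372, 4715,
                    267, 1391),
                  nrow=2, byrow=TRUE)

n        = sum(observed)
row.tot  = rowSums(observed)
col.tot  = colSums(observed)
expected = outer(row.tot, col.tot) / n
df       = (nrow(observed) - 1) * (ncol(observed) - 1)

# Uncorrected G statistic
G = 2 * sum(observed * log(observed / expected))

# Williams' correction factor (assumed formula); q is greater than 1
q = 1 + ((n * sum(1 / row.tot) - 1) * (n * sum(1 / col.tot) - 1)) /
        (6 * n * df)

G.williams = G / q

c(G=G, q=q, G.williams=G.williams)
pchisq(G.williams, df=df, lower.tail=FALSE)
```

With a sample this large, q is very close to 1, so the corrected and uncorrected statistics differ only slightly.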

### --------------------------------------------------------------
### Gardemann apolipoprotein example, G-test of independence,
###   p. 72
### --------------------------------------------------------------

Input =("
Genotype  No.disease Coronary.disease
ins.ins   268        807
ins.del   199        759
del.del    42        184
")

Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))

Matriz

library(DescTools)

GTest(Matriz,
      correct="none")            # "none" "williams" "yates"

Log likelihood ratio (G-test) test of independence without correction

G = 7.3008, X-squared df = 2, p-value = 0.02598

library(RVAideMemoire)

G.test(Matriz)

G = 7.3008, df = 2, p-value = 0.02598

#     #     #

### Graphing the results

Graphing is discussed above in the “Chi-square Test of Independence” section.

### Similar tests

### Chi-square vs. G–test

See the Handbook for information on these topics.  Fisher’s exact test, chi-square test, and McNemar’s test are discussed elsewhere in this book.
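To see how closely the two statistics track each other on a concrete table, the sketch below runs chisq.test from the native stats package on the Gardemann counts and computes G from the same expected counts. It is only an illustration of the comparison; the object names are chosen here, and the hand calculation of G is not the code used by DescTools or RVAideMemoire.

```r
# Observed counts from the Gardemann apolipoprotein example
observed = matrix(c(268, 807,
                    199, 759,
                     42, 184),
                  nrow=3, byrow=TRUE,
                  dimnames=list(Genotype=c("ins.ins", "ins.del", "del.del"),
                                Health=c("No.disease", "Coronary.disease")))

# Pearson chi-square test of independence
chi = chisq.test(observed)

# G statistic computed from the same expected counts
G = 2 * sum(observed * log(observed / chi$expected))

c(chi.square = unname(chi$statistic),
  G          = G,
  df         = unname(chi$parameter))
```

Both statistics are referred to the same chi-square distribution with 2 degrees of freedom.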

### How to do the test

#### G-test of independence with data as a data frame

In the following example, the data are read in as a data frame, and the xtabs function is used to tabulate them into a contingency table.

### --------------------------------------------------------------
### Gardemann apolipoprotein example, G-test of independence,
###      SAS example, pp. 74–75
###      Example using cross-tabulation
### --------------------------------------------------------------

Input =("
Genotype   Health       Count
ins-ins   no_disease   268
ins-ins   disease      807
ins-del   no_disease   199
ins-del   disease      759
del-del   no_disease    42
del-del   disease      184
")

Data.frame = read.table(textConnection(Input),header=TRUE)

###  Cross-tabulate the data

Data.xtabs = xtabs(Count ~ Genotype + Health,
                   data=Data.frame)

Data.xtabs

         Health
Genotype  disease no_disease
  del-del     184         42
  ins-del     759        199
  ins-ins     807        268

summary(Data.xtabs)                # includes N and factors

Number of cases in table: 2259
Number of factors: 2

###  G-tests

library(DescTools)

GTest(Data.xtabs,
      correct="none")            # "none" "williams" "yates"

Log likelihood ratio (G-test) test of independence without correction

G = 7.3008, X-squared df = 2, p-value = 0.02598

library(RVAideMemoire)

G.test(Data.xtabs)

G = 7.3008, df = 2, p-value = 0.02598

#     #     #
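In the cross-tabulation above, xtabs lists the genotypes and health categories alphabetically, not in the order they appear in the input. If a particular order is wanted in the table, one option is to set the factor levels before tabulating. This is a small sketch under that assumption; reordering rows or columns does not change the G statistic.

```r
Input =("
Genotype   Health       Count
ins-ins   no_disease   268
ins-ins   disease      807
ins-del   no_disease   199
ins-del   disease      759
del-del   no_disease    42
del-del   disease      184
")

Data.frame = read.table(textConnection(Input),header=TRUE)

# Set the level order explicitly so it is not alphabetical
Data.frame$Genotype = factor(Data.frame$Genotype,
                             levels=c("ins-ins", "ins-del", "del-del"))
Data.frame$Health   = factor(Data.frame$Health,
                             levels=c("no_disease", "disease"))

Data.xtabs = xtabs(Count ~ Genotype + Health,
                   data=Data.frame)

Data.xtabs
```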

### Power analysis

To calculate power or required samples, follow examples in the “Chi-square Test of Independence” section.
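For a quick approximation without an add-on package, power for a chi-square or G-test can be sketched in base R from the noncentral chi-square distribution, with noncentrality parameter N * w^2. The helper name `chisq.power` is hypothetical, and the effect size w = 0.1 and sample sizes below are made-up illustration values, not taken from the Handbook.

```r
# Hypothetical helper: approximate power for a test with the given
#   effect size w, total sample size N, and degrees of freedom df
chisq.power = function(w, N, df, alpha=0.05) {
   crit = qchisq(1 - alpha, df=df)           # critical value under the null
   pchisq(crit, df=df, ncp=N * w^2,          # upper-tail probability under the
          lower.tail=FALSE)                  #   noncentral alternative
}

# Illustrative values only: w = 0.1, df = 2 (a 3 x 2 table)
power.1000 = chisq.power(w=0.1, N=1000, df=2)
power.2000 = chisq.power(w=0.1, N=2000, df=2)

c(N.1000=power.1000, N.2000=power.2000)
```

Required sample size can be approximated by increasing N until the returned power reaches the target.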