
An R Companion for the Handbook of Biological Statistics

Salvatore S. Mangiafico

G–test of Independence

There are a few different options for performing G-tests of independence in R.  One is the G.test function in the package RVAideMemoire.  Another is the GTest function in the package DescTools.

 

When to use it

G-test example with functions in DescTools and RVAideMemoire

 

### --------------------------------------------------------------
### Vaccination example, G-test of independence, pp. 68–69
### --------------------------------------------------------------

Input =("
 Injection.area  No.severe  Severe      
 Thigh           4788       30
 Arm             8916       76
")

Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))

Matriz

library(DescTools)
   
GTest(Matriz,
      correct="none")            # "none" "williams" "yates"

      

Log likelihood ratio (G-test) test of independence without correction

 

G = 2.1087, X-squared df = 1, p-value = 0.1465



library(RVAideMemoire)

G.test(Matriz)

 

G = 2.1087, df = 1, p-value = 0.1465   # Note: values differ from the Handbook for this example

 

 

#     #     #

 

 

Null hypothesis

How the test works

See the Handbook for information on these topics.
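
As a rough illustration of the calculation, the G statistic can be computed by hand in base R as G = 2 × Σ O × ln(O / E), where the expected counts E come from the row and column totals. The sketch below uses the vaccination counts from the example above. The object names Observed and Expected are chosen here for illustration and do not come from the Handbook. It assumes no cell count is zero and applies no correction.

```r
# Observed counts from the vaccination example (rows: injection area)
Observed = matrix(c(4788, 30,
                    8916, 76),
                  nrow = 2,
                  byrow = TRUE,
                  dimnames = list(c("Thigh", "Arm"),
                                  c("No.severe", "Severe")))

# Expected counts under independence: (row total * column total) / N
Expected = outer(rowSums(Observed), colSums(Observed)) / sum(Observed)

# G statistic; assumes no zero cells, since log(0) is undefined
G  = 2 * sum(Observed * log(Observed / Expected))

# Degrees of freedom: (rows - 1) * (columns - 1)
df = (nrow(Observed) - 1) * (ncol(Observed) - 1)

# Upper-tail p-value from the chi-square distribution
p  = pchisq(G, df = df, lower.tail = FALSE)

round(c(G = G, df = df, p.value = p), 4)
```

The result should agree with the uncorrected GTest and G.test output reported above for the same table (G = 2.1087, df = 1, p-value = 0.1465).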

 

Post-hoc tests

For the following example of post-hoc pairwise testing, we’ll use the pairwise.G.test function from the package RVAideMemoire to make the task easier.  Then we’ll use pairwise.table in the native stats package as an alternative.

 

Post-hoc pairwise G-tests with RVAideMemoire

 

### --------------------------------------------------------------
### Post-hoc example, G-test of independence, pp. 69–70
### --------------------------------------------------------------

Input =("
 Supplement     No.cancer  Cancer
 'Selenium'     8177       575
 'Vitamin E'    8117       620
 'Selenium+E'   8147       555
 'Placebo'      8167       529
")

Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))

Matriz

library(RVAideMemoire)

G.test(Matriz)

 

G = 7.7325, df = 3, p-value = 0.05188

 

 

library(RVAideMemoire)

pairwise.G.test(Matriz,
                p.method = "none")           # Can adjust p-values;
                                             # see ?p.adjust for options

 

           Selenium Vitamin E Selenium+E

Vitamin E  0.168    -         -        

Selenium+E 0.606    0.058     -        

Placebo    0.187    0.007     0.422 
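
The p-values above are unadjusted. As a minimal sketch of what an adjustment does, the base R function p.adjust can be applied to them directly. The vector p.raw below is typed in from the rounded output above, and its element names are labels chosen here for illustration, so the adjusted values are approximate.

```r
# Unadjusted pairwise p-values, copied from the rounded output above
p.raw = c("Selenium vs Vitamin E"    = 0.168,
          "Selenium vs Selenium+E"   = 0.606,
          "Vitamin E vs Selenium+E"  = 0.058,
          "Selenium vs Placebo"      = 0.187,
          "Vitamin E vs Placebo"     = 0.007,
          "Selenium+E vs Placebo"    = 0.422)

# Bonferroni: multiply each p-value by the number of comparisons, cap at 1
p.bonf = p.adjust(p.raw, method = "bonferroni")

# Benjamini-Hochberg false discovery rate adjustment
p.fdr  = p.adjust(p.raw, method = "fdr")

round(cbind(p.raw, p.bonf, p.fdr), 3)
```

In practice it is usually simpler to pass the desired method to the p.method argument of pairwise.G.test, which works from the unrounded p-values.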

 

 

Post-hoc pairwise G-tests with pairwise.table

As written, the custom function below works on a matrix with two columns and compares pairs of rows.

 

### --------------------------------------------------------------
### Post-hoc example, G-test of independence, pp. 69–70
### --------------------------------------------------------------

Input =("
Supplement      No.cancer  Cancer
 'Selenium'     8177       575
 'Vitamin E'    8117       620
 'Selenium+E'   8147       555
 'Placebo'      8167       529
")

Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))

Matriz

library(DescTools)   

GTest(Matriz,
      correct="none")

 

Log likelihood ratio (G-test) test of independence without correction

 

G = 7.7325, X-squared df = 3, p-value = 0.05188


###  Function returning the p-value of a G-test comparing rows i and j

FUN = function(i,j){    
      GTest(matrix(c(Matriz[i,1], Matriz[i,2],
                      Matriz[j,1], Matriz[j,2]),
             nrow=2,
             byrow=TRUE),
             correct="none")$p.value    # "none" "williams" "yates"
            }
  
pairwise.table(FUN,
               rownames(Matriz),
               p.adjust.method="none")       # Can adjust p-values
                                             # See ?p.adjust for options

  

            Selenium   Vitamin E Selenium+E

Vitamin E  0.1677388          NA         NA

Selenium+E 0.6060951 0.058385135         NA

Placebo    0.1866826 0.007004601  0.4215013

 

#     #     #

 

 

Assumptions

See the Handbook for information on this topic.

 

Examples

G-tests with DescTools and RVAideMemoire

 

### --------------------------------------------------------------
### Helmet example, G-test of independence, p. 72
### --------------------------------------------------------------

Input =("
 PSE       Head.injury  Other.injury
 Helmet    372          4715
 No.helmet 267          1391
")

Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))

Matriz

library(DescTools)
 
GTest(Matriz,
      correct="none")            # "none" "williams" "yates"

 

Log likelihood ratio (G-test) test of independence without correction

 

G = 101.54, X-squared df = 1, p-value < 2.2e-16

 


library(RVAideMemoire)

G.test(Matriz)

 

G = 101.5437, df = 1, p-value < 2.2e-16

 

#     #     #

 

 

### --------------------------------------------------------------
### Gardemann apolipoprotein example, G-test of independence,
###   p. 72
### --------------------------------------------------------------

Input =("
 Genotype  No.disease Coronary.disease
 ins.ins   268        807
 ins.del   199        759
 del.del    42        184
")

Matriz = as.matrix(read.table(textConnection(Input),
                   header=TRUE,
                   row.names=1))

Matriz

library(DescTools)
   
GTest(Matriz,
      correct="none")            # "none" "williams" "yates"

   

Log likelihood ratio (G-test) test of independence without correction

 

G = 7.3008, X-squared df = 2, p-value = 0.02598



library(RVAideMemoire)

G.test(Matriz)

 

G = 7.3008, df = 2, p-value = 0.02598

 

#     #     #

 

 

Graphing the results

Graphing is discussed above in the “Chi-square Test of Independence” section.

 

Similar tests

Chi-square vs. G–test

See the Handbook for information on these topics.  Fisher’s exact test, chi-square test, and McNemar’s test are discussed elsewhere in this book.

 

How to do the test

G-test of independence with data as a data frame

In the following example, the data are read in as a data frame, and the xtabs function is used to tabulate them into a contingency table.

 

### --------------------------------------------------------------
### Gardemann apolipoprotein example, G-test of independence,
###      SAS example, pp. 74–75
###      Example using cross-tabulation
### --------------------------------------------------------------

Input =("
Genotype   Health       Count
 ins-ins   no_disease   268
 ins-ins   disease      807
 ins-del   no_disease   199
 ins-del   disease      759
 del-del   no_disease    42
 del-del   disease      184
")

Data.frame = read.table(textConnection(Input),header=TRUE)


###  Cross-tabulate the data

Data.xtabs = xtabs(Count ~ Genotype + Health,
                    data=Data.frame)

Data.xtabs

 

         Health

Genotype  disease no_disease

  del-del     184         42

  ins-del     759        199

  ins-ins     807        268

 

 

summary(Data.xtabs)                # includes N and factors

 

Number of cases in table: 2259

Number of factors: 2

 

 

###  G-tests

library(DescTools)
  
GTest(Data.xtabs,
      correct="none")            # "none" "williams" "yates"
     

Log likelihood ratio (G-test) test of independence without correction

 

G = 7.3008, X-squared df = 2, p-value = 0.02598

 


library(RVAideMemoire)

G.test(Data.xtabs)

 

G = 7.3008, df = 2, p-value = 0.02598
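
The example above starts from a table of counts. If the data frame instead has one row per individual, xtabs can tabulate it when nothing is placed on the left-hand side of the formula. The sketch below rebuilds such a data frame from the same counts purely for illustration. The names Counts, Raw, and Raw.xtabs are hypothetical and do not appear in the Handbook.

```r
# Counts from the example above, in the order the rows were entered
Counts = c(268, 807, 199, 759, 42, 184)

# Reconstruct one row per individual (illustrative only;
# real raw data would normally be read from a file)
Raw = data.frame(
  Genotype = rep(c("ins-ins", "ins-ins", "ins-del",
                   "ins-del", "del-del", "del-del"), times = Counts),
  Health   = rep(c("no_disease", "disease", "no_disease",
                   "disease", "no_disease", "disease"), times = Counts))

# With no variable on the left-hand side, xtabs counts rows
Raw.xtabs = xtabs(~ Genotype + Health,
                  data = Raw)

Raw.xtabs
```

The resulting table has the same cell counts as Data.xtabs, so it can be passed to GTest or G.test in the same way.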

 

#     #     #

 

 

Power analysis

To calculate power or the required sample size, follow the examples in the “Chi-square Test of Independence” section.