
An R Companion for the Handbook of Biological Statistics

Salvatore S. Mangiafico

Avoiding Pitfalls in R

Grammar, spelling, and capitalization count

Probably the most common problems in programming in any language are syntax errors: for example, forgetting a comma or misspelling the name of a variable or function.  R is also case sensitive, so Height and height are treated as two different names.

 

Be sure to put quotes around names that require them, and use straight quotes ( " ) rather than the smart quotes that some word processors insert automatically.  It is helpful to write your R code in a plain text editor or in the editor window in RStudio.
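As a brief illustration of these two points, the following sketch catches a capitalization mistake and a smart-quote mistake instead of letting them stop the script.  The names Weight_kg and BadCode and the data values are made up for this example, and the replacement messages are my own wording rather than R's error text.

```r
Weight_kg = c(61.5, 70.2, 58.9)    # hypothetical data

mean(Weight_kg)                    # works: the name matches exactly

# R is case sensitive, so weight_kg is a different, undefined name
CaseResult = tryCatch(mean(weight_kg),
                      error = function(e) "error: object not found")

# Smart quotes are not valid string delimiters in R code.
# \u201c and \u201d are the left and right curly double quotes.
BadCode = "Sex = c(\u201cmale\u201d, \u201cfemale\u201d)"

QuoteResult = tryCatch(parse(text = BadCode),
                       error = function(e) "error: could not parse")

CaseResult
QuoteResult
```

Both calls end up in the error handler, which is what you would see as a red error message if you typed the same code at the console.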

 

Data types in functions

The biggest cause of problems I had when I first started working with R was probably feeding functions the wrong data type.  For example, if a function asks for the data as a matrix and you give it a data frame, it may stop with an error or return something you didn’t expect.
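A minimal sketch of the matrix versus data frame problem, using matrix multiplication as the function that insists on a matrix.  The data frame Counts and its values are invented for the example.

```r
# Hypothetical data frame with two numeric columns
Counts = data.frame(A = c(1, 2), B = c(3, 4))

class(Counts)          # a data frame, not a matrix
is.matrix(Counts)

# Matrix multiplication with %*% needs a matrix, so this call fails
Attempt = tryCatch(Counts %*% Counts,
                   error = function(e) "error: not a matrix")
Attempt

# Convert the data frame to a matrix, then try again
CountsMatrix = as.matrix(Counts)
is.matrix(CountsMatrix)

Product = CountsMatrix %*% CountsMatrix
Product
```

Checking an object with class, is.matrix, or str before passing it to a function is often the quickest way to find this kind of mismatch.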

 

A more subtle error I’ve encountered is when a function is expecting a variable to be a factor vector, and it’s really a character (“chr”) vector.

 

For instance, if we create a variable in the global environment with the same values as the Sex variable in the data frame D1 and call it Gender, it will be a character vector.

 

Gender = c("male", "male", "female", "female")

str(Gender)     # What is the structure of this variable?

 

chr [1:4] "male" "male" "female" "female"

 

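In the data frame D1, by contrast, Sex is a factor vector.  Versions of R before 4.0.0 read text columns in as factors by default; in R 4.0.0 and later they are read in as character vectors unless you set stringsAsFactors = TRUE or convert them yourself: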

 

str(D1$Sex)

 

Factor w/ 2 levels "female","male": 2 2 1 1
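The comparison can be reproduced on its own with a stand-in for D1.  This is only a sketch: the Height column and its values are invented, and the stringsAsFactors = TRUE argument is my assumption about how to get Sex stored as a factor in a current version of R.

```r
# Stand-in for D1: the Height values are made up for illustration
D1 = data.frame(Sex    = c("male", "male", "female", "female"),
                Height = c(70, 68, 64, 65),
                stringsAsFactors = TRUE)   # store text columns as factors

Gender = c("male", "male", "female", "female")

class(D1$Sex)       # "factor"
class(Gender)       # "character"

levels(D1$Sex)      # the categories: "female" "male"
is.factor(Gender)   # FALSE
```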

 

One of the nice things about using RStudio is that it lets you inspect the structure of data frames and other objects in the Environment window.

 

Data can be converted from one type to another, but how to do some conversions may not be obvious.  Functions to convert data types include as.factor, as.numeric, and as.character.
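Here is a short sketch of these conversion functions, including one conversion that is easy to get wrong.  The names Gender.f, Gender.c, Dose.f, and Dose, and the dose values, are hypothetical.

```r
Gender = c("male", "male", "female", "female")

Gender.f = as.factor(Gender)        # character -> factor
str(Gender.f)

Gender.c = as.character(Gender.f)   # factor -> character
str(Gender.c)

# Pitfall: as.numeric on a factor returns the internal level codes,
# not the numbers shown in the labels
Dose.f = factor(c("10", "20", "20", "40"))   # hypothetical values
as.numeric(Dose.f)                            # 1 2 2 3

# Convert to character first to recover the numbers themselves
Dose = as.numeric(as.character(Dose.f))
Dose                                          # 10 20 20 40
```

Going through as.character first is the usual way to turn a factor whose labels are numbers back into a numeric vector.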

 

Style

In many respects there isn’t an established style for programming in R, such as whether variable names should be capitalized.  For those who are interested, there is a Google R Users Style Guide at google.github.io/styleguide/Rguide.xml.  I don’t necessarily agree with all the recommendations there, and in practice people use different style conventions.