How to change formats of multiple variables in one dataset?

When importing a .csv file (I assume that is what you did), there are a couple of things to watch out for:

  • A column of numbers is read as character (or as factor in R versions before 4.0.0) if even one cell holds something like #ERROR or #DIV/0, or any other missing-value marker that you have not declared through na.strings. With analytical data you sometimes get detection-limit results such as <0.02, and these are also interpreted as text.
  • Dates being read as character is expected and happens with most imports, since read.csv does not detect dates on its own.
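
To see both effects, here is a small sketch. The inline CSV text and its column names (date, conc, temp) are made up for illustration and stand in for a real file:

```r
# Hypothetical data: one spreadsheet error cell and one detection-limit value
csv_text <- "date,conc,temp
2020-01-01,0.15,#DIV/0
2020-01-02,<0.02,21.5
2020-01-03,0.30,22.0"

# Default import: every column ends up as character
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)
sapply(raw, class)

# Declaring the error marker as NA lets temp be read as numeric;
# conc stays character because "<0.02" is not a number
fixed <- read.csv(text = csv_text, na.strings = c("", "#DIV/0"),
                  stringsAsFactors = FALSE)
sapply(fixed, class)
```

So na.strings takes care of error markers, but detection-limit values still need to be cleaned by hand.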

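Anyway, if you need to force some columns to specific classes when importing a .csv file, there is the very useful colClasses argument of read.csv. Class names must be given as quoted strings such as "Date", and an NA entry lets R guess the class of that column, as it would by default. Note that "Date" only works if the dates are in a format as.Date understands without help, such as 2020-01-31. Try something like this: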

df <- read.csv(file = "input.csv", na.strings = c("", "#REF", "#DIV/0"), colClasses = c("Date", NA, NA, NA, NA, NA))
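
If the data is already loaded, you can also change several columns at once afterwards. This is only a rough sketch: the data frame and the column names are made up, and replacing a detection-limit result such as <0.02 with the limit itself is an assumption that may not suit your analysis:

```r
# Hypothetical data frame, as it might look after a plain import
df <- data.frame(
  date = c("2020-01-01", "2020-01-02"),
  conc = c("0.15", "<0.02"),
  temp = c("21.5", "22.0"),
  stringsAsFactors = FALSE
)

# Columns that should be numeric
num_cols <- c("conc", "temp")

# Strip a leading "<" (assumption: keep the detection limit as the value),
# then convert every listed column in one go
df[num_cols] <- lapply(df[num_cols], function(x) as.numeric(sub("^<", "", x)))

# Convert the date column, stating the format explicitly
df$date <- as.Date(df$date, format = "%Y-%m-%d")

sapply(df, class)
```

Using lapply over a vector of column names keeps the conversion in one place, so adding another column only means adding its name to num_cols.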
