How to detect the right encoding for read.csv?

First of all, according to a more general question on StackOverflow, it is not possible to detect the encoding of a file with 100% certainty. I've struggled with this many times and arrived at a non-automatic solution: Use iconvlist to get all possible encodings: codepages <- setNames(iconvlist(), iconvlist()) Then read the data using each of them: x <- lapply(codepages, function(enc) try(read.table("encoding.asc", …
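The try-every-encoding idea can be sketched as below. This is a minimal illustration, not the answer's full code: the sample file, the short candidate list, the use of the fileEncoding argument, and the shape check are all assumptions made for the example. Replacing the candidate list with iconvlist() gives the exhaustive version.

```r
# Hypothetical sample: a small semicolon-separated file written as latin1 bytes.
path <- tempfile(fileext = ".csv")
txt <- "name;city\nJos\u00e9;M\u00e1laga\n"
con <- file(path, open = "wb")
writeBin(iconv(txt, from = "UTF-8", to = "latin1", toRaw = TRUE)[[1]], con)
close(con)

# A short hand-picked candidate list (assumption); iconvlist() returns the full set.
candidates <- c("UTF-8", "latin1", "CP1252")
candidates <- setNames(candidates, candidates)

# Try each candidate; errors and warnings are turned into NULL instead of stopping.
attempts <- lapply(candidates, function(enc) {
  tryCatch(
    read.table(path, sep = ";", header = TRUE, fileEncoding = enc,
               stringsAsFactors = FALSE),
    error = function(e) NULL,
    warning = function(w) NULL
  )
})

# Keep the encodings that produced a data frame of the expected shape.
ok <- Filter(function(d) is.data.frame(d) && nrow(d) == 1 && ncol(d) == 2, attempts)
names(ok)
```

More than one candidate may survive the filter, since several encodings can decode the same bytes without error. That is why the last step stays manual: inspect the surviving results and pick the one whose text looks right.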

read.csv, header on first line, skip second line [duplicate]

This should do the trick: all_content = readLines("file.csv") skip_second = all_content[-2] dat = read.csv(textConnection(skip_second), header = TRUE, stringsAsFactors = FALSE) The first step, using readLines, reads the entire file into a character vector, where each element represents a line in the file. Next, you discard the second line using the fact that negative …
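A self-contained version of the same steps, as a sketch: the sample file and its column names are made up for illustration, and read.csv is given the lines through its text argument rather than an explicit textConnection, which avoids leaving an open connection behind.

```r
# Hypothetical sample: header on line 1, a units row on line 2, data from line 3.
path <- tempfile(fileext = ".csv")
writeLines(c("time,temp", "s,degC", "1,20.5", "2,21.0"), path)

# Read every line as text, one element per line.
all_content <- readLines(path)

# A negative index drops that element, so this removes the second line only.
skip_second <- all_content[-2]

# Parse the remaining lines; the first one is still the header.
dat <- read.csv(text = skip_second, header = TRUE, stringsAsFactors = FALSE)
str(dat)
```

Because the units row is gone before parsing, both columns are read as numeric rather than character.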

Why am I getting X. in my column names when reading a data frame?

read.csv() is a wrapper around the more general read.table() function. The latter has an argument, check.names, which is documented as: check.names: logical. If ‘TRUE’ then the names of the variables in the data frame are checked to ensure that they are syntactically valid variable names. If necessary they are adjusted (by ‘make.names’) so that they …
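The effect of check.names can be seen with a small sketch. The header values below are invented for illustration; the only thing that differs between the two calls is the check.names argument.

```r
# Hypothetical header with names that are not valid R variable names.
csv_text <- "%,2020 total,ok\n1,2,3"

# Default behaviour: names are passed through make.names().
checked <- read.csv(text = csv_text)

# With check.names = FALSE the header is kept exactly as written.
unchecked <- read.csv(text = csv_text, check.names = FALSE)

names(checked)
names(unchecked)
```

Columns with unadjusted names have to be accessed with backticks or string indexing, for example unchecked[["2020 total"]].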