Trimming a huge (3.5 GB) csv file to read into R

My try with readLines. This piece of a code creates csv with selected years.

file_in <- file("in.csv","r")
file_out <- file("out.csv","a")
x <- readLines(file_in, n=1)
writeLines(x, file_out) # copy headers

B <- 300000 # depends how large is one pack
while(length(x)) {
    ind <- grep("^[^;]*;[^;]*; 20(09|10)", x)
    if (length(ind)) writeLines(x[ind], file_out)
    x <- readLines(file_in, n=B)
}
close(file_in)
close(file_out)

Leave a Comment