read.csv doesn’t seem to detect factors in R 4.0.0

As Ronak Shah said in a comment on your question, R 4.0.0 changed the default for how read.table() (and therefore its wrappers, including read.csv()) treats character vectors: stringsAsFactors now defaults to FALSE. There has been a long debate over this, but basically stringsAsFactors = TRUE had been the default since the inception of R because it helped save memory, thanks to the way factors are implemented in R (essentially an integer vector with the level labels stored as an attribute on top). There is less reason to do that nowadays, since memory is much more abundant and the old default often produced unintended side effects.
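If you still want factors, you can ask for them explicitly. The following is a minimal sketch rather than the one right way to do it; the inline CSV text and the column names id and group are made up for illustration, so substitute your own file and columns.

```r
# Small in-memory CSV so the example is self-contained (hypothetical data).
csv_text <- "id,group\n1,a\n2,b\n3,a"

# Default call: in R >= 4.0.0 the 'group' column is read as character.
df_default <- read.csv(text = csv_text)

# Restore the pre-4.0.0 behaviour for this call only:
# every character column becomes a factor.
df_factors <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Or convert only selected columns by naming them in colClasses;
# unnamed columns are still type-guessed as usual.
df_selected <- read.csv(text = csv_text, colClasses = c(group = "factor"))

# A factor is stored as integer codes plus a 'levels' attribute.
codes <- unclass(df_factors$group)

str(df_default)
str(df_factors)
str(df_selected)
```

Passing stringsAsFactors = TRUE per call, or naming columns in colClasses, keeps the choice visible in your script instead of relying on a version-dependent default.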

You can read more about your particular issue, and about other peculiarities of vectors in R, in Chapter 3 of Advanced R by Hadley Wickham. There he points to two articles that go into great detail on why the default behavior was the way it was.
Here is one and here is another. I would also suggest that you check out Hadley’s book if you already have some experience with R; it helped me a lot in learning some of the less obvious features of the language.
