Split dataframe by levels of a factor and name dataframes by those levels

In base R, the function to use is split, which has a default method and a method for data frames (split.data.frame). However, I find that split.data.frame becomes very slow as the number of levels to split on grows large. That is,

# inefficient in my opinion
split(df, df$Z)

The above returns a list whose elements are already named by the levels of Z, which is what you asked for, but it slows down badly when the factor has many levels.
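To make the naming behaviour concrete, here is a minimal sketch on a toy data frame. The data frame df and its columns X and Z are made up for illustration and are not from the question.

```r
# Toy data (hypothetical): Z is the factor to split on
df <- data.frame(X = 1:6, Z = factor(c("a", "b", "c", "a", "b", "c")))

# split() returns a named list with one data frame per level of Z
parts <- split(df, df$Z)
names(parts)   # "a" "b" "c"
parts$a        # the rows of df where Z == "a"

# Optionally, create one variable per level in the global environment
# list2env(parts, envir = .GlobalEnv)
```

Keeping the pieces in a named list is usually easier to work with than separate variables, since you can loop over it with lapply.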

If you’re willing to depend on an external package in exchange for speed, I’d suggest the data.table package:

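library(data.table)
dt <- as.data.table(df)
# one data.table per group; .SD holds every column except Z
oo <- dt[, list(list(.SD)), by = Z]$V1
# groups come back in order of first appearance, which matches unique()
names(oo) <- as.character(unique(dt$Z))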
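As a rough illustration of what the result looks like, here is the same idea on a small made-up table. The column names X and Z and the values are assumptions for the example; the names are taken from the grouped result itself, so they cannot drift out of order.

```r
library(data.table)

# Toy data (hypothetical): Z is the grouping column
dt <- data.table(X = 1:6, Z = c("b", "a", "c", "b", "a", "c"))

# One data.table per group, in order of first appearance of each Z value
res <- dt[, list(list(.SD)), by = Z]
oo <- setNames(res$V1, res$Z)
names(oo)      # "b" "a" "c"
names(oo$a)    # "X"  (.SD omits the grouping column Z)

# Recent data.table versions also provide a split() method with a `by` argument
oo2 <- split(dt, by = "Z")
```

Note that the pieces built from .SD do not contain the Z column; if you need it in each piece, the split(dt, by = "Z") form keeps it by default.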
