Split comma-separated strings in a column into separate rows

Several alternatives:

1) two ways with :

library(data.table)
# method 1 (preferred)
setDT(v)[, lapply(.SD, function(x) unlist(tstrsplit(x, ",", fixed=TRUE))), by = AB
         ][!is.na(director)]
# method 2
setDT(v)[, strsplit(as.character(director), ",", fixed=TRUE), by = .(AB, director)
         ][,.(director = V1, AB)]

2) a / combination:

library(dplyr)
library(tidyr)
v %>% 
  mutate(director = strsplit(as.character(director), ",")) %>%
  unnest(director)

3) with only: With tidyr 0.5.0 (and later), you can also just use separate_rows:

separate_rows(v, director, sep = ",")

You can use the convert = TRUE parameter to automatically convert numbers into numeric columns.

4) with base R:

# if 'director' is a character-column:
stack(setNames(strsplit(df$director,','), df$AB))

# if 'director' is a factor-column:
stack(setNames(strsplit(as.character(df$director),','), df$AB))

Leave a Comment