strsplit
Splitting a string into new rows in R [duplicate]
Try the cSplit function (as you are already using @Ananda's package). Note that it will return a data.table object, so make sure you have that package installed. You can convert back to a data.frame (if you want to) with something like setDF(df2). library(splitstackshape) df2 <- cSplit(df1, "Item.Code", sep = "/", direction = "long") df2 # Country …
Split a string by any number of spaces
Just use strsplit and split on \\s+: x <- "10012 ---- ---- ---- ---- CAB UNCH CAB" x # [1] "10012 ---- ---- ---- ---- CAB UNCH CAB" strsplit(x, "\\s+")[[1]] # [1] "10012" "----" "----" "----" "----" "CAB" "UNCH" "CAB" length(.Last.value) # [1] 8 Or, in this case, scan also works: scan(text = x, …
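A minimal base R sketch of the same idea, using a made-up input string (not the one from the answer) that mixes runs of spaces with a tab. It also illustrates one detail the excerpt does not cover: leading whitespace produces an empty first element, so trimming first is assumed here to be the desired behaviour.

```r
# Hypothetical input: leading spaces, runs of spaces, and a tab between fields
x <- "  alpha   beta\tgamma    delta"

# "\\s+" matches one or more whitespace characters (spaces, tabs, newlines)
raw_parts <- strsplit(x, "\\s+")[[1]]

# Leading whitespace yields an empty first element, so trim before splitting
parts <- strsplit(trimws(x), "\\s+")[[1]]

print(raw_parts)
print(parts)
```

strsplit() always returns a list with one element per input string, which is why the result is indexed with [[1]].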
How to use the strsplit function with a period
When using a regular expression in the split argument of strsplit(), you have to escape the . as \\., or put it in a character class, [.]. Otherwise . keeps its special regex meaning, "any single character". s <- "I.want.to.split" strsplit(s, "[.]") # [[1]] # [1] "I" "want" "to" "split" But the more efficient method here …
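The options can be compared side by side in a short base R sketch. The excerpt is cut off before it names its "more efficient method", so the fixed = TRUE variant below is shown only as one standard way to split on a literal string, not necessarily the one the answer meant.

```r
s <- "I.want.to.split"

# Regex with the dot escaped
escaped <- strsplit(s, "\\.")[[1]]

# Regex with the dot inside a character class
charclass <- strsplit(s, "[.]")[[1]]

# No regex at all: the split string is taken literally
literal <- strsplit(s, ".", fixed = TRUE)[[1]]

# Unescaped "." matches every character, so only empty strings are left
unescaped <- strsplit(s, ".")[[1]]

print(escaped)
print(unescaped)
```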
Why does strsplit use positive lookahead and lookbehind assertion matches differently?
I am not sure whether this qualifies as a bug, because I believe this is expected behaviour based on the R documentation. From ?strsplit: The algorithm applied to each input string is repeat { if the string is empty break. if there is a match add the string to the left of the match to …
Chopping a string into a vector of fixed width character elements
Using substring is the best approach: substring(x, seq(1, nchar(x), 2), seq(2, nchar(x), 2)) But here’s a solution with plyr: library("plyr") laply(seq(1, nchar(x), 2), function(i) substr(x, i, i+1))
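The substring approach can be wrapped in a small helper. This is a sketch under stated assumptions: chop_fixed is a hypothetical name, and the stop positions are computed as start + width - 1 rather than with a second seq() call, so that a string whose length is not a multiple of the width still yields a shorter final piece.

```r
# Hypothetical helper: chop a single string into pieces of `width` characters
chop_fixed <- function(x, width = 2) {
  stopifnot(length(x) == 1, width >= 1)
  if (nchar(x) == 0) return(character(0))
  # Start position of every piece: 1, 1 + width, 1 + 2 * width, ...
  starts <- seq(1, nchar(x), by = width)
  # substring() is vectorised over start/stop and truncates at the string end
  substring(x, starts, starts + width - 1)
}

even_pieces <- chop_fixed("abcdef")
odd_pieces  <- chop_fixed("abcde")
wide_pieces <- chop_fixed("abcdefg", width = 3)

print(even_pieces)
print(odd_pieces)
print(wide_pieces)
```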
Split delimited strings in a column and insert as new rows [duplicate]
As of December 2014, this can be done using the unnest function from Hadley Wickham’s tidyr package (see release notes http://blog.rstudio.org/2014/12/08/tidyr-0-2-0/) > library(tidyr) > library(dplyr) > mydf V1 V2 2 1 a,b,c 3 2 a,c 4 3 b,d 5 4 e,f 6 . . > mydf %>% mutate(V2 = strsplit(as.character(V2), ",")) %>% unnest(V2) V1 V2 …
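For comparison, the same reshaping can be sketched in base R without tidyr or dplyr. This is not the method from the answer: it is a dependency-free illustration of what splitting and unnesting do, using a small data frame modelled on the rows visible in the excerpt (the truncated rows are left out rather than guessed).

```r
# Example data modelled on the visible rows of the excerpt
mydf <- data.frame(
  V1 = 1:4,
  V2 = c("a,b,c", "a,c", "b,d", "e,f"),
  stringsAsFactors = FALSE
)

# strsplit() returns a list: one character vector of pieces per input row
parts <- strsplit(as.character(mydf$V2), ",", fixed = TRUE)

# Repeat each V1 value once per piece, then flatten the pieces into a column
long <- data.frame(
  V1 = rep(mydf$V1, lengths(parts)),
  V2 = unlist(parts),
  stringsAsFactors = FALSE
)

print(long)
```

lengths(parts) gives the number of pieces per original row, which is what keeps V1 aligned with the flattened V2 values.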