R/regex with stringi/ICU: why is a ‘+’ considered a non-[:punct:] character?

POSIX character classes need to be wrapped inside a regex character class (square brackets); the correct form is [[:punct:]]. Do not confuse the POSIX term “character class” with what is normally called a regex character class. In the ASCII range, this POSIX named class matches every character that is not a control, alphanumeric, or space character.

    ascii <- rawToChar(as.raw(0:127), multiple = TRUE)
    paste(ascii[grepl('[[:punct:]]', ascii)], …
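As a rough illustration of the point above, the following sketch checks whether ‘+’ falls in [[:punct:]] under base R's default regex engine and, if the stringi package happens to be installed, lists the ASCII characters that base R matches but ICU does not. The variable names (base_punct, icu_punct, plus_in_base) are made up for this example, and the range starts at 1 rather than 0 only to sidestep the NUL byte. Results from the base R engine can depend on the platform and locale, so treat this as a sketch rather than a definitive test.

```r
# ASCII code points 1..127 as single-character strings (NUL is skipped here).
ascii <- rawToChar(as.raw(1:127), multiple = TRUE)

# Base R's default regex engine: characters matched by the POSIX class.
base_punct <- ascii[grepl("[[:punct:]]", ascii)]
plus_in_base <- "+" %in% base_punct
print(plus_in_base)

# ICU via stringi, only if the package is available (an assumption, not a given).
if (requireNamespace("stringi", quietly = TRUE)) {
  icu_punct <- ascii[stringi::stri_detect_regex(ascii, "[[:punct:]]")]
  # Characters that base R treats as punctuation but ICU does not.
  print(setdiff(base_punct, icu_punct))
}
```

If the second part prints symbols such as ‘+’, that would be consistent with ICU classifying them as Unicode symbols rather than punctuation; a pattern like "[\\p{P}\\p{S}]" is one way to cover both categories in stringi.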