Inspired by Anthony’s comment, I found out that you can specify the number of threads that the parallel library uses by default (set it before you call NGramTokenizer):
# Sets the default number of threads to use
options(mc.cores=1)
Since NGramTokenizer seems to hang on the parallel::mclapply call, setting the default number of threads to 1 appears to work around it.
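To check that the option is actually picked up, here is a minimal sketch that uses only the parallel package (no RWeka or tm needed). It relies on the mc.cores argument of mclapply defaulting to getOption("mc.cores", 2L). The variable names old_opts, squares, and cores_in_use are just illustrative.

```r
library(parallel)

# Set the default and remember the previous setting so it can be restored
old_opts <- options(mc.cores = 1)

# mclapply's mc.cores argument defaults to getOption("mc.cores", 2L),
# so with the option set to 1 this call runs in the current process
squares <- mclapply(1:4, function(x) x^2)

# Confirm the value that mclapply picked up
cores_in_use <- getOption("mc.cores")

# Restore the previous setting
options(old_opts)
```

If the hang goes away with this setting, that at least suggests the forking in mclapply is involved. You could also wrap the tokenizer call between the two options() calls to limit the change to that one step.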