Bigrams instead of single words in a term-document matrix using R and RWeka

Inspired by Anthony’s comment, I found out that you can set the default number of threads that the parallel library uses. Set it before you call NGramTokenizer:

# Sets the default number of threads to use
options(mc.cores=1)

NGramTokenizer seems to hang on the parallel::mclapply call, so limiting it to a single thread seems to work around the problem.
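For context, here is a minimal sketch of where that option would sit in a bigram workflow. It assumes the tm and RWeka packages are installed and that Java is available for RWeka. The two-document corpus and the `BigramTokenizer` helper name are made up for illustration and are not taken from the original question.

```r
library(tm)
library(RWeka)

# Restrict parallel::mclapply to a single worker before any tokenizer runs.
options(mc.cores = 1)

# Hypothetical two-document corpus, for illustration only.
docs <- c("text mining with the tm package",
          "bigrams instead of single words")
corpus <- VCorpus(VectorSource(docs))

# Helper that asks NGramTokenizer for two-word n-grams only.
BigramTokenizer <- function(x) {
  NGramTokenizer(x, Weka_control(min = 2, max = 2))
}

# Build the term-document matrix with bigrams as terms.
tdm <- TermDocumentMatrix(corpus, control = list(tokenize = BigramTokenizer))
inspect(tdm)
```

Whether the hang occurs at all may depend on your tm version and platform, so treat the `options(mc.cores = 1)` line as a workaround to try rather than a required step.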
