TreeTagger installation successful but cannot open .par file

I think there are two problems. First, the scripts should have “-utf8” in their names, e.g. cmd/tagger-chunker-german-utf8, because you downloaded the UTF-8 data. Second, tagging and chunking each require their own data file. See the homepage, which has the sections “Parameter files for PC” and “Chunker parameter files for PC”; download the files from both …

Stemmers vs Lemmatizers

Q1: “[..] are English stemmers any useful at all today? Since we have a plethora of lemmatization tools for English” Yes. Stemmers are much simpler, smaller, and usually faster than lemmatizers, and for many applications their results are good enough. Using a lemmatizer in those cases is a waste of resources. Consider, for example, dimensionality reduction …
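
To make the "simpler and smaller" point concrete, here is a minimal sketch of a suffix-stripping stemmer. It is a toy written for illustration, not the Porter algorithm or any library's implementation; the suffix list, the `toy_stem` name, and the `min_stem_len` parameter are all assumptions of this example. It needs no dictionary and no part-of-speech information, which is roughly why stemmers tend to be cheap, and it shows how several surface forms collapse into one feature.

```python
# Toy suffix-stripping stemmer: an illustrative sketch only.
# The suffix list below is a hypothetical, deliberately tiny rule set.
SUFFIXES = ("ions", "ion", "ing", "ed", "s")  # longer suffixes are listed first


def toy_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word  # no rule applied


tokens = ["connect", "connected", "connecting", "connection", "connections"]

raw_vocabulary = set(tokens)
stemmed_vocabulary = {toy_stem(t) for t in tokens}

print(len(raw_vocabulary))   # 5
print(stemmed_vocabulary)    # {'connect'}
```

Five distinct tokens become a single vocabulary entry. The stems a real stemmer produces are often not dictionary words, which is fine when they only serve as feature keys; a lemmatizer is the better fit when valid dictionary forms are actually required.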