Anyone know of some good Word Sense Disambiguation software? [closed]

My list is not exhaustive, and Googling will surely turn up more that suit your purposes. For software, here’s a short list; remember to cite the relevant sources!

GWSD: Unsupervised Graph-based Word Sense Disambiguation, http://lit.csci.unt.edu/~rada/downloads/GWSD/GWSD.1.0.tar.gz
SenseLearner: All-Words Word Sense Disambiguation Tool, http://lit.csci.unt.edu/~rada/downloads/senselearner/SenseLearner2.0.tar.gz
KYOTO UKB graph-based WSD, http://ixa2.si.ehu.es/ukb/
pyWSD: Python Implementation of Simple WSD algorithms, https://github.com/alvations/pywsd
…

TreeTagger installation successful but cannot open .par file

I think there are two problems. First, the scripts should have “-utf8” in their name, e.g. cmd/tagger-chunker-german-utf8, because you downloaded the UTF-8 data. Second, tagging and chunking each require their own data file. See the homepage, which has the sections “Parameter files for PC” and “Chunker parameter files for PC”; download the files from both …

Saving nltk drawn parse tree to image file

Using the nltk.draw.tree.TreeView object to create the canvas frame automatically:

>>> from nltk.tree import Tree
>>> from nltk.draw.tree import TreeView
>>> t = Tree.fromstring('(S (NP this tree) (VP (V is) (AdjP pretty)))')
>>> TreeView(t)._cframe.print_to_file('output.ps')

Then:

>>> import os
>>> os.system('convert output.ps output.png')

LDA model generates different topics every time I train on the same corpus

Why do the same LDA parameters and corpus generate different topics every time? Because LDA uses randomness in both the training and inference steps. And how do I stabilize the topic generation? By resetting the numpy.random seed to the same value every time a model is trained or inference is performed, with numpy.random.seed: SOME_FIXED_SEED = 42 # …
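To make the seeding advice concrete, here is a minimal sketch that assumes only NumPy is installed. The function toy_random_init is a hypothetical stand-in for the random initialisation an LDA trainer performs internally; it is not part of any LDA library. The point is only that resetting numpy.random.seed before each run reproduces the same random draws, while skipping the reset does not.

```python
import numpy as np

SOME_FIXED_SEED = 42


def toy_random_init(num_topics, vocab_size):
    # Hypothetical stand-in for an LDA trainer's random initialisation:
    # draws a topic-by-word matrix from NumPy's global random state.
    return np.random.gamma(100.0, 1.0 / 100.0, (num_topics, vocab_size))


# Reset the global seed before each "training run".
np.random.seed(SOME_FIXED_SEED)
first = toy_random_init(3, 5)

np.random.seed(SOME_FIXED_SEED)
second = toy_random_init(3, 5)

# No reset here, so the global random state has moved on.
third = toy_random_init(3, 5)

same_after_reseed = np.array_equal(first, second)
same_without_reseed = np.array_equal(second, third)
print(same_after_reseed, same_without_reseed)
```

With a real LDA library the same pattern would apply: call numpy.random.seed(SOME_FIXED_SEED) immediately before each training or inference call, provided that library draws from NumPy's global random state.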

Difference between Python’s collections.Counter and nltk.probability.FreqDist

nltk.probability.FreqDist is a subclass of collections.Counter. From the docs: A frequency distribution for the outcomes of an experiment. A frequency distribution records the number of times each outcome of an experiment has occurred. For example, a frequency distribution could be used to record the frequency of each word type in a document. Formally, a frequency …
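To illustrate what being a subclass of collections.Counter means in practice, here is a small standard-library sketch. TinyFreqDist is a hypothetical toy class, not NLTK's implementation; it only mimics two methods that FreqDist provides on top of Counter, N() (total number of outcomes) and freq() (relative frequency of one outcome). Everything else is inherited from Counter unchanged.

```python
from collections import Counter


class TinyFreqDist(Counter):
    """Toy illustration of a Counter subclass (hypothetical, not NLTK code)."""

    def N(self):
        # Total number of recorded outcomes, i.e. the sum of all counts.
        return sum(self.values())

    def freq(self, sample):
        # Relative frequency of one outcome; 0 for an empty distribution.
        total = self.N()
        return self[sample] / total if total else 0


words = "the cat sat on the mat the end".split()
fd = TinyFreqDist(words)

print(isinstance(fd, Counter))  # Counter behaviour is inherited
print(fd.most_common(1))        # inherited Counter method
print(fd.N(), fd.freq("the"))   # methods added by the subclass
```

With NLTK installed, the real class is imported with from nltk.probability import FreqDist and is constructed the same way, from any iterable of samples.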