Extract Word from Synset using Wordnet in NLTK 3.0

WordNet works fine in NLTK 3.0; you are just accessing the lemmas (and their names) in the wrong way. Try this instead:

```python
>>> import nltk
>>> nltk.__version__
'3.0.0'
>>> from nltk.corpus import wordnet as wn
>>> for synset in wn.synsets('dog'):
...     for lemma in synset.lemmas():
...         print(lemma.name())
dog
domestic_dog
Canis_familiaris
frump
dog
dog
cad
bounder
blackguard
...
```

… Read more

How to get synonyms from nltk WordNet Python

If you want the synonyms in the synset (i.e. the lemmas that make up the set), you can get them with lemma_names():

```python
>>> for ss in wn.synsets('small'):
...     print(ss.name(), ss.lemma_names())
small.n.01 ['small']
small.n.02 ['small']
small.a.01 ['small', 'little']
minor.s.10 ['minor', 'modest', 'small', 'small-scale', 'pocket-size', 'pocket-sized']
little.s.03 ['little', 'small']
small.s.04 ['small']
humble.s.01 ['humble', 'low', 'lowly', 'modest', 'small']
```

… Read more

Using NLTK and WordNet; how do I convert simple tense verb into its present, past or past participle form?

This can also be done with the help of NLTK: it can give the base form of the verb. It will not give you the exact tense, but it can still be useful. Try the following code:

```python
from nltk.stem.wordnet import WordNetLemmatizer

words = ['gave', 'went', 'going', 'dating']
for word in words:
    print(word + '-->' + WordNetLemmatizer().lemmatize(word, 'v'))
```

The output is:

    gave-->give
    went-->go
    going-->go
    dating-->date

Have … Read more

All synonyms for word in python? [duplicate]

Using wn.synset('dog.n.01').lemma_names() is the correct way to access the synonyms of a sense. This is because a word has many senses, and it is more appropriate to list the synonyms of a particular meaning/sense. To enumerate words with similar meanings, you could possibly also look at the hyponyms. Sadly, the size of WordNet is very limited so there … Read more

wordnet lemmatization and pos tagging in python

First of all, you can use nltk.pos_tag() directly without training it; the function loads a pretrained tagger from a file. You can see the file name with nltk.tag._POS_TAGGER:

```python
>>> nltk.tag._POS_TAGGER
'taggers/maxent_treebank_pos_tagger/english.pickle'
```

As it was trained on the Treebank corpus, it also uses the Treebank tag set. The following function would map the Treebank tags … Read more
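The truncated mapping function can be sketched as follows. Treebank tags start with 'J' for adjectives, 'V' for verbs, 'N' for nouns and 'R' for adverbs, while WordNet's pos constants (`wordnet.ADJ`, `wordnet.VERB`, `wordnet.NOUN`, `wordnet.ADV`) are the single letters 'a', 'v', 'n' and 'r'. A sketch (`treebank_to_wordnet` is a hypothetical name):

```python
def treebank_to_wordnet(tag):
    """Map a Penn Treebank POS tag to a WordNet pos letter.

    WordNet uses 'a' (ADJ), 'v' (VERB), 'n' (NOUN) and 'r' (ADV); these
    are the values of wordnet.ADJ, wordnet.VERB, wordnet.NOUN, wordnet.ADV.
    Returns None for tags with no WordNet counterpart (e.g. 'IN', 'DT').
    """
    if tag.startswith('J'):
        return 'a'
    if tag.startswith('V'):
        return 'v'
    if tag.startswith('N'):
        return 'n'
    if tag.startswith('R'):
        return 'r'
    return None

print(treebank_to_wordnet('VBD'))  # 'v'
```

The returned letter can be passed straight to WordNetLemmatizer().lemmatize(word, pos).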

Convert words between verb/noun/adjective forms

This is more of a heuristic approach. I have just coded it, so apologies for the style. It uses derivationally_related_forms() from WordNet. I have implemented nounify; I guess verbify works analogously. From what I've tested, it works pretty well:

```python
from nltk.corpus import wordnet as wn

def nounify(verb_word):
    """ Transform a verb to the closest noun: die ...
```

… Read more

Stemmers vs Lemmatizers

Q1: "[..] are English stemmers any useful at all today? Since we have a plethora of lemmatization tools for English"

Yes. Stemmers are much simpler, smaller, and usually faster than lemmatizers, and for many applications their results are good enough. Using a lemmatizer for that is a waste of resources. Consider, for example, dimensionality reduction … Read more

How to check if a word is an English word with Python?

For (much) more power and flexibility, use a dedicated spellchecking library like PyEnchant. There's a tutorial, or you could just dive straight in:

```python
>>> import enchant
>>> d = enchant.Dict("en_US")
>>> d.check("Hello")
True
>>> d.check("Helo")
False
>>> d.suggest("Helo")
['He lo', 'He-lo', 'Hello', 'Helot', 'Help', 'Halo', 'Hell', 'Held', 'Helm', 'Hero', "He'll"]
```

PyEnchant comes with a … Read more