How do I do dependency parsing in NLTK?

You can use the Stanford Parser through NLTK.

Requirements

You need to download two things from their website:

  1. The Stanford CoreNLP parser.
  2. The language model for your desired language (e.g. the English language model).

Warning!

Make sure that your language model version matches your Stanford CoreNLP parser version!

The current CoreNLP version as of May 22, 2018 is 3.9.1.

After downloading the two files, extract the zip file anywhere you like.
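
Before going further, it can help to confirm that both jar files really are where you think they are, since a wrong path is an easy mistake to make here. The following is only a minimal sketch: missing_files is a hypothetical helper written for this answer (it is not part of NLTK), and the paths are the same placeholders used in the code further down.

```python
import os


def missing_files(paths):
    """Return the subset of paths that do not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


# Placeholder locations; substitute the folder where you extracted the download.
jars = [
    "path_to/stanford-parser-full-2014-08-27/stanford-parser.jar",
    "path_to/stanford-parser-full-2014-08-27/stanford-parser-3.4.1-models.jar",
]

for path in missing_files(jars):
    print("Not found:", path)
```

If nothing is printed, both files were found.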

Python Code

Next, load the model and use it through NLTK:

from nltk.parse.stanford import StanfordDependencyParser

path_to_jar="path_to/stanford-parser-full-2014-08-27/stanford-parser.jar"
path_to_models_jar="path_to/stanford-parser-full-2014-08-27/stanford-parser-3.4.1-models.jar"

dependency_parser = StanfordDependencyParser(path_to_jar=path_to_jar, path_to_models_jar=path_to_models_jar)

result = dependency_parser.raw_parse('I shot an elephant in my sleep')
dep = next(result)  # raw_parse returns an iterator; result.next() only works on Python 2

list(dep.triples())

Output

The output of the last line is shown below as printed by Python 2 (Python 3 prints the same triples without the u prefixes):

[((u'shot', u'VBD'), u'nsubj', (u'I', u'PRP')),
 ((u'shot', u'VBD'), u'dobj', (u'elephant', u'NN')),
 ((u'elephant', u'NN'), u'det', (u'an', u'DT')),
 ((u'shot', u'VBD'), u'prep', (u'in', u'IN')),
 ((u'in', u'IN'), u'pobj', (u'sleep', u'NN')),
 ((u'sleep', u'NN'), u'poss', (u'my', u'PRP$'))]

I think this is what you want.
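
If you then want to do something with the triples, plain Python is enough. Each triple has the shape ((head, head_tag), relation, (dependent, dependent_tag)). Below is a small sketch, assuming you already have the list from the output above (copied here by hand so the snippet runs without the parser). The helper names dependents_by_head and find_relation are my own and not part of NLTK.

```python
from collections import defaultdict

# Triples copied from the parser output: ((head, head_tag), relation, (dependent, dependent_tag)).
triples = [
    (('shot', 'VBD'), 'nsubj', ('I', 'PRP')),
    (('shot', 'VBD'), 'dobj', ('elephant', 'NN')),
    (('elephant', 'NN'), 'det', ('an', 'DT')),
    (('shot', 'VBD'), 'prep', ('in', 'IN')),
    (('in', 'IN'), 'pobj', ('sleep', 'NN')),
    (('sleep', 'NN'), 'poss', ('my', 'PRP$')),
]


def dependents_by_head(triples):
    """Map each head word to a list of (relation, dependent word) pairs."""
    grouped = defaultdict(list)
    for (head, _head_tag), relation, (dependent, _dep_tag) in triples:
        grouped[head].append((relation, dependent))
    return dict(grouped)


def find_relation(triples, relation):
    """Return (head word, dependent word) pairs for one relation label."""
    return [(head[0], dep[0]) for head, rel, dep in triples if rel == relation]


by_head = dependents_by_head(triples)
print(by_head['shot'])                   # [('nsubj', 'I'), ('dobj', 'elephant'), ('prep', 'in')]
print(find_relation(triples, 'nsubj'))   # [('shot', 'I')]
```

Grouping by head word makes it easy to ask "what hangs off this verb?", while filtering by relation label answers questions like "what is the subject?".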
