How can I easily draw a parse tree from Stanford parsing data in Python?
Question:
So I have this Stanford-style parse of an English sentence:
"There is a tree behind a car"
Parse: [S [NP There_EX NP] [VP is_VBZ [NP [NP a_DT tree_NN NP] [PP behind_IN [NP a_DT car_NN NP] PP] NP] VP] S]
I want to use one of the tree-drawing methods in Python to draw a parse tree from the data.
Is there an easy way to use that parse representation to draw a tree with Python, or should I change the representation somehow?
Answers:
Convert the parse into a representation that Graphviz understands (the DOT language), then pass that representation to Graphviz. There's also an interfacing library called pygraphviz.
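As a rough sketch of that conversion, the snippet below walks the bracketed parse token by token and emits DOT text using only the standard library. It assumes the format shown in the question: `[LABEL` opens a constituent, `LABEL]` closes it, and every leaf is written as `word_TAG`. The function name `parse_to_dot` and the node ids `n0`, `n1`, ... are made up for illustration.

```python
from itertools import count

PARSE = ("[S [NP There_EX NP] [VP is_VBZ [NP [NP a_DT tree_NN NP] "
         "[PP behind_IN [NP a_DT car_NN NP] PP] NP] VP] S]")


def parse_to_dot(parse: str) -> str:
    """Convert a '[LABEL ... LABEL]' style parse into Graphviz DOT source."""
    ids = count()
    lines = ["digraph parse_tree {", "  node [shape=plaintext];"]
    stack = []  # ids of the constituents that are currently open

    def add_node(label: str) -> str:
        # Create a node and attach it to the innermost open constituent.
        node_id = f"n{next(ids)}"
        escaped = label.replace('"', '\\"')
        lines.append(f'  {node_id} [label="{escaped}"];')
        if stack:
            lines.append(f"  {stack[-1]} -> {node_id};")
        return node_id

    for token in parse.split():
        if token.startswith("["):      # opening bracket, e.g. "[NP"
            stack.append(add_node(token[1:]))
        elif token.endswith("]"):      # closing bracket, e.g. "NP]"
            stack.pop()
        else:                          # leaf, e.g. "tree_NN"
            word, sep, tag = token.rpartition("_")
            if not sep:
                raise ValueError(f"expected word_TAG, got {token!r}")
            stack.append(add_node(tag))
            add_node(word)
            stack.pop()

    lines.append("}")
    return "\n".join(lines)


dot_source = parse_to_dot(PARSE)
print(dot_source)
```

If the output is saved as `tree.dot`, the Graphviz command line tool can render it with `dot -Tpng tree.dot -o tree.png`. The same DOT text can also be handed to pygraphviz.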
NLTK has a tree module (nltk.tree). You can use it to parse the representation you get out of Stanford (see this related question). Then you can call the draw() method of nltk.tree.Tree to display it.
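The parse in the question repeats the label at each closing bracket, so it probably has to be rewritten into the usual Penn-style `(LABEL child ...)` form before `Tree.fromstring` can read it. Here is a minimal sketch of that step, assuming every leaf is written as `word_TAG`. The helper names `stanford_to_penn` and `draw_with_nltk` are hypothetical, and `draw()` needs NLTK installed plus a working Tkinter display.

```python
PARSE = ("[S [NP There_EX NP] [VP is_VBZ [NP [NP a_DT tree_NN NP] "
         "[PP behind_IN [NP a_DT car_NN NP] PP] NP] VP] S]")


def stanford_to_penn(parse: str) -> str:
    """Rewrite '[NP a_DT tree_NN NP]' as '(NP (DT a) (NN tree))'."""
    out = []
    for token in parse.split():
        if token.startswith("["):      # "[NP" opens a constituent
            out.append("(" + token[1:])
        elif token.endswith("]"):      # "NP]" closes it
            out.append(")")
        else:                          # "tree_NN" is a tagged word
            word, sep, tag = token.rpartition("_")
            if not sep:
                raise ValueError(f"expected word_TAG, got {token!r}")
            out.append(f"({tag} {word})")
    # Remove the space that the join leaves before closing parentheses.
    return " ".join(out).replace(" )", ")")


def draw_with_nltk(parse: str) -> None:
    """Parse the converted string with NLTK and display the tree."""
    from nltk.tree import Tree  # imported lazily; requires nltk

    tree = Tree.fromstring(stanford_to_penn(parse))
    tree.pretty_print()  # ASCII rendering in the terminal
    tree.draw()          # opens a Tkinter window


penn = stanford_to_penn(PARSE)
print(penn)
```

Calling `draw_with_nltk(PARSE)` should then print an ASCII version of the tree and open the graphical viewer.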
To draw the parse tree (and export the resulting visualization), you can alternatively use the Constituent Treelib library (pip install constituent-treelib).
Here are the necessary steps:
from constituent_treelib import ConstituentTree
# First, define the sentence that should be parsed
sentence = "There is a tree behind a car"
# Define the considered language with respect to the underlying spaCy and benepar models
language = ConstituentTree.Language.English
# Specify the desired model for this language ("Small" is selected by default)
spacy_model_size = ConstituentTree.SpacyModelSize.Medium
# Create the NLP pipeline required to instantiate a ConstituentTree object
nlp = ConstituentTree.create_pipeline(language, spacy_model_size)
# If you wish, instruct the library to download and install the models automatically
# nlp = ConstituentTree.create_pipeline(language, spacy_model_size, download_models=True)
# Instantiate a ConstituentTree object and pass it the sentence to parse as well as the NLP pipeline
tree = ConstituentTree(sentence, nlp)
# Finally, draw the tree and export its visualization to a PDF file
tree.export_tree("my_tree.pdf")
The result is the rendered parse tree, exported to my_tree.pdf.