The WordNet database records many kinds of relationships between words: it can place words in category hierarchies, list the parts of an object, and answer many other questions about how words relate.

The code below relies on the NLTK and NetworkX libraries for Python.

Categorizing words

What, exactly, is a dog? It's a domestic animal and a carnivore, not to mention a physical entity (as opposed to an abstract entity, such as an idea). WordNet knows all these facts:

How do we generate this image? First, we look up the first entry for "dog" in WordNet. This returns a "synset", or a set of words with equivalent meanings.

dog = wn.synset('dog.n.01')

Next, we compute the transitive closure of the hypernym relationship, or (in English) we look for all the categories to which "dog" belongs, and all the categories to which those categories belong, recursively:

graph = closure_graph(dog, lambda s: s.hypernyms())
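To make the idea of a transitive closure concrete without needing the WordNet data, here is a minimal sketch on a hand-written toy hierarchy. The TOY_HYPERNYMS table and the closure_edges helper are made up for illustration; they are not part of NLTK, and the real hypernym chain for "dog" is considerably longer.

```python
# Toy stand-in for WordNet: each word maps to its direct hypernyms.
# This table is invented for illustration; it is not real WordNet data.
TOY_HYPERNYMS = {
    "dog": ["canine", "domestic_animal"],
    "canine": ["carnivore"],
    "carnivore": ["animal"],
    "domestic_animal": ["animal"],
    "animal": ["physical_entity"],
    "physical_entity": [],
}


def closure_edges(start, fn):
    """Collect every (child, parent) edge reachable from start via fn."""
    seen = set()      # nodes already expanded
    edges = set()     # (child, parent) pairs found so far
    stack = [start]   # nodes still waiting to be expanded
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for parent in fn(node):
            edges.add((node, parent))
            stack.append(parent)
    return edges


edges = closure_edges("dog", lambda w: TOY_HYPERNYMS.get(w, []))
categories = sorted({parent for _, parent in edges})
print(categories)
```

Note that the closure records only direct links: "dog" reaches "animal" through two different paths, but there is no single edge from "dog" to "animal". The seen set is what keeps shared ancestors such as "animal" from being expanded twice.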

After that, we just pass the resulting graph to NetworkX for display:

nx.draw_graphviz(graph)
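The draw_graphviz call belongs to older NetworkX releases; as far as we know, it is no longer available from version 2.0 on. One fallback that avoids the drawing API entirely is to write the edges out as Graphviz DOT text and render that with the dot command-line tool. The to_dot helper below is a hypothetical sketch, not part of NetworkX. It accepts any iterable of (source, target) pairs, such as the result of graph.edges(), and the two edges passed in are sample data.

```python
def to_dot(edges, name="wordnet"):
    """Render (source, target) pairs as a Graphviz DOT digraph string."""
    def quote(label):
        # DOT string literals use double quotes; escape any embedded ones.
        return '"' + str(label).replace('"', '\\"') + '"'

    lines = ["digraph %s {" % name]
    # Sort the edges so the output is stable from run to run.
    for source, target in sorted(edges):
        lines.append("    %s -> %s;" % (quote(source), quote(target)))
    lines.append("}")
    return "\n".join(lines)


# Sample edges for illustration only.
dot_text = to_dot([("dog.n.01", "canine.n.02"), ("canine.n.02", "carnivore.n.01")])
print(dot_text)
```

Saving dot_text to a file such as dog.dot and running dot -Tpng dog.dot -o dog.png should then produce an image, assuming Graphviz is installed.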

The implementation

The closure_graph function calls fn on the supplied synset, then on every synset that fn returns, and uses the results to build a NetworkX graph. This code goes at the top of the file, so that wn and nx are available to your own code.

from nltk.corpus import wordnet as wn
import networkx as nx

def closure_graph(synset, fn):
    # Synsets already visited, so shared ancestors are expanded only once.
    seen = set()
    graph = nx.DiGraph()

    def recurse(s):
        if s not in seen:
            seen.add(s)
            # In NLTK 3 and later, name is a method: write s.name() instead.
            graph.add_node(s.name)
            for s1 in fn(s):
                graph.add_node(s1.name)
                graph.add_edge(s.name, s1.name)
                recurse(s1)

    recurse(synset)
    return graph

By using a high-quality graph library, we make it much easier to merge, analyze and display our graphs.

More graphs

Parts of the finger, generated with synset('finger.n.01') and part_meronyms: