A trie (also known as a prefix tree) is a special type of tree used to store associative data, where the keys are usually strings.

A trie (pronounced "try") gets its name from retrieval; its structure makes it a stellar fit for matching strings.

Context

Write your own shuffle method to randomly shuffle characters in a string. Use the words text file, located at /usr/share/dict/words, and your shuffle method to create an anagram generator that only produces real words. Given a string as a command line argument, print one of its anagrams.

I was presented with this challenge this week at Make School’s Product Academy.

The words in the text file are separated by new lines. Its formatting makes it a lot easier to put the words into a data structure. For now, I’m storing them in a list — each element being a single word from the file.
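Here is a minimal sketch of that loading step, assuming one word per line. The name load_words is just illustrative, not something from the challenge.

```python
def load_words(path="/usr/share/dict/words"):
    """Read a newline-separated word file into a list, one word per element."""
    with open(path) as words_file:
        # splitlines() drops the trailing newline from every word
        return words_file.read().splitlines()
```

Calling load_words() with no arguments reads the system word file, if it exists on your machine.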

One approach to this challenge is to:

1. Randomly shuffle the characters in the string.

2. Check the result against all the words in /usr/share/dict/words to verify that it’s a real word.

However, this approach requires that I check whether the randomly shuffled string matches one of the 235,887 words in that file. That means up to 235,887 comparisons for each string that I want to verify as a real word.
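To make the cost concrete, here is a rough sketch of that naive approach. The names shuffle_characters and is_real_word are placeholders of mine, and the shuffle is a standard Fisher-Yates shuffle, which is one reasonable way to write it rather than the required one.

```python
import random


def shuffle_characters(string):
    """Return the characters of string in a random order (Fisher-Yates shuffle)."""
    characters = list(string)
    # Walk backwards, swapping each position with a random position at or before it.
    for i in range(len(characters) - 1, 0, -1):
        j = random.randint(0, i)  # randint includes both endpoints
        characters[i], characters[j] = characters[j], characters[i]
    return "".join(characters)


def is_real_word(candidate, words):
    """Linear scan: compare candidate against each word until one matches."""
    for word in words:
        if word == candidate:
            return True
    return False
```

When the shuffled string is not a word, is_real_word has to look at every element of the list before it can say no.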

This was an unacceptable solution for me. I first looked up libraries that had already been implemented to check if words exist in a language, and found pyenchant. I first completed the challenge using the library, in a few lines of code.

    import math

    import enchant  # provided by the pyenchant package


    def generateAnagram(string, language="en_US"):
        languageDict = enchant.Dict(language)
        numOfPossibleCombinationsForString = math.factorial(len(string))
        # Try as many random shuffles as the string has permutations.
        for i in range(0, numOfPossibleCombinationsForString):
            # shuffleCharactersOf is my own shuffle method from the challenge.
            wordWithShuffledCharacters = shuffleCharactersOf(string)
            if languageDict.check(wordWithShuffledCharacters):
                return wordWithShuffledCharacters
        return "There is no anagram in %s for %s." % (language, string)

Using a couple of library functions in my code was a quick and easy solution. However, I didn’t learn much by finding a library to solve the problem for me.

I was positive that the library wasn’t using the approach I mentioned earlier. I was curious and dug through the source code — I found a trie.

Trie

A trie stores data in “steps”. Each step is a node in the trie.

Storing words is a perfect use case for this kind of tree, since there is a finite number of letters that can be put together to make a string.

Each step, or node, in a language trie represents one letter of a word. The steps begin to branch off when the order of the letters diverges from the other words in the trie, or when a word ends.
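A minimal sketch of such a trie might look like the following. The class and attribute names (TrieNode, Trie, children, is_word) are my own choices for illustration, and this is not how any particular library implements it.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps a letter to the next step (TrieNode)
        self.is_word = False  # True when a word ends at this step


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for letter in word:
            # Create the next step only if no stored word has used it yet.
            if letter not in node.children:
                node.children[letter] = TrieNode()
            node = node.children[letter]
        node.is_word = True

    def contains(self, word):
        node = self.root
        for letter in word:
            node = node.children.get(letter)
            if node is None:
                return False  # the path branches away from every stored word
        return node.is_word
```

With this shape, checking a word takes at most one step per letter, no matter how many words the trie holds.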

I created a trie out of directories on my Desktop to visualize stepping down through nodes. This is a trie that contains two words: apple and app.
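The same picture can be sketched in text, assuming nested dictionaries stand in for the folders. The END marker and the build_trie and render helpers are hypothetical names I made up for this illustration.

```python
END = "$"  # hypothetical marker meaning "a word ends at this step"


def build_trie(words):
    """Build a trie out of nested dictionaries, one dictionary per step."""
    root = {}
    for word in words:
        node = root
        for letter in word:
            node = node.setdefault(letter, {})
        node[END] = True
    return root


def render(node, depth=0):
    """Return one line per step, indented like nested folders."""
    lines = []
    for key in sorted(node):
        if key == END:
            lines.append("  " * depth + "(word ends here)")
        else:
            lines.append("  " * depth + key + "/")
            lines.extend(render(node[key], depth + 1))
    return lines


print("\n".join(render(build_trie(["apple", "app"]))))
```

Both words share the steps a, p, p. The end marker sits directly under the second p for app, while apple keeps stepping down through l and e before it ends.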