A prefix tree (also known as a trie) is a data structure that helps you organize a word list & quickly find words that start with a specific prefix.

For example, you can find all the words in your dictionary that start with the letters “ca”, such as “cat” or “cape”.

Look at this picture:

This is a prefix tree.

You can follow from the root ( * ) to a marked node (like e and t ) to find words.

In this article you will learn how to implement your own prefix tree in Ruby & how to use it to solve problems!

Prefix Tree Implementation

To implement this in Ruby I decided to use a Node class with a few attributes:

The value (one character)

The “word” variable. A true / false value which tells you if this is a valid word

The “next” array. This stores all the characters (as Node objects) that come after this one in the tree

Here’s the code:

class Node attr_reader :value, :next attr_accessor :word def initialize(value) @value = value @word = false @next = [] end end

Now we need a class to hold the root node & the methods for working with these nodes.

Let’s look at the Trie class:

class Trie def initialize @root = Node.new("*") end end

Inside this class we have the following methods:

def add_word(word) letters = word.chars base = @root letters.each { |letter| base = add_character(letter, base.next) } base.word = true end def find_word(word) letters = word.chars base = @root word_found = letters.all? { |letter| base = find_character(letter, base.next) } yield word_found, base if block_given? base end

Both methods break down a given word into an array of characters (using the chars method).

Then we navigate the tree starting at the root & either find a character or add it.

Here are the supporting methods (also inside the Trie class):

def add_character(character, trie) trie.find { |n| n.value == character } || add_node(character, trie) end def find_character(character, trie) trie.find { |n| n.value == character } end def add_node(character, trie) Node.new(character).tap { |new_node| trie << new_node } end

To add a character we check if it already exists (using the find method). If it does then we return the node.

Otherwise we create it & return the new node.

Then we also have the include? method:

def include?(word) find_word(word) { |found, base| return found && base.word } end

Now we are ready to start using our new data structure and see what we can do with it 🙂

Trie Uses & Examples

Let's start by adding some words to our tree:

trie = Trie.new trie.add_word("cat") trie.add_word("cap") trie.add_word("cape") trie.add_word("camp")

You can check if a word is included in the tree like this:

p trie.include?("cape") # true p trie.include?("ca") # false

So what are some uses for this data structure?

Solving word games

Spell checking

Autocomplete

You will need a good dictionary to load into your tree.

I found these which may be useful to you:

Finding Prefixed Words

So in the code example I showed you before we have implemented the add & find operations.

But we also want a find_words_starting_with method.

We can do this using the "Depth First Search" (DFS) algorithm. We also need a way to keep track of the word we are looking at.

Remember that our nodes only have one character each, so we need to reconstruct the actual strings by walking over the tree.

Here's a method that does all of that:

def find_words_starting_with(prefix) stack = [] words = [] prefix_stack = [] stack << find_word(prefix) prefix_stack << prefix.chars.take(prefix.size-1) return [] unless stack.first until stack.empty? node = stack.pop prefix_stack.pop and next if node == :guard_node prefix_stack << node.value stack << :guard_node words << prefix_stack.join if node.word node.next.each { |n| stack << n } end words end

We use two stacks here, one for keeping track of unvisited nodes ( stack ) & another to keep track of the current string ( prefix_stack ).

We loop until we have visited all the nodes & add the value of the node to the prefix_stack . Each node holds the value for only one character, so we need to collect these characters to form a word.

The :guard_node symbol is added to the prefix_stack so we know when we are backtracking. We need this to remove characters from our string buffer ( prefix_stack ) at the right time.

Then if node.word is true it means we found a full word & we add it to our list of words.

Example of using this method:

t.find_words_starting_with("cap") # ["cap", "cape"]

If no words are found you will get an empty array:

t.find_words_starting_with("b") # []

This method can be used to implement an autocomplete feature.

Summary

You learned about prefix trees (also known as tries), a data structure used to organize a list of words into a tree. You can quickly query this tree to check if a word is valid & to find words that share the same prefix.

Don't forget to share this post so more people can learn!