Using Kruskal's algorithm to generate random spanning trees (or mazes)

For the third article in my series on maze algorithms, I’m going to take a look at Kruskal’s algorithm. (I’ve previously covered recursive backtracking and Eller’s algorithm.)

Kruskal’s algorithm is a method for producing a minimum spanning tree from a weighted graph. The algorithm I’ll cover here is actually a randomized version of Kruskal’s; the original works something like this:

1. Throw all of the edges in the graph into a big burlap sack. (Or, you know, a set or something.)
2. Pull out the edge with the lowest weight. If the edge connects two disjoint trees, join the trees. Otherwise, throw that edge away.
3. Repeat until there are no more edges left.

The randomized algorithm just changes the second step, so that instead of pulling out the edge with the lowest weight, you remove an edge from the bag at random. Making that change, the algorithm now produces a fairly convincing maze.
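To make the difference between the two variants concrete, here is a minimal sketch rather than a definitive implementation. The `kruskal` helper, its `random:` parameter, the `[weight, a, b]` edge format, and the tiny four-node graph are all assumptions made up for illustration. Sets are tracked with a plain array of set ids, which is simple but not fast.

```ruby
# Hypothetical helper: edges are [weight, a, b] triples over nodes 0...node_count.
def kruskal(node_count, edges, random: nil)
  # With an RNG the bag is drawn in random order; otherwise lowest weight first.
  bag = random ? edges.shuffle(random: random) : edges.sort_by(&:first)
  set_of = (0...node_count).to_a # every node starts in its own set
  tree = []

  bag.each do |weight, a, b|
    next if set_of[a] == set_of[b] # both ends already in one tree: discard

    from, to = set_of[b], set_of[a]
    set_of.map! { |id| id == from ? to : id } # join the two trees
    tree << [weight, a, b]
  end
  tree
end

# Made-up example graph with four nodes and five weighted edges.
edges = [[1, 0, 1], [4, 0, 2], [2, 1, 2], [5, 1, 3], [3, 2, 3]]

p kruskal(4, edges)                                # lowest weight first
p kruskal(4, edges, random: Random.new(7)).length  # random draw, still 3 edges
```

Either way the result has one edge fewer than there are nodes; only the order in which edges leave the bag changes.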

Let’s walk through an example manually, to see how the process works in practice.

An example

For this example, I’ll use a simple 3×3 grid. I’ve assigned each cell a letter, indicating which set it belongs to.

    A B C
    D E F
    G H I

The algorithm is straightforward: simply select an edge at random, and join the cells it connects if they are not already connected by a path. We know the cells are already connected if they are in the same set. So, let’s choose the edge between (2,2) and (2,3), where coordinates are (column, row) counting from 1. The cells are in different sets, so we join the two into a single set and connect the cells:

    A B C
    D E F
    G E I

Let’s do a few more passes of the algorithm, to get to the interesting part:

    A A C
    D E F
    G E I

    A A C
    D E C
    G E I

    A A C
    D E C
    G E E

    A A C
    D E C
    E E E

Here’s where things start to move fast. Note what happens when the edge between (2,1) and (2,2) is pulled from the bag:

    A A C
    D A C
    A A A

The two trees, A and E, were joined into one set, A, implying that any cell in A is reachable from any other cell in A. Let’s try joining (1,2) and (1,3) now:

    A A C
    A A C
    A A A

Now, consider the edges (1,1)–(1,2) and (1,2)–(2,2). Neither of these has been drawn from the bag yet. What would happen if one of them were? Well, in both cases, the cells on either side of the edge belong to the same set. Connecting the cells in either case would result in a cycle, so we discard the edge and try again.

After one more pass, we’ll have:

    A A A
    A A A
    A A A

The algorithm finishes when there are no more edges to consider (which, in this case, is when there is only a single set left). And the result is a perfect maze!
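The walk-through can be replayed in a few lines of Ruby. This is only a sketch under some assumptions: the edge order is inferred from the grids above, the final edge is a guess (the text does not say which one was drawn), and keeping the alphabetically smaller letter on each merge is a rule chosen because it happens to reproduce the example.

```ruby
# Each cell stores the letter of the set it belongs to (rows are y, columns are x).
labels = [%w[A B C], %w[D E F], %w[G H I]]

# Edges as pairs of 1-based (x, y) cells, in the order the example draws them.
drawn = [
  [[2, 2], [2, 3]], [[1, 1], [2, 1]], [[3, 1], [3, 2]],
  [[2, 3], [3, 3]], [[1, 3], [2, 3]], [[2, 1], [2, 2]],
  [[1, 2], [1, 3]],
  [[1, 1], [1, 2]], # both cells already share a set, so this one is discarded
  [[2, 1], [3, 1]]  # assumed final edge; the text does not name it
]

discarded = 0
drawn.each do |(x1, y1), (x2, y2)|
  a = labels[y1 - 1][x1 - 1]
  b = labels[y2 - 1][x2 - 1]
  if a == b
    discarded += 1
    next
  end

  # Relabel every cell of one set. Keeping the smaller letter is an
  # assumption that matches the grids in the example.
  keep, drop = [a, b].minmax
  labels.each { |row| row.map! { |label| label == drop ? keep : label } }
end

labels.each { |row| puts row.join(" ") }
```

Note that every merge scans the whole grid to relabel cells.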

Implementation

Implementing Kruskal’s algorithm is straightforward, but for best results you need an efficient way to join sets. If you do it as I illustrated above, assigning a set identifier to each cell, you’ll need to visit every cell on each merge to relabel the absorbed set, which gets expensive. Using trees to represent the sets is much faster: you merge two sets simply by adding one tree as a subtree of the other, and you test whether two cells share a set by comparing the roots of their corresponding trees.
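A tree-based set along those lines might look like the sketch below. The class name `Tree` and the methods `connected?` and `connect` match the names used in the snippets that follow, but the body is an assumption, not necessarily the original implementation.

```ruby
# Hypothetical Tree class: a set is a tree of nodes linked by parent pointers.
class Tree
  attr_accessor :parent

  def initialize
    @parent = nil # a node without a parent is the root of its set
  end

  # Follow parent pointers up to the node that represents the whole set.
  def root
    node = self
    node = node.parent while node.parent
    node
  end

  # Two nodes share a set exactly when they have the same root.
  def connected?(other)
    root.equal?(other.root)
  end

  # Merge by hanging the other tree's root beneath this tree's root.
  def connect(other)
    return if connected?(other) # avoid making a root its own parent

    other.root.parent = root
  end
end

a, b, c = Tree.new, Tree.new, Tree.new
a.connect(b)
puts a.connected?(b) # true
puts a.connected?(c) # false
```

A merge now touches a single pointer instead of every cell. Standard refinements such as path compression and union by rank keep the trees shallow, but they are left out here for brevity.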

Once you have the tree data structure, the algorithm is extremely straightforward. Begin by initializing the grid (which will represent the maze itself), and the sets (one per cell):

    grid = Array.new(height) { Array.new(width, 0) }
    sets = Array.new(height) { Array.new(width) { Tree.new } }

Note that it would probably be more efficient to join the two representations, but I’ve split them apart for clarity.

Then, build the list of edges. Here I’m representing each edge as one of its end-points, and a direction:

    edges = []
    height.times do |y|
      width.times do |x|
        edges << [x, y, N] if y > 0
        edges << [x, y, W] if x > 0
      end
    end
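The snippets use `N`, `W`, `DX`, `DY`, and `OPPOSITE` without defining them. One plausible set of definitions is sketched below; the specific bit values are an assumption, chosen so that several directions can be OR-ed into a single cell. The sketch also shows that a 3×3 grid yields 12 edges, and how the other end-point is recovered from an edge.

```ruby
# Assumed definitions: one bit per direction, so a cell can record several exits.
N, S, E, W = 1, 2, 4, 8
DX = { N => 0, S => 0, E => 1, W => -1 }
DY = { N => -1, S => 1, E => 0, W => 0 }
OPPOSITE = { N => S, S => N, E => W, W => E }

width = height = 3
edges = []
height.times do |y|
  width.times do |x|
    edges << [x, y, N] if y > 0 # wall shared with the cell above
    edges << [x, y, W] if x > 0 # wall shared with the cell to the left
  end
end

puts edges.length # 12: six vertical neighbours plus six horizontal ones

# Recover the other end-point of the first edge, [1, 0, W].
x, y, direction = edges.first
other = [x + DX[direction], y + DY[direction]]
p other # [0, 0]
```

Listing only north and west edges is enough, because every interior wall is the north or west wall of exactly one cell.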

Once you have the list of edges, just sort them randomly:

    edges = edges.sort_by { rand }

The algorithm itself, then, is simply a process of looping until the set of edges is empty:

    until edges.empty?
      # ...
    end

Within the loop, we remove the next edge from the list, compute the other end point, and test their two sets:

    x, y, direction = edges.pop
    nx, ny = x + DX[direction], y + DY[direction]

    set1, set2 = sets[y][x], sets[ny][nx]

    unless set1.connected?(set2)
      # join the sets and connect the cells
    end

The joining and connecting bit is pretty straightforward:

    set1.connect(set2)
    grid[y][x] |= direction              # open the passage from this cell
    grid[ny][nx] |= OPPOSITE[direction]  # and from the neighbor's side
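Put together, the fragments above might be assembled as in the following sketch. It is not the complete implementation referred to in this article: the `kruskal_maze` method name, the direction constants, the bare-bones `Tree`, and the seeded `Random` are all assumptions added so the script runs as a single file. It shuffles with `Array#shuffle!` in place of `sort_by { rand }`.

```ruby
# Assumed direction flags and lookup tables (not defined in the snippets above).
N, S, E, W = 1, 2, 4, 8
DX = { N => 0, S => 0, E => 1, W => -1 }
DY = { N => -1, S => 1, E => 0, W => 0 }
OPPOSITE = { N => S, S => N, E => W, W => E }

# Bare-bones, hypothetical Tree: just enough for connected? and connect.
class Tree
  attr_accessor :parent

  def root
    parent ? parent.root : self
  end

  def connected?(other)
    root.equal?(other.root)
  end

  def connect(other)
    other.root.parent = root # callers check connected? first
  end
end

def kruskal_maze(width, height, rng = Random.new)
  grid = Array.new(height) { Array.new(width, 0) }
  sets = Array.new(height) { Array.new(width) { Tree.new } }

  edges = []
  height.times do |y|
    width.times do |x|
      edges << [x, y, N] if y > 0
      edges << [x, y, W] if x > 0
    end
  end
  edges.shuffle!(random: rng)

  until edges.empty?
    x, y, direction = edges.pop
    nx, ny = x + DX[direction], y + DY[direction]
    set1, set2 = sets[y][x], sets[ny][nx]
    next if set1.connected?(set2) # would create a cycle, so discard

    set1.connect(set2)
    grid[y][x] |= direction
    grid[ny][nx] |= OPPOSITE[direction]
  end
  grid
end

maze = kruskal_maze(4, 3, Random.new(1))
# Every carved passage sets one bit in each of the two cells it joins.
passages = maze.flatten.sum { |cell| cell.to_s(2).count("1") } / 2
puts passages # 11, i.e. width * height - 1, whatever the seed
```

The passage count is a quick sanity check: a perfect maze over `width * height` cells is a spanning tree, so it always has exactly one passage fewer than it has cells.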

And you’re done!

My complete implementation (in Ruby) is here:

Conclusion

Kruskal’s is a fun algorithm to implement and watch, but I’m not partial to the style of mazes it generates. It tends to create a lot of short dead ends, which (admittedly, in my own opinion) isn’t very esthetically attractive.

One area where Kruskal’s works better than an algorithm like the recursive backtracker is a maze with two or more disjoint areas, such as a maze constrained to the shape of two or more letters. Essentially, this is the same as multiple different mazes, but with Kruskal’s you could do them all at once, since you’re only dealing with edges and not with direct connectivity.

Please give Kruskal’s a try and share your implementation! Look for ways to tweak the algorithm to produce different results. Have fun!