Sometimes, instead of working, I like to see what search terms are bringing readers to my blog. The most common search for which healthyalgorithms has been useless is “minimum spanning tree python”. Today, I’ll remedy that.

But first, dear searchers, consider this: why are you searching for minimum spanning tree code in Python? Is it because you have a programming assignment due soon? High-school CS class is voluntary. All college is optional, and many of you are paying to attend. You know what I’m talking about? Perhaps the short motivational comic Time Management for Anarchists is better than some Python code.

Still want to know how to do it? Ok, but I warned you.

I wrote about what a spanning tree is and why you might want one a few months ago, while promoting my wares. But forget all that fancy stuff. If you need to find a plain-old minimum spanning tree, and you like speaking Python, then you want MinimumSpanningTree.py from David Eppstein’s PADS library (Python Algorithms and Data Structures).

PADS doesn’t have an easy_install package that I know of, but for finding MSTs, there are only two files you need: UnionFind.py and MinimumSpanningTree.py. Put these somewhere that Python can find them, like in your working directory.

Python makes Kruskal’s algorithm so short that I’ll just quote Eppstein’s entire MinimumSpanningTree function here:

    def MinimumSpanningTree(G):
        """
        Return the minimum spanning tree of an undirected graph G.
        G should be represented in such a way that G[u][v] gives the
        length of edge u,v, and G[u][v] should always equal G[v][u].
        The tree is returned as a list of edges.
        """

        # Kruskal's algorithm: sort edges by weight, and add them one at a time.
        # We use Kruskal's algorithm, first because it is very simple to
        # implement once UnionFind exists, and second, because the only slow
        # part (the sort) is sped up by being built in to Python.
        subtrees = UnionFind()
        tree = []
        edges = [(G[u][v],u,v) for u in G for v in G[u]]
        edges.sort()
        for W,u,v in edges:
            if subtrees[u] != subtrees[v]:
                tree.append((u,v))
                subtrees.union(u,v)
        return tree
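The function above leans on UnionFind for two things only: `subtrees[u]` looks up a representative for the set containing `u`, and `subtrees.union(u,v)` merges two sets. If you are curious what that involves, here is a minimal stand-in with just those two operations. It is an illustrative sketch, not Eppstein's implementation (it skips union by weight, for one), so use the real UnionFind.py for anything serious.

```python
class UnionFind:
    """Minimal stand-in offering the two operations Kruskal's algorithm uses.

    Illustrative sketch only; this is NOT the UnionFind.py from PADS.
    """

    def __init__(self):
        self.parent = {}  # maps each item to its parent in the forest

    def __getitem__(self, item):
        # An item we have not seen yet starts as the root of its own set.
        if item not in self.parent:
            self.parent[item] = item
            return item
        # Walk up to the root of the item's set.
        root = item
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every item on the path straight at the root.
        while self.parent[item] != root:
            self.parent[item], item = root, self.parent[item]
        return root

    def union(self, *items):
        # Merge the sets containing the given items into one set.
        roots = [self[x] for x in items]
        for r in roots[1:]:
            self.parent[r] = roots[0]


subtrees = UnionFind()
print(subtrees["a"] != subtrees["b"])  # True: a and b start in separate sets
subtrees.union("a", "b")
print(subtrees["a"] == subtrees["b"])  # True: now they share a representative
```

In Kruskal's algorithm, the `subtrees[u] != subtrees[v]` test is what keeps an edge from closing a cycle: an edge is only added when its endpoints are still in different subtrees.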

So, for example, if you have ever had a desire to find the minimum spanning tree of a complete graph with uniformly random edge weights, you could do it like this:

    from random import random
    from MinimumSpanningTree import MinimumSpanningTree as mst

    n = 10
    G = {}
    for u in range(n):
        G[u] = {}

    for u in range(n):
        for v in range(u):
            r = random()
            G[u][v] = r
            G[v][u] = r

    T = mst(G)
    mst_weight = sum([G[u][v] for u,v in T])

We might as well get some beautiful pictures out of this, since it’s not much more work. For the above code, but tweaked so that every point has a random position in the unit square with distances as-the-crow-flies between them, behold.
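The tweak might look something like the sketch below. The function name `random_geometric_graph` and the `points` list are my own labels for illustration, not anything from PADS; the result is the same dict-of-dicts shape that MinimumSpanningTree expects, so it can be handed to `mst(G)` as before.

```python
from math import hypot
from random import random


def random_geometric_graph(n):
    """Return (points, G): n random points in the unit square, and the
    complete graph on them with straight-line distances as edge lengths."""
    points = [(random(), random()) for _ in range(n)]
    G = {u: {} for u in range(n)}
    for u in range(n):
        for v in range(u):
            (x1, y1), (x2, y2) = points[u], points[v]
            d = hypot(x1 - x2, y1 - y2)  # as-the-crow-flies distance
            G[u][v] = d
            G[v][u] = d  # keep the graph symmetric, as the MST code requires
    return points, G


points, G = random_geometric_graph(10)
```

Keeping `points` around is what makes the picture possible: each tree edge (u,v) is drawn as a segment from `points[u]` to `points[v]`.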



For fun times, ask yourself, what if I wanted 2 disjoint spanning trees on this set of points? The minimum cost solution can be very different from the spanning trees you find if you yank out the MST and use Eppstein’s code on the remaining edges.
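If you want to try the yank-it-out approach for yourself, the only extra ingredient is a way to delete the first tree's edges from the graph. Here is one way that might be done; `without_edges` is a hypothetical helper of mine, not part of PADS, and the four-node graph is a hand-made toy so the example runs on its own.

```python
def without_edges(G, edges):
    """Return a copy of the dict-of-dicts graph G with the given
    undirected edges removed. G itself is left untouched."""
    H = {u: dict(nbrs) for u, nbrs in G.items()}
    for u, v in edges:
        H[u].pop(v, None)
        H[v].pop(u, None)
    return H


# Tiny hand-made example: a complete graph on 4 nodes.
weights = {(0, 1): 1, (1, 2): 2, (2, 3): 3, (0, 2): 4, (1, 3): 5, (0, 3): 6}
G = {u: {} for u in range(4)}
for (u, v), w in weights.items():
    G[u][v] = w
    G[v][u] = w

T1 = [(0, 1), (1, 2), (2, 3)]  # the MST of this G, worked out by hand
H = without_edges(G, T1)
remaining = sorted((u, v) for u in H for v in H[u] if u < v)
print(remaining)  # [(0, 2), (0, 3), (1, 3)]

# With PADS on your path, the greedy pair of trees would then be:
#   T1 = mst(G)
#   T2 = mst(without_edges(G, T1))
```

One thing to watch for: if the first tree uses every edge at some vertex, the leftover graph is disconnected, and Kruskal's algorithm will quietly return a forest rather than a second spanning tree.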

p.s. It looks like Aric Hagberg just added this mst code to NetworkX, so if you have the most-most-most recent version of that, maybe you can build up any XGraph and then just say T = networkx.algorithms.mst(G) .