★ Union-Find Contender 2: Quick-Union

OK, moving on. Our next contender, quick-union, is known as a lazy approach to solving the union-find problem.

In programming, quick-union is represented by two arrays:

— The first array is the set of items in the graph (size N).

— The second array is a set of integer ids, one for each item in the graph (also size N).
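As a rough illustration of this two-array layout, here is a minimal Python sketch. The item labels and the name ids are made up for the example (ids is used instead of id because id is a Python built-in), and the starting state, where every item points to itself, is an assumption about how the structure is initialised before any union.

```python
# Hypothetical item labels; any N objects would do.
items = ["a", "b", "c", "d", "e"]

# One integer id per item, so this array also has size N.
# Assumed starting state: every item points to itself,
# which makes each item the root of its own one-node tree.
ids = list(range(len(items)))

for index, label in enumerate(items):
    print(label, "points to", items[ids[index]])
```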

The analogy for quick-union is a forest of trees. Each subset (component) is a tree whose nodes all lead, directly or through other nodes, to a single root.

The id[] value of an item is a link to another item: its parent. The parent’s id[] value links to yet another item, and so on. Following this trail of links, we eventually reach an item whose id[] value points to itself. This is the ROOT.

Two items are considered to be in the same component if and only if this process leads them to the same root.

The union(p, q) method will therefore follow the links to find the roots of p and q, and then merge the two components by linking one of these roots to the other. The choice of which root to link beneath the other is arbitrary.

In summary, the interpretation of quick-union is that id[i] is the parent of i.

i      0 1 2 3 4 5 6 7 8 9
id[i]  2 2 2 9 0 3 4 2 9 9

find(p, q) therefore checks if p and q have the same root.
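The root-following check can be sketched in Python as follows. This is only an illustration: the function names root and connected, and the array name ids, are assumptions chosen for the example, and the array is the id[] table shown above.

```python
def root(ids, i):
    """Follow parent links from i until reaching an item that points to itself."""
    while ids[i] != i:
        i = ids[i]
    return i


def connected(ids, p, q):
    """Two items are in the same component if they lead to the same root."""
    return root(ids, p) == root(ids, q)


# The id[] table from the text: roots are 2 and 9.
ids = [2, 2, 2, 9, 0, 3, 4, 2, 9, 9]

print(root(ids, 6))          # 6 -> 4 -> 0 -> 2, so this prints 2
print(connected(ids, 6, 7))  # True: both lead to root 2
print(connected(ids, 5, 7))  # False: 5 leads to root 9, 7 leads to root 2
```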

union(p, q) sets the id of p’s root to the id of q’s root. Connecting 2 with 9, that is union(2, 9), therefore results in the following id[] array:

i      0 1 2 3 4 5 6 7 8 9
id[i]  2 2 9 9 0 3 4 2 9 9

This is why it’s called quick-union: only one value changes!
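To make the single-entry change concrete, here is a small Python sketch of union applied to the table above. The helper names and the convention of linking p’s root beneath q’s root are assumptions chosen to match the example in the text, not the only way to write it.

```python
def root(ids, i):
    """Follow parent links until reaching an item that points to itself."""
    while ids[i] != i:
        i = ids[i]
    return i


def union(ids, p, q):
    """Link p's root beneath q's root (a no-op if they already share a root)."""
    root_p = root(ids, p)
    root_q = root(ids, q)
    ids[root_p] = root_q


ids = [2, 2, 2, 9, 0, 3, 4, 2, 9, 9]
before = list(ids)

union(ids, 2, 9)

# Indices whose entry differs from the original table.
changed = [i for i in range(len(ids)) if ids[i] != before[i]]

print(ids)      # [2, 2, 9, 9, 0, 3, 4, 2, 9, 9]
print(changed)  # [2]
```

After this call every item leads to root 9, even though only the entry for item 2 was rewritten.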

Looking at quick-union, it is clear that the union operation itself is potentially much faster than in quick-find: once the two roots are known, it changes a single array entry instead of up to N of them.

However, there is the possibility that our trees can become very tall. Each union hangs one tree beneath the root of another, so an unlucky sequence of unions can leave a large subset arranged as one long chain (visually, this would be displayed as a long, tall tree). In this scenario, a single find() operation could take linear time, up to N steps, since we’d need to travel a long way to reach the root of a node. A sequence of N such operations could therefore take quadratic time overall.

In this example, to determine if 5 and 7 are connected, our find operation required six steps, because we’ve managed to generate long, tall trees with our quick-union algorithm. When a large number of nodes become connected in this way, it can result in unacceptably slow performance of the find operation.
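The tall-tree problem can be reproduced with a short Python sketch. It is a separate, made-up scenario rather than the example above: the size n = 8, the union order, and the helper names are all assumptions chosen to force the worst case, with union linking p’s root beneath q’s root.

```python
def hops_to_root(ids, i):
    """Count how many links must be followed from i to reach its root."""
    hops = 0
    while ids[i] != i:
        i = ids[i]
        hops += 1
    return hops


def union(ids, p, q):
    """Link p's root beneath q's root."""
    while ids[p] != p:
        p = ids[p]
    while ids[q] != q:
        q = ids[q]
    ids[p] = q


n = 8
ids = list(range(n))  # every item starts as its own root

# Repeatedly hang the tree containing 0 beneath a fresh single-node tree.
# Each union adds one more level above item 0.
for q in range(1, n):
    union(ids, 0, q)

print(ids)                   # [1, 2, 3, 4, 5, 6, 7, 7]
print(hops_to_root(ids, 0))  # 7, i.e. n - 1 links to follow
```

With this union order the tree degenerates into a single chain, so reaching the root from item 0 takes n - 1 steps, which is the linear-time worst case described above.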

Our conclusion is that quick-union is also too slow.

So, a summary of where we are at: Quick-find has flat trees, but it’s too expensive to keep them flat, because union takes N steps. Quick-union is better but has a defect: trees can get tall, meaning find is too expensive, taking up to N steps. For huge datasets, neither would scale efficiently.