4.6. Implementing Kruskal's Algorithm: The Union-Find Data Structure


  • The basic idea behind the algorithm is that the graph in question has a fixed population of nodes
  • And the graphs grows over time by inserting edges between certain pairs of nodes.
  • Goal: Maintain a set of connected components of the graph being processed throughout the whole operation
  • With this algorithm, we develop a data structure called the Union-Find structure
  • The Union-Find data structure stores a representation of the components in a way that supports fast searching and updating
  • This data structure is used to efficiently implement Kruskal's Algorithm
As each edge e = (u,w) is considered
Find the connected components containing v and w.

If these components are different:
Then there is no path from v to w, so e should be added to the structure

Elif they're the same:

Then there's a path from v to w, so e should be omitted.


The Problem

  • Operations:
    • Find(u) : returns name of set containing u
      • It can be used to find if two nodes u and v are in the same set by simply checking if Find(u) = Finf(v)
      • Goal implementation: O(log n)
    • Union(A, B) : merges sets A and B into one set
      • Goal implementation: O(log n)
    • MakeUnionFind(S) : returns the Union-Find data structure on set S where all elements are in separate sets.
      • Example of such a set: A connected component with no edges
      • Goal Implementation: O(n), n = |S|
  • The sets obtained will be the connected components of the graph
  • So, for a given node u, Find(u) returns the name of the component containing u
  • To add an edge (u,v) we first test if Find(u) = Find(v)
  • If Find(u) ≠ Find(v), then Union(Find(u), Find(v)) merges the two components into one
  • The Union-Find data structure maintains components of a graph only when we add edges, not when we delete them
  • The “name of the set” returned by Union-Find will be arbitrary


A simple Data Structure for Union-Find

»»»»»»»»> Maintain an Array Component of size n that contains the name of the set currently containing each element

Let S = {1,…,n}

Component[s] = the name of set containing s

To implement MakeUnionFind(S) :
Set up an array and initialize it to Component[s] = s for all s in S

With this operation, Find(v) is done in O(1) time

But Union(A,B) can take O(n) time with this operation(–>We have to update the values of Component[s] for all elements in A and B)

To improve on this bound of Union(A,B):

Maintain the list of elements in each set

Choose the name for the union to be the name of one of the sets, so we only update Component[s] for s in the other set

If the other set is big, name the union to be the name of that set and update Component[s] for s in the smaller set

Improvements:
Maintain an Array Size of length n

Size[A] = the size of A

So when a Union(A,B) operation is needed, we use the name of the larger set for the union and update Component for only the smaller set

However, we still have a O(n) time even after the improvements

So, overall, Find(u) takes O(1) time,MakeUnionFind(A,B) takes O(n) time and any sequence of k Union operations takes at most O(klogk) time with this simple implementation of the algorithm.


A Better Implementation for Union-Find


  • Instead of using an array, we now use a pointer-based implementation
  • With this implementation:
We maintain a set S of size n

Unions keep the name of the largest set

A Union operation takes O(1) time

MakeUnionFind(S) takes O(n) time

Find(u) takes O(logn) time

Implementing Kruskal's Algorithm

T = {}
foreach (u ∈ V) make a set containing singleton u
for i = 1 to m
(u,v) = ei
if (u and v are in different sets)
T = T ∪ {ei}
merge the sets containing u and v

return T

( The above algorithm, is from the class discussion)

Kruskal's algorithm implemented as above runs in O(mlogn) time.

This section was also interesting, although it contained some long proofs. I give it an 8/10

courses/cs211/winter2012/journals/jeanpaul/chapterfour_sectionvi.txt · Last modified: 2012/03/01 03:54 by mugabej
CC Attribution-Noncommercial-Share Alike 4.0 International
Driven by DokuWiki Recent changes RSS feed Valid CSS Valid XHTML 1.0