The problem

We want to maintain a set of disjoint sets, in which elements are fixed. Only union is supported, while spliting is not. Sepcifically, we support following three operations.

  • MakeSet(u)
  • Find(u)
  • Union(u,v)

Let Disjoint Set Data Structure: Union-Find - 图1 donote the fixed set of elements, whose size is Disjoint Set Data Structure: Union-Find - 图2. Generally, we want to estimate the total time of a sequence of Disjoint Set Data Structure: Union-Find - 图3 operations, in which the first Disjoint Set Data Structure: Union-Find - 图4 operation is MakeSet for each element.

Linked-list

We can represent each set by a linked list, with each element additionally pointing to its head.
The worst-case time for Union is Disjoint Set Data Structure: Union-Find - 图5#card=math&code=O%28n%29), and we can find a sequence of Disjoint Set Data Structure: Union-Find - 图6 operations which needs Disjoint Set Data Structure: Union-Find - 图7#card=math&code=O%28n%5E2%29) time in total.

a weighted-union heuristic

We can improve it by a little trick. In the Union operation, we can rename the small set (say B) to the larger set (say A), i.e. make all the elements of B point to the head of A. The worst-case time of Union is stilll Disjoint Set Data Structure: Union-Find - 图8#card=math&code=O%28n%29), while its amortized time becomes Disjoint Set Data Structure: Union-Find - 图9#card=math&code=O%28%5Clog%20n%29).

worst-case amortized
MakeSet Disjoint Set Data Structure: Union-Find - 图10#card=math&code=O%281%29) Disjoint Set Data Structure: Union-Find - 图11#card=math&code=O%281%29)
Find Disjoint Set Data Structure: Union-Find - 图12#card=math&code=O%281%29) Disjoint Set Data Structure: Union-Find - 图13#card=math&code=O%281%29)
Union Disjoint Set Data Structure: Union-Find - 图14#card=math&code=O%28n%29) Disjoint Set Data Structure: Union-Find - 图15#card=math&code=O%28%5Clog%20n%29)
Total Disjoint Set Data Structure: Union-Find - 图16#card=math&code=O%28m%5Clog%20n%29) -

Forest

With rooted trees, Union costs Disjoint Set Data Structure: Union-Find - 图17#card=math&code=O%281%29), while Find may cost O(n) in worst-case. It can also be improved to Disjoint Set Data Structure: Union-Find - 图18#card=math&code=O%28%5Clog%20n%29) using the same weighted-union heuristic: Union-by-rank. Actually, Union-by-size works as well. The total time is also Disjoint Set Data Structure: Union-Find - 图19#card=math&code=O%28m%5Clog%20n%29)
Using another heuristic “path compression”, we can amortize the time of Find and Union to Disjoint Set Data Structure: Union-Find - 图20)#card=math&code=O%28%5Calpha%28n%29%29), which results Disjoint Set Data Structure: Union-Find - 图21)#card=math&code=O%28m%5Calpha%28n%29%29) time in total.

no heuristic weighted-union both(amortized)
MakeSet Disjoint Set Data Structure: Union-Find - 图22#card=math&code=O%281%29) Disjoint Set Data Structure: Union-Find - 图23#card=math&code=O%281%29) Disjoint Set Data Structure: Union-Find - 图24#card=math&code=O%281%29)
Find Disjoint Set Data Structure: Union-Find - 图25#card=math&code=O%28n%29) Disjoint Set Data Structure: Union-Find - 图26#card=math&code=O%28%5Clog%20n%29) Disjoint Set Data Structure: Union-Find - 图27)#card=math&code=O%28%5Calpha%28n%29%29)
Union Disjoint Set Data Structure: Union-Find - 图28#card=math&code=O%281%29) Disjoint Set Data Structure: Union-Find - 图29#card=math&code=O%281%29) Disjoint Set Data Structure: Union-Find - 图30)#card=math&code=O%28%5Calpha%28n%29%29)
Total Disjoint Set Data Structure: Union-Find - 图31#card=math&code=O%28mn%29) Disjoint Set Data Structure: Union-Find - 图32#card=math&code=O%28m%5Clog%20n%29) Disjoint Set Data Structure: Union-Find - 图33)#card=math&code=O%28m%5Calpha%28n%29%29)