Union-Find in Linux¶
- Date:
June 21, 2024
- Author:
Xavier <xavier_qy@163.com>
What is union-find, and what is it used for?¶
Union-find is a data structure used to handle the merging and queryingof disjoint sets. The primary operations supported by union-find are:
- Initialization: Resetting each element as an individual set, with
each set’s initial parent node pointing to itself.
- Find: Determine which set a particular element belongs to, usually by
returning a “representative element” of that set. This operationis used to check if two elements are in the same set.
Union: Merge two sets into one.
As a data structure used to maintain sets (groups), union-find is commonlyutilized to solve problems related to offline queries, dynamic connectivity,and graph theory. It is also a key component in Kruskal’s algorithm forcomputing the minimum spanning tree, which is crucial in scenarios likenetwork routing. Consequently, union-find is widely referenced. Additionally,union-find has applications in symbolic computation, register allocation,and more.
Space Complexity: O(n), where n is the number of nodes.
Time Complexity: Using path compression can reduce the time complexity ofthe find operation, and usingunionby rank can reduce the time complexityof theunionoperation. These optimizations reduce the average timecomplexity of each find andunionoperation to O(α(n)), where α(n) is theinverse Ackermann function. This can be roughly considered a constant timecomplexity for practical purposes.
This document covers use of the Linux union-find implementation. For moreinformation on the nature and implementation of union-find, see:
- Wikipedia entry on union-find
Linux implementation of union-find¶
Linux’s union-find implementation resides in the file “lib/union_find.c”.To use it, “#include <linux/union_find.h>”.
The union-find data structure is defined as follows:
struct uf_node { struct uf_node *parent; unsigned int rank;};In this structure, parent points to the parent node of the current node.The rank field represents the height of the current tree. During aunionoperation, the tree with the smaller rank is attached under the tree with thelarger rank to maintain balance.
Initializing union-find¶
You can complete the initialization using either static or initializationinterface. Initialize the parent pointer to point to itself and set the rankto 0.Example:
struct uf_node my_node = UF_INIT_NODE(my_node);
or
uf_node_init(&my_node);
Find the Root Node of union-find¶
This operation is mainly used to determine whether two nodes belong to the sameset in the union-find. If they have the same root, they are in the same set.During the find operation, path compression is performed to improve theefficiency of subsequent find operations.Example:
int connected;struct uf_node *root1 = uf_find(&node_1);struct uf_node *root2 = uf_find(&node_2);if (root1 == root2) connected = 1;else connected = 0;
Union Two Sets in union-find¶
Touniontwo sets in the union-find, you first find their respective root nodesand then link the smaller node to the larger node based on the rank of the rootnodes.Example:
uf_union(&node_1, &node_2);