Disjoint sets

Jyothi
3 min readApr 11, 2022

This is one important concept for grouping problems. I read all, then started doing the first problem and confused like how to do it.

This concept is related to graphs or not graphs but there are a set of nodes or vertices where some vertices are grouped and some or not, means some are not joined or some are disjoined :). So to find these disjoint nodes or disjoint vertices, this disjoint sets data structure is useful.

This disjoint sets data structure consists of an array elements (root array) where each element corresponding to each vertex. This array element either points to itself or points to parent vertex or points to root vertex. This implementation can also consists of rank array indicating the rank of each vertex. This is used in optimization.

There are 4 functions:

  • Initialization: Each array element in root array is initialized to the corresponding vertex value. Each element in rank array is initialized to 1. Pseudo code:
for (ii=0; ii<num; ii++){
root[ii]=ii;
rnk[ii]=1;
}
  • Find(): There are different implementations of it. Below is the optimized version. The purpose of this function is to return the root node of a given vertex.
int find(int *rt, int x){
int rx = rt[x];
if (rx == x) return x;
return (rt[x] = find(rt, rt[x]));
}

This above function tries to find the root/parent of the vertex and also update to the root value.

  • Union(): There are different implementations to this function. Below is the optimized version. The purpose of this function is to join the given vertices (x,y) as making one vertex as another vertex’s parent and setting the root of the node to the corresponding vertex:
void union(int *rt, int *rnk, int x, int y){
int rx = find(rt,x);
int ry = find(rt,y);
if (rx != ry){
if(rnk[rx] > rnk[ry])
rt[ry]=rx;
else if (rnk[ry] > rnk[rx])
rt[rx] = ry;
else{
rt[ry]=rx;
rnk[rx] +=1;
}
}
return;
}
  • Connected(): The purpose of this function is to check if the given vertices are connected (or joined or disjoined).
bool connected(int x, int y) {
return find(x) == find(y);
}

Eg: (0,4),(1,5),(4,12),(5,12) are given.

(0,4)==> setting root[4] to 0

(1,5)==> setting root[5] to 1

(4,12) ==> setting root[12] to 0

(6,11)==> setting root[11] to 6

(6,12)==> setting root[6] to 0,

(11,12)==> setting root[11] to 0

Number of Provinces

Below is the problem stmt:

There are n cities. Some of them are connected, while some are not. If city a is connected directly with city b, and city b is connected directly with city c, then city a is connected indirectly with city c.

A province is a group of directly or indirectly connected cities and no other cities outside of the group.

You are given an n x n matrix isConnected where isConnected[i][j] = 1 if the ith city and the jth city are directly connected, and isConnected[i][j] = 0 otherwise.

Return the total number of provinces.

Even though I learned the above disjoint sets concept, again I was thinking how to find. This problem is to check how many disjoint sets are there :)

Based on the 2 d array, frame a disjoint set and then check how many disjoint sets are there (with root values). Below is the solution:\

int find(int *rt, int x){
int rx = rt[x];
if (rx == x) return x;
return (rt[x] = find(rt, rt[x]));
}
void unin(int *rt, int *rnk, int x, int y){
int rx = find(rt,x);
int ry = find(rt, y);
if (rx != ry){
/* if rank of rootX is more, set the root[Y] to rootX */
if(rnk[rx] > rnk[ry])
rt[ry]=rx;
/* if rank of rootY is more, set the root[X] to rootY */
else if (rnk[ry] > rnk[rx])
rt[rx] = ry;
/* if both ranks same, set root[Y] to root X, then incrementing rank of root X */
else{
rt[ry]=rx;
rnk[rx] +=1;
}
}
return;
}

int findCircleNum(int** isConnected, int cs, int* isConnectedColSize){
int np = 0,grp;
int dsra[cs], rnk[cs];
int ii,jj;
for (ii=0; ii<cs; ii++){
dsra[ii]=ii;
rnk[ii]=1;
}
for(ii=0; ii<cs; ii++){
for(jj=0; jj<isConnectedColSize[ii]; jj++){
if (isConnected[ii][jj]) {
unin(dsra,rnk,ii,jj);
}
}
}
/* check how many root nodes are there,
each root node can be considered as a province.
With the above implementation, for each group , there will be only one root node */
for (ii=0; ii<cs; ii++){
if (dsra[ii] == ii)
np++;
}
return np;
}

Please give your comments.

--

--