Number of Provinces: Union-Find for Connected Components
LeetCode's Number of Provinces (problem 547) asks you to count connected components in an undirected graph. It is rated medium, and it is one of the cleanest introductions to Union-Find you will find. The input is an adjacency matrix, the output is a single integer, and the core algorithm is reusable across dozens of graph problems.
The problem
You are given an n x n matrix isConnected where isConnected[i][j] = 1 means city i and city j are directly connected, and isConnected[i][j] = 0 means they are not. A province is a group of directly or indirectly connected cities. Return the total number of provinces.
Input: isConnected = [[1,1,0],[1,1,0],[0,0,1]]
Output: 2
Cities 0 and 1 are connected to each other, forming one province. City 2 is on its own, forming a second province.
Cities 0 and 1 are directly connected, forming one province. City 2 has no connections to others, forming a second province.
Why Union-Find?
You could solve this with DFS or BFS, and those work fine. But Union-Find is worth learning here because it directly models the operation the problem asks for: "merge two cities into the same group" and "count how many groups remain."
Union-Find (also called Disjoint Set Union) maintains a forest of trees where each tree is one connected component. It supports two operations:
- Find(x): follow parent pointers to the root of x's tree. The root identifies which component x belongs to.
- Union(x, y): merge the trees containing x and y by connecting one root to the other.
With path compression (flattening the tree during Find) and union by rank (attaching the shorter tree under the taller one), both operations run in nearly O(1) amortized time.
DFS also works in O(n^2) time for this problem. Union-Find is not faster here, but it teaches a pattern that generalizes to problems where edges arrive one at a time (like in Kruskal's algorithm or dynamic connectivity).
The solution
def findCircleNum(isConnected):
n = len(isConnected)
parent = list(range(n))
rank = [0] * n
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(x, y):
rx, ry = find(x), find(y)
if rx == ry:
return
if rank[rx] < rank[ry]:
rx, ry = ry, rx
parent[ry] = rx
if rank[rx] == rank[ry]:
rank[rx] += 1
for i in range(n):
for j in range(i + 1, n):
if isConnected[i][j] == 1:
union(i, j)
return len(set(find(i) for i in range(n)))
Let's break this down:
- Initialize
parent[i] = ifor every city. Each city starts as its own root. - Find uses path compression by setting
parent[x] = parent[parent[x]](path halving). This keeps trees flat. - Union merges two components by rank. The shorter tree goes under the taller one. If ranks are equal, the new root's rank increases by 1.
- Scan the upper triangle of the matrix (since
isConnected[i][j] == isConnected[j][i]). For each connection, callunion(i, j). - Count distinct roots by calling
find(i)for every city and counting unique values.
Step-by-step walkthrough
Let's trace Union-Find on isConnected = [[1,1,0],[1,1,0],[0,0,1]]. We only process the upper triangle: pairs (0,1), (0,2), and (1,2).
Initial state: each city is its own parent
Initialize parent = [0, 1, 2], rank = [0, 0, 0]
Every city starts in its own component. We have 3 separate sets.
Process edge (0, 1): isConnected[0][1] = 1
union(0, 1): root(0)=0, root(1)=1. Merge 1 under 0. rank[0] becomes 1.
Cities 0 and 1 are now in the same set. Parent of 1 points to 0.
Process edge (0, 2): isConnected[0][2] = 0
Skip: no connection between city 0 and city 2.
No union needed. The sets remain unchanged.
Process edge (1, 2): isConnected[1][2] = 0
Skip: no connection between city 1 and city 2.
No union needed. Still 2 distinct components.
Count distinct roots
Roots: find(0)=0, find(1)=0, find(2)=2. Unique roots: {0, 2}.
2 unique roots means 2 provinces. Final answer: 2.
Notice how the parent array tells the whole story. After processing all edges, cities that share the same root belong to the same province. Counting unique roots gives the answer.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| DFS / BFS | O(n^2) | O(n) |
| Union-Find (with path compression + union by rank) | O(n^2 * alpha(n)) | O(n) |
Both approaches scan the entire n x n matrix, so the dominant cost is O(n^2). The Union-Find operations themselves are nearly O(1) each thanks to path compression and union by rank. The inverse Ackermann function alpha(n) is effectively constant for any practical input size.
The space is O(n) for the parent and rank arrays (Union-Find) or the visited set (DFS).
The building blocks
This problem is built on one reusable brick:
Union-Find template
parent = list(range(n))
rank = [0] * n
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(x, y):
rx, ry = find(x), find(y)
if rx == ry:
return
if rank[rx] < rank[ry]:
rx, ry = ry, rx
parent[ry] = rx
if rank[rx] == rank[ry]:
rank[rx] += 1
This template appears in many problems with different wrappers:
- Number of Provinces: union on matrix entries, count roots
- Number of Islands (Union-Find variant): union adjacent land cells
- Redundant Connection: union edges and detect when a union would create a cycle
- Accounts Merge: union email addresses that share an account
The find and union functions stay the same. What changes is how you iterate over the input and what you do with the result.
If you can write the Union-Find template from memory, you can handle any problem that asks "how many connected components?" or "are these two nodes in the same group?" The template is about 15 lines and covers path compression and union by rank.
Edge cases
Before submitting, make sure your solution handles:
- All cities connected: a single province. Every union merges into one tree. The answer is 1.
- No cities connected: each city is its own province. No unions happen. The answer is n.
- Self-loops in the matrix:
isConnected[i][i]is always 1. Your loop should skip these sinceunion(i, i)is a no-op. Iteratingjfromi + 1avoids them naturally. - Single city:
n = 1, the answer is 1. The loop body never executes. - Two cities, connected:
isConnected = [[1,1],[1,1]], answer is 1. One union, one province. - Two cities, not connected:
isConnected = [[1,0],[0,1]], answer is 2. No unions.
The Union-Find solution handles all of these without special cases.
From understanding to recall
You have seen how Union-Find works for Number of Provinces. The template is clean: initialize parents, find with path compression, union by rank, count roots. But can you write it from scratch in an interview next month?
The tricky parts are small but easy to forget under pressure. The path compression line (parent[x] = parent[parent[x]]) is just one line, but getting it wrong means your find runs in O(n) instead of near-O(1). The rank comparison in union is easy to mix up. And remembering to scan only the upper triangle of the matrix is a detail that saves you from double-counting.
Spaced repetition is designed for exactly this. You write the Union-Find template from scratch at increasing intervals. Each time, it gets faster. After a few repetitions, the template is in your long-term memory and you can apply it to any connected-component problem without hesitation.
Related posts
- Number of Islands - The grid version of "count connected components," solved with DFS flood fill instead of Union-Find
- Course Schedule - Another graph problem, but with directed edges and cycle detection using topological sort
- Clone Graph - Graph traversal with BFS/DFS where you need to reconstruct the entire structure, not just count components
CodeBricks breaks Union-Find into its reusable pieces and drills them individually. You practice the find and union functions from scratch, building the muscle memory so that when you see Number of Provinces or any connected-component problem in an interview, the code flows without hesitation.