Rank Transform of a Matrix: Union-Find with Greedy Ranking
Given an m x n matrix, you need to return a new matrix where each cell contains the rank of the original value. Ranks start at 1, and within the same row or column, a larger value must have a strictly higher rank. If two cells share the same value and the same row or column, they must receive the same rank. The goal is to make ranks as small as possible.
This is LeetCode 1632: Rank Transform of a Matrix, one of the harder union-find problems. The twist compared to simpler ranking problems is that equal values in the same row or column create constraints that chain together, and union-find is the cleanest way to handle those chains.
The problem
You are given an m x n integer matrix. Return a matrix answer where answer[i][j] is the rank of matrix[i][j].
The rank rules:
- Ranks are positive integers starting from 1.
- If two elements are in the same row or column, the element with the larger value gets a higher rank.
- If two elements in the same row or column have the same value, they get the same rank.
- Ranks should be as small as possible while satisfying the above constraints.
The brute force approach
A naive approach would try to assign ranks by repeatedly scanning the matrix. For each cell, look at every other cell in the same row and column, figure out the relative ordering, and assign ranks accordingly. You could model this as a constraint graph and try to resolve it with topological sort or repeated passes.
The problem is that equal values in the same row or column create groups that must share a rank, and those groups can chain across rows and columns. A cell at (0,1) with value 5 might need the same rank as (0,3) with value 5 (same row), which in turn needs the same rank as (2,3) with value 5 (same column). Building and resolving these chains manually is error-prone and slow.
Without a clean grouping mechanism, you end up with O(m * n * (m + n)) work per pass and potentially multiple passes. This gets ugly fast for large matrices.
The key insight
Process cells in ascending order of value. For each value, use union-find to group all cells with that value that share a row or column. Every cell in the same group gets the same rank, which equals 1 + the maximum rank already assigned in any of their rows or columns. This greedy approach guarantees ranks are as small as possible.
The key observations are:
-
Sorted order guarantees correctness. By processing smaller values first, when you assign a rank to a cell, every cell with a smaller value in the same row or column has already been ranked. The new rank just needs to exceed those.
-
Union-find handles same-value chains. If cells (0,1) and (0,3) both have value 5, they share row 0, so they must get the same rank. If (0,3) and (2,3) also both have value 5, they share column 3. Union-find merges all three into one component automatically.
-
One rank per component. Each union-find component of same-value cells gets a single rank: 1 + max of all
row_maxandcol_maxvalues across every cell in the component.
Step-by-step walkthrough
Step 1: Sort all cells by value
Collect every (value, row, col) triple and sort them in ascending order. We process cells from smallest to largest.
Sorted: (1, 1,2), (3, 2,2), (4, 1,1), (6, 0,2), (9, 0,1), (10, 1,0), (12, 2,1), (20, 0,0), (30, 2,0)
Sorting lets us assign ranks greedily: process the smallest values first, guaranteeing ranks stay minimal.
Step 2: Group cells with the same value
Cells with the same value that share a row or column must receive the same rank. Use union-find to group them.
Example: if cells (0,1) and (1,1) both have value 9, they share column 1, so they must be in the same union-find component and receive the same rank.
Union-find groups same-value cells that share a row or column. Each group gets one shared rank.
Step 3: Compute rank for each group
For each union-find component, the rank is 1 + max(row_max[i], col_max[j]) across all cells in the component. This ensures the rank is larger than every previously assigned rank in the same row or column.
Processing value 1 at (1,2): row_max[1]=0, col_max[2]=0. Rank = max(0,0)+1 = 1. Update row_max[1]=1, col_max[2]=1.
Cell (1,2) with value 1 gets rank 1. It is the smallest value, so nothing constrains it.
Step 4: Continue processing in sorted order
Process value 3 at (2,2): row_max[2]=0, col_max[2]=1 (from value 1). Rank = max(0,1)+1 = 2. Then value 4 at (1,1): row_max[1]=1, col_max[1]=0. Rank = max(1,0)+1 = 2.
After: row_max = [0,2,2], col_max = [0,2,2]. Values 6, 9, 10, 12, 20, 30 still pending.
Each rank respects all previously assigned ranks in its row and column.
Step 5: Final rank matrix
After processing all values in sorted order, every cell has its rank. The result satisfies all row and column constraints with the smallest possible ranks.
Final ranks: [[4,3,2],[3,2,1],[5,4,3]]. row_max = [4,3,5], col_max = [5,4,3].
All ranks are as small as possible while satisfying the ordering constraints.
The code
from collections import defaultdict
class Solution:
def matrixRankTransform(self, matrix):
m, n = len(matrix), len(matrix[0])
rank = [[0] * n for _ in range(m)]
cells = []
for i in range(m):
for j in range(n):
cells.append((matrix[i][j], i, j))
cells.sort()
row_max = [0] * m
col_max = [0] * n
groups = defaultdict(list)
for val, i, j in cells:
groups[val].append((i, j))
for val in sorted(groups):
parent = {}
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(a, b):
a, b = find(a), find(b)
if a != b:
parent[a] = b
row_cells = defaultdict(list)
col_cells = defaultdict(list)
for i, j in groups[val]:
parent[(i, j)] = (i, j)
row_cells[i].append((i, j))
col_cells[j].append((i, j))
for cells_list in list(row_cells.values()) + list(col_cells.values()):
for k in range(1, len(cells_list)):
union(cells_list[0], cells_list[k])
components = defaultdict(list)
for i, j in groups[val]:
components[find((i, j))].append((i, j))
for comp_cells in components.values():
max_rank = max(max(row_max[i], col_max[j]) for i, j in comp_cells) + 1
for i, j in comp_cells:
rank[i][j] = max_rank
row_max[i] = max_rank
col_max[j] = max_rank
return rank
The algorithm has three phases per value group.
First, it builds a fresh union-find for only the cells with the current value. Each cell starts as its own parent. Then it unions cells that share a row and cells that share a column. Two cells with value 5 in row 0 get merged. Two cells with value 5 in column 3 get merged. Transitive chains are handled automatically by union-find.
Second, it collects the union-find components. Each component is a set of cells that must all receive the same rank.
Third, it computes the rank for each component. The rank is 1 plus the maximum of row_max[i] and col_max[j] across all cells (i, j) in the component. After assigning the rank, it updates row_max and col_max for every affected row and column.
The union-find is rebuilt from scratch for each distinct value. This is intentional. Cells with value 5 should not be in the same union-find structure as cells with value 3. Each value group is independent, and the only information that carries across groups is the row_max and col_max arrays.
Complexity analysis
| Metric | Value |
|---|---|
| Time | O(m * n * log(m * n) + m * n * alpha(m * n)), effectively O(m * n * log(m * n)). Sorting dominates. |
| Space | O(m * n) for the rank matrix, sorted cells, and union-find structures. |
The sorting step takes O(m * n * log(m * n)). The union-find operations across all value groups process each cell a constant number of times, with each find/union taking nearly O(1) due to path compression. The total work across all groups is O(m * n * alpha(m * n)), which is effectively linear.
The building blocks
Union-find for grouping by shared attributes
The core pattern: for each group of same-value cells, initialize a union-find, then merge cells that share a row or column. This is the same concept used in Accounts Merge (merge accounts sharing an email) and Most Stones Removed (merge stones sharing a row or column).
parent = {}
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(a, b):
a, b = find(a), find(b)
if a != b:
parent[a] = b
The key difference from standard union-find problems is that here you rebuild the union-find for each value group rather than maintaining a single global structure. This keeps the groups isolated so that cells with different values never get merged.
Greedy ranking by sorted order
Processing cells in ascending order of value is a greedy strategy. When you assign a rank to a cell, every smaller value in the same row and column already has its rank locked in. The new rank only needs to be 1 greater than the maximum existing rank in its row and column.
row_max = [0] * m
col_max = [0] * n
for val in sorted(groups):
# ... union-find grouping ...
for comp_cells in components.values():
max_rank = max(max(row_max[i], col_max[j]) for i, j in comp_cells) + 1
for i, j in comp_cells:
rank[i][j] = max_rank
row_max[i] = max_rank
col_max[j] = max_rank
This pattern of "process in sorted order and greedily assign based on accumulated state" shows up in many problems. The row_max and col_max arrays act as a running summary of the constraints imposed by all previously processed values.
Edge cases
- 1x1 matrix: A single cell always gets rank 1. No row or column neighbors to consider.
- All identical values: Every cell in the same row gets unioned together, and every cell in the same column gets unioned together. If the matrix is fully connected through shared rows and columns, all cells get rank 1.
- All distinct values: No unions happen. Each cell gets a rank based purely on
row_maxandcol_max. The problem reduces to a simpler greedy assignment. - Single row or single column: All cells are in the same row (or column), so the ranking is determined by sorting the values. Equal values get the same rank.
- Large matrices: The algorithm handles m, n up to 500 comfortably because sorting O(m * n) elements is the bottleneck, and union-find operations are nearly linear.
- Negative values: The algorithm works with negative values because it only compares relative ordering. Sorting handles negatives correctly.
From understanding to recall
You have traced through the sorted-order processing, the per-value union-find grouping, and the greedy rank assignment. The logic makes sense when you read it step by step. But in an interview, you need to remember to sort first, rebuild the union-find per value group, union by shared rows and columns, compute the component rank from row_max and col_max, and update both arrays afterward. Those are a lot of moving pieces.
The gap between reading a solution and reproducing it under pressure is real. Spaced repetition closes that gap. You practice writing the union-find template, the grouping logic, and the greedy ranking loop from scratch at increasing intervals. After a few rounds, the structure of the solution comes to you automatically. You see "rank transform" and immediately think "sort, group by value, union-find for shared rows/columns, greedy max-rank assignment."
Related posts
- Accounts Merge - Union-find to group accounts sharing an email, the same "merge by shared attribute" pattern
- Redundant Connection - Union-find for cycle detection in a graph, a simpler introduction to the data structure
- Most Stones Removed with Same Row or Column - Union-find on coordinate-based grouping, connecting stones by shared rows and columns
CodeBricks breaks Rank Transform of a Matrix into its union-find and greedy-ranking building blocks, then drills them independently with spaced repetition. You practice find with path compression, per-value grouping, and the row_max/col_max update pattern from scratch until it is automatic. When a matrix ranking problem shows up in your interview, you do not hesitate. You just write it.