Rank Transform of a Matrix: Union-Find with Greedy Ranking

June 2, 20268 min read

leetcodeproblemhardmatrixunion-findsorting

Given an m x n matrix, you need to return a new matrix where each cell contains the rank of the original value. Ranks start at 1, and within the same row or column, a larger value must have a strictly higher rank. If two cells share the same value and the same row or column, they must receive the same rank. The goal is to make ranks as small as possible.

This is LeetCode 1632: Rank Transform of a Matrix, one of the harder union-find problems. The twist compared to simpler ranking problems is that equal values in the same row or column create constraints that chain together, and union-find is the cleanest way to handle those chains.

The problem

You are given an m x n integer matrix. Return a matrix answer where answer[i][j] is the rank of matrix[i][j].

The rank rules:

Ranks are positive integers starting from 1.
If two elements are in the same row or column, the element with the larger value gets a higher rank.
If two elements in the same row or column have the same value, they get the same rank.
Ranks should be as small as possible while satisfying the above constraints.

Each cell gets a rank starting from 1. Larger values in the same row or column get higher ranks. Ranks are as small as possible.

The brute force approach

A naive approach would try to assign ranks by repeatedly scanning the matrix. For each cell, look at every other cell in the same row and column, figure out the relative ordering, and assign ranks accordingly. You could model this as a constraint graph and try to resolve it with topological sort or repeated passes.

The problem is that equal values in the same row or column create groups that must share a rank, and those groups can chain across rows and columns. A cell at (0,1) with value 5 might need the same rank as (0,3) with value 5 (same row), which in turn needs the same rank as (2,3) with value 5 (same column). Building and resolving these chains manually is error-prone and slow.

Without a clean grouping mechanism, you end up with O(m * n * (m + n)) work per pass and potentially multiple passes. This gets ugly fast for large matrices.

The key insight

Process cells in ascending order of value. For each value, use union-find to group all cells with that value that share a row or column. Every cell in the same group gets the same rank, which equals 1 + the maximum rank already assigned in any of their rows or columns. This greedy approach guarantees ranks are as small as possible.

The key observations are:

Sorted order guarantees correctness. By processing smaller values first, when you assign a rank to a cell, every cell with a smaller value in the same row or column has already been ranked. The new rank just needs to exceed those.
Union-find handles same-value chains. If cells (0,1) and (0,3) both have value 5, they share row 0, so they must get the same rank. If (0,3) and (2,3) also both have value 5, they share column 3. Union-find merges all three into one component automatically.
One rank per component. Each union-find component of same-value cells gets a single rank: 1 + max of all row_max and col_max values across every cell in the component.

Step-by-step walkthrough

Step 1: Sort all cells by value

Collect every (value, row, col) triple and sort them in ascending order. We process cells from smallest to largest.

Sorted: (1, 1,2), (3, 2,2), (4, 1,1), (6, 0,2), (9, 0,1), (10, 1,0), (12, 2,1), (20, 0,0), (30, 2,0)

Sorting lets us assign ranks greedily: process the smallest values first, guaranteeing ranks stay minimal.

Step 2: Group cells with the same value

Cells with the same value that share a row or column must receive the same rank. Use union-find to group them.

Example: if cells (0,1) and (1,1) both have value 9, they share column 1, so they must be in the same union-find component and receive the same rank.

Union-find groups same-value cells that share a row or column. Each group gets one shared rank.

Step 3: Compute rank for each group

For each union-find component, the rank is 1 + max(row_max[i], col_max[j]) across all cells in the component. This ensures the rank is larger than every previously assigned rank in the same row or column.

Processing value 1 at (1,2): row_max[1]=0, col_max[2]=0. Rank = max(0,0)+1 = 1. Update row_max[1]=1, col_max[2]=1.

Cell (1,2) with value 1 gets rank 1. It is the smallest value, so nothing constrains it.

Step 4: Continue processing in sorted order

Process value 3 at (2,2): row_max[2]=0, col_max[2]=1 (from value 1). Rank = max(0,1)+1 = 2. Then value 4 at (1,1): row_max[1]=1, col_max[1]=0. Rank = max(1,0)+1 = 2.

After: row_max = [0,2,2], col_max = [0,2,2]. Values 6, 9, 10, 12, 20, 30 still pending.

Each rank respects all previously assigned ranks in its row and column.

Step 5: Final rank matrix

After processing all values in sorted order, every cell has its rank. The result satisfies all row and column constraints with the smallest possible ranks.

Final ranks: [[4,3,2],[3,2,1],[5,4,3]]. row_max = [4,3,5], col_max = [5,4,3].

All ranks are as small as possible while satisfying the ordering constraints.

Practice union-find ranking with spaced repetition

The code

from collections import defaultdict

class Solution:
    def matrixRankTransform(self, matrix):
        m, n = len(matrix), len(matrix[0])
        rank = [[0] * n for _ in range(m)]

        cells = []
        for i in range(m):
            for j in range(n):
                cells.append((matrix[i][j], i, j))
        cells.sort()

        row_max = [0] * m
        col_max = [0] * n

        groups = defaultdict(list)
        for val, i, j in cells:
            groups[val].append((i, j))

        for val in sorted(groups):
            parent = {}

            def find(x):
                while parent[x] != x:
                    parent[x] = parent[parent[x]]
                    x = parent[x]
                return x

            def union(a, b):
                a, b = find(a), find(b)
                if a != b:
                    parent[a] = b

            row_cells = defaultdict(list)
            col_cells = defaultdict(list)

            for i, j in groups[val]:
                parent[(i, j)] = (i, j)
                row_cells[i].append((i, j))
                col_cells[j].append((i, j))

            for cells_list in list(row_cells.values()) + list(col_cells.values()):
                for k in range(1, len(cells_list)):
                    union(cells_list[0], cells_list[k])

            components = defaultdict(list)
            for i, j in groups[val]:
                components[find((i, j))].append((i, j))

            for comp_cells in components.values():
                max_rank = max(max(row_max[i], col_max[j]) for i, j in comp_cells) + 1
                for i, j in comp_cells:
                    rank[i][j] = max_rank
                    row_max[i] = max_rank
                    col_max[j] = max_rank

        return rank

The algorithm has three phases per value group.

First, it builds a fresh union-find for only the cells with the current value. Each cell starts as its own parent. Then it unions cells that share a row and cells that share a column. Two cells with value 5 in row 0 get merged. Two cells with value 5 in column 3 get merged. Transitive chains are handled automatically by union-find.

Second, it collects the union-find components. Each component is a set of cells that must all receive the same rank.

Third, it computes the rank for each component. The rank is 1 plus the maximum of row_max[i] and col_max[j] across all cells (i, j) in the component. After assigning the rank, it updates row_max and col_max for every affected row and column.

The union-find is rebuilt from scratch for each distinct value. This is intentional. Cells with value 5 should not be in the same union-find structure as cells with value 3. Each value group is independent, and the only information that carries across groups is the row_max and col_max arrays.

Complexity analysis

Metric	Value
Time	O(m * n * log(m * n) + m * n * alpha(m * n)), effectively O(m * n * log(m * n)). Sorting dominates.
Space	O(m * n) for the rank matrix, sorted cells, and union-find structures.

The sorting step takes O(m * n * log(m * n)). The union-find operations across all value groups process each cell a constant number of times, with each find/union taking nearly O(1) due to path compression. The total work across all groups is O(m * n * alpha(m * n)), which is effectively linear.

Drill the union-find pattern until it is automatic

The building blocks

Union-find for grouping by shared attributes

The core pattern: for each group of same-value cells, initialize a union-find, then merge cells that share a row or column. This is the same concept used in Accounts Merge (merge accounts sharing an email) and Most Stones Removed (merge stones sharing a row or column).

parent = {}

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(a, b):
    a, b = find(a), find(b)
    if a != b:
        parent[a] = b

The key difference from standard union-find problems is that here you rebuild the union-find for each value group rather than maintaining a single global structure. This keeps the groups isolated so that cells with different values never get merged.

Greedy ranking by sorted order

Processing cells in ascending order of value is a greedy strategy. When you assign a rank to a cell, every smaller value in the same row and column already has its rank locked in. The new rank only needs to be 1 greater than the maximum existing rank in its row and column.

row_max = [0] * m
col_max = [0] * n

for val in sorted(groups):
    # ... union-find grouping ...
    for comp_cells in components.values():
        max_rank = max(max(row_max[i], col_max[j]) for i, j in comp_cells) + 1
        for i, j in comp_cells:
            rank[i][j] = max_rank
            row_max[i] = max_rank
            col_max[j] = max_rank

This pattern of "process in sorted order and greedily assign based on accumulated state" shows up in many problems. The row_max and col_max arrays act as a running summary of the constraints imposed by all previously processed values.

Edge cases

1x1 matrix: A single cell always gets rank 1. No row or column neighbors to consider.
All identical values: Every cell in the same row gets unioned together, and every cell in the same column gets unioned together. If the matrix is fully connected through shared rows and columns, all cells get rank 1.
All distinct values: No unions happen. Each cell gets a rank based purely on row_max and col_max. The problem reduces to a simpler greedy assignment.
Single row or single column: All cells are in the same row (or column), so the ranking is determined by sorting the values. Equal values get the same rank.
Large matrices: The algorithm handles m, n up to 500 comfortably because sorting O(m * n) elements is the bottleneck, and union-find operations are nearly linear.
Negative values: The algorithm works with negative values because it only compares relative ordering. Sorting handles negatives correctly.

From understanding to recall

You have traced through the sorted-order processing, the per-value union-find grouping, and the greedy rank assignment. The logic makes sense when you read it step by step. But in an interview, you need to remember to sort first, rebuild the union-find per value group, union by shared rows and columns, compute the component rank from row_max and col_max, and update both arrays afterward. Those are a lot of moving pieces.

The gap between reading a solution and reproducing it under pressure is real. Spaced repetition closes that gap. You practice writing the union-find template, the grouping logic, and the greedy ranking loop from scratch at increasing intervals. After a few rounds, the structure of the solution comes to you automatically. You see "rank transform" and immediately think "sort, group by value, union-find for shared rows/columns, greedy max-rank assignment."

Build recall with CodeBricks

Accounts Merge - Union-find to group accounts sharing an email, the same "merge by shared attribute" pattern
Redundant Connection - Union-find for cycle detection in a graph, a simpler introduction to the data structure
Most Stones Removed with Same Row or Column - Union-find on coordinate-based grouping, connecting stones by shared rows and columns

CodeBricks breaks Rank Transform of a Matrix into its union-find and greedy-ranking building blocks, then drills them independently with spaced repetition. You practice find with path compression, per-value grouping, and the row_max/col_max update pattern from scratch until it is automatic. When a matrix ranking problem shows up in your interview, you do not hesitate. You just write it.

Try CodeBricks free