Count Sub Islands: Grid DFS with Cross-Reference
You are given two m x n binary matrices grid1 and grid2. An island is a group of 1s connected 4-directionally (up, down, left, right). An island in grid2 is considered a sub-island if every cell in that island also corresponds to a 1 in grid1. Return the number of islands in grid2 that are sub-islands.
This is LeetCode 1905: Count Sub Islands, and it builds directly on the Number of Islands pattern. The twist: you are not just finding islands, you are validating each one against a second grid.
Why this problem matters
Count Sub Islands teaches you to combine two fundamental techniques: grid DFS and cross-referencing external data during traversal. In Number of Islands, you explore and mark. Here, you explore, mark, and validate against a second source of truth at every step.
This cross-reference pattern shows up often in real systems. Think of validating user permissions while traversing a resource graph, or checking a cache while walking a data structure. The core idea is the same: traverse one structure, but consult another to make decisions along the way.
The approach: DFS with cross-reference
The algorithm follows three steps:
- Scan every cell in
grid2. When you find an unvisited1, you have found an island. - Run DFS from that cell, visiting all connected
1s ingrid2. At each cell, check whethergrid1also has a1at that position. If any cell fails this check, the island is not a sub-island. - After the DFS finishes, if every cell passed the check, increment your sub-island count.
The critical detail: even when you discover a cell where grid1 has a 0, you must continue the DFS to completion. If you stop early, unvisited cells from that island could be mistakenly counted as part of a new island later.
Do not return early from DFS when you find a mismatch. You must visit every cell in the island to mark them all as visited, even if the island is already disqualified. Early termination causes incorrect island counting.
def count_sub_islands(grid1: list[list[int]], grid2: list[list[int]]) -> int:
rows, cols = len(grid1), len(grid1[0])
def dfs(r, c):
if r < 0 or r >= rows or c < 0 or c >= cols:
return True
if grid2[r][c] != 1:
return True
grid2[r][c] = 0
is_sub = grid1[r][c] == 1
for dr, dc in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
if not dfs(r + dr, c + dc):
is_sub = False
return is_sub
count = 0
for r in range(rows):
for c in range(cols):
if grid2[r][c] == 1:
if dfs(r, c):
count += 1
return count
Notice how is_sub is set to False when grid1[r][c] != 1, but the DFS keeps going. Each recursive call returns whether its subtree was fully valid, and we combine results with an and-like pattern. The trick is to avoid short-circuit evaluation. Writing is_sub = dfs(...) and is_sub would short-circuit and skip DFS calls when is_sub is already False. Instead, we call DFS first, then update is_sub with the result.
The order matters: if not dfs(r + dr, c + dc): is_sub = False ensures the DFS call always runs. If you wrote is_sub = is_sub and dfs(...), Python's short-circuit evaluation would skip the DFS when is_sub is already False, leaving cells unvisited.
Initial state: two grids, land cells highlighted
We scan grid2 for islands, cross-referencing each cell against grid1. sub_islands = 0.
Step 1: Scan finds (0,0) in grid2. Start DFS. grid1[0][0] = 1. Valid so far.
is_sub = True. DFS explores from (0,0).
Step 2: DFS visits (0,1). grid1[0][1] = 1. Still valid.
is_sub remains True. Both cells match grid1.
Step 3: DFS visits (1,1). grid1[1][1] = 1. Still valid.
is_sub remains True. All three cells exist in grid1.
Step 4: No more neighbors. All cells matched grid1. Sub-island confirmed!
sub_islands = 1. Island A is a sub-island (green).
Step 5: Scan finds (2,1) in grid2. Start DFS. grid1[2][1] = 0. Mismatch!
is_sub = False. But we must finish the DFS to mark all cells visited.
Step 6: DFS visits (2,2). grid1[2][2] = 1, but island is already disqualified.
is_sub stays False. One bad cell ruins the whole island.
Step 7: DFS visits (2,3) and (3,2). Finishing the exploration.
is_sub stays False. We must visit every cell so they are not counted again later.
Step 8: Island B fully explored. NOT a sub-island (red). Final answer: 1.
sub_islands = 1. Only Island A qualified. Done!
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| DFS | O(m * n) | O(m * n) |
Time is O(m * n) where m is the number of rows and n is the number of columns. Every cell in grid2 is visited at most once by the outer scan loop and at most once by a DFS call. Checking grid1 at each cell is O(1).
Space is O(m * n) in the worst case for the recursion call stack. If the entire grid is land, the DFS recurses through every cell. You can reduce this by using BFS with a queue, which limits the queue size to O(min(m, n)), but the recursive DFS is cleaner for interviews.
Building blocks
This problem combines two reusable patterns:
1. Grid DFS flood fill. The same explore-and-mark pattern from Number of Islands. You scan for unvisited land, run DFS to visit the entire connected component, and mark cells as visited by setting them to 0.
2. Cross-reference validation during traversal. At each cell, you consult a second data source (grid1) to validate the current cell. You accumulate a boolean result across the entire traversal. This is the new piece that Count Sub Islands adds on top of the basic flood fill.
The combination is what makes this a medium-difficulty problem. Each piece alone is simple. Together, they require careful handling of the short-circuit issue and the "must finish DFS even when disqualified" constraint.
Edge cases
- grid2 has no islands: every cell is
0. Return 0. - Every island in grid2 is a sub-island: all land cells in
grid2also exist ingrid1. Return the number of islands ingrid2. - No sub-islands: every island in
grid2has at least one cell wheregrid1is0. Return 0. - Single cell islands: a
1ingrid2at position (r, c) is a sub-island if and only ifgrid1[r][c] == 1. - grid2 is a subset of grid1: if
grid2has1s only wheregrid1does (but not necessarily all of them), every island ingrid2is a sub-island. - Grids are all
1s: both grids are entirely land. grid2 has one island, grid1 contains it. Return 1.
From understanding to recall
You understand the algorithm now: scan grid2, DFS each island, cross-reference against grid1, avoid short-circuit evaluation. But writing it correctly under pressure requires recalling the details. Which order do you check is_sub? How do you avoid the short-circuit bug? Do you return early or keep going?
These are exactly the kinds of details that spaced repetition drills into permanent memory. You practice writing the DFS function from scratch, and each time you reinforce the pattern: always call DFS before updating the boolean, never bail out early. After a few rounds, the correct structure flows out automatically.
Related posts
- Number of Islands - the foundational grid DFS problem
- Max Area of Island - same grid DFS pattern with an accumulator
- Flood Fill - another grid DFS exploration problem