Skip to content
← All posts

Count Sub Islands: Grid DFS with Cross-Reference

5 min read
leetcodeproblemmediummatrixgraph

You are given two m x n binary matrices grid1 and grid2. An island is a group of 1s connected 4-directionally (up, down, left, right). An island in grid2 is considered a sub-island if every cell in that island also corresponds to a 1 in grid1. Return the number of islands in grid2 that are sub-islands.

This is LeetCode 1905: Count Sub Islands, and it builds directly on the Number of Islands pattern. The twist: you are not just finding islands, you are validating each one against a second grid.

grid1grid211100011110000010000110111110000111010001000011011
Grid1 (left) shows the reference islands. Grid2 (right) shows green sub-islands (all cells overlap grid1) and one red non-sub-island at (2,1) where grid1 has water. Answer: 3 sub-islands.

Why this problem matters

Count Sub Islands teaches you to combine two fundamental techniques: grid DFS and cross-referencing external data during traversal. In Number of Islands, you explore and mark. Here, you explore, mark, and validate against a second source of truth at every step.

This cross-reference pattern shows up often in real systems. Think of validating user permissions while traversing a resource graph, or checking a cache while walking a data structure. The core idea is the same: traverse one structure, but consult another to make decisions along the way.

The approach: DFS with cross-reference

The algorithm follows three steps:

  1. Scan every cell in grid2. When you find an unvisited 1, you have found an island.
  2. Run DFS from that cell, visiting all connected 1s in grid2. At each cell, check whether grid1 also has a 1 at that position. If any cell fails this check, the island is not a sub-island.
  3. After the DFS finishes, if every cell passed the check, increment your sub-island count.

The critical detail: even when you discover a cell where grid1 has a 0, you must continue the DFS to completion. If you stop early, unvisited cells from that island could be mistakenly counted as part of a new island later.

Do not return early from DFS when you find a mismatch. You must visit every cell in the island to mark them all as visited, even if the island is already disqualified. Early termination causes incorrect island counting.

def count_sub_islands(grid1: list[list[int]], grid2: list[list[int]]) -> int:
    rows, cols = len(grid1), len(grid1[0])

    def dfs(r, c):
        if r < 0 or r >= rows or c < 0 or c >= cols:
            return True
        if grid2[r][c] != 1:
            return True

        grid2[r][c] = 0
        is_sub = grid1[r][c] == 1

        for dr, dc in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
            if not dfs(r + dr, c + dc):
                is_sub = False

        return is_sub

    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid2[r][c] == 1:
                if dfs(r, c):
                    count += 1

    return count

Notice how is_sub is set to False when grid1[r][c] != 1, but the DFS keeps going. Each recursive call returns whether its subtree was fully valid, and we combine results with an and-like pattern. The trick is to avoid short-circuit evaluation. Writing is_sub = dfs(...) and is_sub would short-circuit and skip DFS calls when is_sub is already False. Instead, we call DFS first, then update is_sub with the result.

The order matters: if not dfs(r + dr, c + dc): is_sub = False ensures the DFS call always runs. If you wrote is_sub = is_sub and dfs(...), Python's short-circuit evaluation would skip the DFS when is_sub is already False, leaving cells unvisited.

Initial state: two grids, land cells highlighted

grid1grid211101100001100101100010001110010

We scan grid2 for islands, cross-referencing each cell against grid1. sub_islands = 0.

Step 1: Scan finds (0,0) in grid2. Start DFS. grid1[0][0] = 1. Valid so far.

grid1grid211101100001100101100010001110010

is_sub = True. DFS explores from (0,0).

Step 2: DFS visits (0,1). grid1[0][1] = 1. Still valid.

grid1grid211101100001100101100010001110010

is_sub remains True. Both cells match grid1.

Step 3: DFS visits (1,1). grid1[1][1] = 1. Still valid.

grid1grid211101100001100101100010001110010

is_sub remains True. All three cells exist in grid1.

Step 4: No more neighbors. All cells matched grid1. Sub-island confirmed!

grid1grid211101100001100101100010001110010

sub_islands = 1. Island A is a sub-island (green).

Step 5: Scan finds (2,1) in grid2. Start DFS. grid1[2][1] = 0. Mismatch!

grid1grid211101100001100101100010001110010

is_sub = False. But we must finish the DFS to mark all cells visited.

Step 6: DFS visits (2,2). grid1[2][2] = 1, but island is already disqualified.

grid1grid211101100001100101100010001110010

is_sub stays False. One bad cell ruins the whole island.

Step 7: DFS visits (2,3) and (3,2). Finishing the exploration.

grid1grid211101100001100101100010001110010

is_sub stays False. We must visit every cell so they are not counted again later.

Step 8: Island B fully explored. NOT a sub-island (red). Final answer: 1.

grid1grid211101100001100101100010001110010

sub_islands = 1. Only Island A qualified. Done!

Complexity analysis

ApproachTimeSpace
DFSO(m * n)O(m * n)

Time is O(m * n) where m is the number of rows and n is the number of columns. Every cell in grid2 is visited at most once by the outer scan loop and at most once by a DFS call. Checking grid1 at each cell is O(1).

Space is O(m * n) in the worst case for the recursion call stack. If the entire grid is land, the DFS recurses through every cell. You can reduce this by using BFS with a queue, which limits the queue size to O(min(m, n)), but the recursive DFS is cleaner for interviews.

Building blocks

This problem combines two reusable patterns:

1. Grid DFS flood fill. The same explore-and-mark pattern from Number of Islands. You scan for unvisited land, run DFS to visit the entire connected component, and mark cells as visited by setting them to 0.

2. Cross-reference validation during traversal. At each cell, you consult a second data source (grid1) to validate the current cell. You accumulate a boolean result across the entire traversal. This is the new piece that Count Sub Islands adds on top of the basic flood fill.

The combination is what makes this a medium-difficulty problem. Each piece alone is simple. Together, they require careful handling of the short-circuit issue and the "must finish DFS even when disqualified" constraint.

Edge cases

  • grid2 has no islands: every cell is 0. Return 0.
  • Every island in grid2 is a sub-island: all land cells in grid2 also exist in grid1. Return the number of islands in grid2.
  • No sub-islands: every island in grid2 has at least one cell where grid1 is 0. Return 0.
  • Single cell islands: a 1 in grid2 at position (r, c) is a sub-island if and only if grid1[r][c] == 1.
  • grid2 is a subset of grid1: if grid2 has 1s only where grid1 does (but not necessarily all of them), every island in grid2 is a sub-island.
  • Grids are all 1s: both grids are entirely land. grid2 has one island, grid1 contains it. Return 1.

From understanding to recall

You understand the algorithm now: scan grid2, DFS each island, cross-reference against grid1, avoid short-circuit evaluation. But writing it correctly under pressure requires recalling the details. Which order do you check is_sub? How do you avoid the short-circuit bug? Do you return early or keep going?

These are exactly the kinds of details that spaced repetition drills into permanent memory. You practice writing the DFS function from scratch, and each time you reinforce the pattern: always call DFS before updating the boolean, never bail out early. After a few rounds, the correct structure flows out automatically.

Related posts