Count Square Submatrices with All Ones: DP on a Grid

May 28, 20266 min read

leetcodeproblemmediumarraysdynamic-programmingmatrix

You are given an m x n binary matrix of ones and zeros. Return the total number of square submatrices that contain all ones.

This is LeetCode 1277: Count Square Submatrices with All Ones. If you have already solved Maximal Square (LeetCode 221), you will find that this problem uses the exact same DP recurrence. The only difference is what you do with the DP table at the end: Maximal Square asks for the max value, while this problem asks for the sum.

A 3x4 binary matrix and its DP table. Each dp[i][j] is the side length of the largest square ending at that cell. Summing all dp values gives 15, the total count of all-ones square submatrices.

Why this problem matters

This problem reinforces one of the most reusable DP patterns on a 2D grid: define dp[i][j] as the size of the largest square with its bottom-right corner at (i, j), then fill it using the min-of-three-neighbors recurrence. Once you have that table, you can answer different questions just by changing the aggregation step. That flexibility is what makes this pattern worth drilling.

The brute force

The naive approach tries every possible square in the matrix. For each cell as a top-left corner, expand outward to increasing square sizes and verify that every cell inside is a 1. This runs in O(m * n * min(m, n)^2) time, which is too slow for the constraints (m, n up to 300).

There is a much better way. Instead of checking squares from the outside, you build up square sizes from the inside using dynamic programming.

The key insight: dp[i][j] tells you the side length of the largest all-ones square whose bottom-right corner sits at (i, j). But it also tells you the count of all-ones squares ending at that corner. If dp[i][j] = 3, there is a 1x1, a 2x2, and a 3x3 square ending there. That is 3 squares total.

The DP recurrence

Define dp[i][j] as the side length of the largest all-ones square whose bottom-right corner is at (i, j).

If matrix[i][j] == 0, then dp[i][j] = 0. No square can end at a zero cell.

If matrix[i][j] == 1:

dp[i][j] = min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1]) + 1

The answer is the sum of all dp[i][j] values across the entire table.

Why the min of three neighbors works

For a k x k square to have its bottom-right corner at (i, j), three conditions must all hold:

There is a (k-1) x (k-1) square ending at (i-1, j) (the cell above). This covers the top portion.
There is a (k-1) x (k-1) square ending at (i, j-1) (the cell to the left). This covers the left portion.
There is a (k-1) x (k-1) square ending at (i-1, j-1) (the diagonal). This covers the top-left interior.

If any of these three neighbors is smaller than k-1, you cannot form a k x k square. The weakest neighbor is the bottleneck. That is why you take the minimum and add 1.

Why summing gives the count

Here is the part that connects this problem to Maximal Square. If dp[i][j] = 3, that means a 3x3 all-ones square ends at (i, j). But a 3x3 square contains a 2x2 square and a 1x1 square at the same corner. So cell (i, j) contributes exactly 3 squares. In general, dp[i][j] equals the number of all-ones squares with their bottom-right corner at (i, j). Summing across the entire table counts every such square exactly once.

Practice this pattern

The Python solution

def countSquares(matrix):
    m, n = len(matrix), len(matrix[0])
    dp = [[0] * n for _ in range(m)]
    total = 0

    for i in range(m):
        for j in range(n):
            if matrix[i][j] == 1:
                if i == 0 or j == 0:
                    dp[i][j] = 1
                else:
                    dp[i][j] = min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1]) + 1
                total += dp[i][j]

    return total

The entire logic fits in a few lines. The recurrence is min(top, left, diagonal) + 1, and you accumulate each dp value into total as you go.

Walking through the DP table

Let's trace the algorithm on the matrix [[0,1,1,1],[1,1,1,1],[0,1,1,1]]. Watch how the three neighbors determine each cell's value, and notice how each dp value represents a count of squares.

Step 1: Base case. First row and first column copy directly from the matrix.

Cells in the first row or first column can be at most 1x1 squares. dp[i][j] = matrix[i][j].

Step 2: dp[1][1]. matrix[1][1] = 1, so dp = min(top, left, diag) + 1 = min(1, 1, 0) + 1 = 1.

The diagonal neighbor is 0 (matrix[0][0] = 0), so the bottleneck limits this to a 1x1 square.

Step 3: dp[1][2]. matrix[1][2] = 1, so dp = min(1, 1, 1) + 1 = 2. A 2x2 square ends here.

All three neighbors are 1. min(1, 1, 1) + 1 = 2. This cell contributes 2 to the total (one 1x1 and one 2x2 square).

Step 4: dp[2][3]. matrix[2][3] = 1, so dp = min(2, 2, 2) + 1 = 3. A 3x3 square ends here.

All three neighbors are 2. min(2, 2, 2) + 1 = 3. This cell alone contributes 3 to the total (a 1x1, a 2x2, and a 3x3 square).

Step 5: Final DP table. Sum all values: 0+1+1+1+1+1+2+2+0+1+2+3 = 15.

Each dp value tells you how many squares have their bottom-right corner at that cell. Sum them all for the answer: 15.

The sum of all dp values is 0+1+1+1+1+1+2+2+0+1+2+3 = 15. That is the answer.

Practice this pattern

Space optimization

You can compress the 2D DP table into a single row, just like in Maximal Square. Each cell depends only on the cell above, the cell to the left, and the diagonal. When processing row i, you only need the values from row i-1.

def countSquares(matrix):
    m, n = len(matrix), len(matrix[0])
    dp = [0] * n
    total = 0
    prev = 0

    for i in range(m):
        for j in range(n):
            temp = dp[j]
            if matrix[i][j] == 1:
                if i == 0 or j == 0:
                    dp[j] = 1
                else:
                    dp[j] = min(dp[j], dp[j-1], prev) + 1
                total += dp[j]
            else:
                dp[j] = 0
            prev = temp

    return total

Same O(m * n) time, but now O(n) space.

Complexity analysis

Approach	Time	Space
Brute force (check every square)	O(m * n * min(m,n)^2)	O(1)
2D DP table	O(m * n)	O(m * n)
1D DP (space optimized)	O(m * n)	O(n)

The DP solution visits every cell once and does O(1) work per cell. You cannot do better than O(m * n) because you must read every cell at least once.

The connection to Maximal Square

Maximal Square (LeetCode 221) uses the identical recurrence. The only difference is the final aggregation:

Maximal Square: return max(dp[i][j])^2 across all cells.
Count Square Submatrices: return sum(dp[i][j]) across all cells.

If you already know Maximal Square, you already know this problem. The recurrence, base cases, and space optimization are all the same. This is a great example of how one building block solves multiple problems with a tiny change.

Edge cases

All zeros: every dp[i][j] stays 0. Return 0.
All ones: dp values grow toward min(m, n) in the bottom-right region. The sum grows quickly.
Single row: no cell has a "top" neighbor. Every 1 produces dp[i][j] = 1. Return the count of ones in the row.
Single column: same idea. No cell has a "left" neighbor. Return the count of ones in the column.
1x1 matrix: return matrix[0][0].

The building blocks

Count Square Submatrices is built on the 2D DP with neighbor dependencies pattern. You define a table where each cell depends on a small fixed set of neighbors, fill it row by row, and aggregate the result.

The same pattern shows up in:

Maximal Square (LeetCode 221): identical recurrence, but asks for max instead of sum.
Unique Paths (LeetCode 62): each cell sums the cell above and the cell to the left. Different recurrence, same fill order.
Minimum Path Sum (LeetCode 64): dp[i][j] = grid[i][j] + min(dp[i-1][j], dp[i][j-1]). Same two-neighbor dependency.

The difference between these problems is the recurrence and the aggregation, not the structure. Once you can set up a 2D DP table, define the base cases, and fill it in order, you have the building block for all of them.

From understanding to recall

Reading through this solution, the recurrence makes sense. min(top, left, diagonal) + 1, then sum the table. The connection to Maximal Square clicks. But knowing the idea and producing it under pressure are different skills.

In an interview, you need to remember that the DP state is "side length of the largest square ending at this corner," write the three-neighbor min recurrence, handle the base cases for the first row and column, and remember to sum instead of taking the max. That takes practice, not just understanding.

Spaced repetition closes the gap. You revisit this problem at increasing intervals, writing the solution from scratch each time. After a few reps, you see a matrix counting problem and your hands know what to write. The recurrence becomes automatic.

The 2D neighbor-dependency DP pattern is one of roughly 60 reusable building blocks that cover hundreds of LeetCode problems. Drilling them individually with spaced repetition is far more effective than grinding problems at random.

Maximal Square - The identical DP recurrence, but asking for the max value instead of the sum
Unique Paths - The simplest 2D grid DP, teaching the same row-by-row fill pattern
Minimum Path Sum - Another 2D grid DP with neighbor dependencies

Practice this pattern

Why this problem matters

The brute force

The DP recurrence

Why the min of three neighbors works

Why summing gives the count

The Python solution

Walking through the DP table

Space optimization

Complexity analysis

The connection to Maximal Square

Edge cases

The building blocks

From understanding to recall

Related posts