Skip to content
← All posts

Image Overlap: Translation Vector Counting

6 min read
leetcodeproblemmediumarraysmatrix

Given two binary images represented as n x n matrices of 0s and 1s, you can slide one image over the other in any direction. The overlap is the number of positions where both images have a 1. Find the translation that maximizes the overlap and return that maximum count. That is LeetCode 835: Image Overlap.

The problem

You are given two n x n binary matrices img1 and img2. A translation slides img1 by some number of rows and columns (positive, negative, or zero). After the translation, you count how many positions have a 1 in both the shifted img1 and the original img2. Return the largest possible overlap.

img1 = [[1,1,0],[0,1,0],[0,1,0]]
img2 = [[0,0,0],[0,1,1],[0,0,1]]
# Output: 3

If you shift img1 down by 1 row and right by 1 column, three of its 1-cells land on 1-cells in img2.

img1110010010img2000011001
Two 3x3 binary images. Blue cells are 1s. The goal is to find the translation that maximizes the number of overlapping 1s.

The brute force approach tries every possible translation. Since the image is n x n, translations range from -(n-1) to n-1 in both directions, giving O(n^2) possible translations. For each translation, you scan every cell to count overlapping 1s in O(n^2) time. That is O(n^4) total, which works for small inputs but gets slow as n grows.

There is a much better way.

The key insight: count translation vectors

Instead of looping over all possible translations and checking every cell, flip the problem around. For each 1 in img1 and each 1 in img2, compute the translation vector that would move the first 1 onto the second. If multiple pairs produce the same vector, that translation achieves an overlap equal to the number of such pairs.

The answer is the maximum frequency among all translation vectors.

Why does this work? A translation (dx, dy) means "shift img1 by dx rows and dy columns." After this shift, img1[r][c] lands on position (r + dx, c + dy). For that shifted cell to overlap with a 1 in img2, you need img2[r + dx][c + dy] == 1. Rearranging, the vector (r2 - r1, c2 - c1) where (r1, c1) is from img1 and (r2, c2) is from img2 tells you exactly which translation maps (r1, c1) to (r2, c2). Count how many pairs share the same vector, and you have the overlap for that translation.

The solution

from collections import Counter

def largest_overlap(img1: list[list[int]], img2: list[list[int]]) -> int:
    n = len(img1)

    ones1 = [(r, c) for r in range(n) for c in range(n) if img1[r][c]]
    ones2 = [(r, c) for r in range(n) for c in range(n) if img2[r][c]]

    if not ones1 or not ones2:
        return 0

    counts = Counter(
        (r2 - r1, c2 - c1)
        for r1, c1 in ones1
        for r2, c2 in ones2
    )

    return max(counts.values())

Step by step walkthrough

Let's trace through the example with img1 = [[1,1,0],[0,1,0],[0,1,0]] and img2 = [[0,0,0],[0,1,1],[0,0,1]].

Step 1: Extract positions of all 1s in both images.

img1:(0,0)(0,1)(1,1)(2,1)img2:(1,1)(1,2)(2,2)

img1 has four 1-cells. img2 has three 1-cells. We store their (row, col) coordinates.

Step 2: Compute translation vector (r2 - r1, c2 - c1) for every pair.

(dx, dy)Count(1, 1)3(1, 2)1(2, 2)1(1, 0)1(2, 1)1(0, 0)1(0, 1)2(-1, 0)1(-1, 1)1

Each pair of 1-positions (one from img1, one from img2) produces a vector. 4 x 3 = 12 pairs total.

Step 3: Find the most frequent translation vector.

best vector = (1, 1), count = 3

Translation (1, 1) appears 3 times, more than any other. This means shifting img1 down by 1 and right by 1 produces the maximum overlap.

Step 4: The maximum overlap is 3.

overlap = 3000011001

When img1 is shifted by (1, 1), three of its 1-cells land on 1-cells in img2. The cells at (1,1), (1,2), and (2,2) all overlap.

The translation (1, 1) appears three times because three different 1-cells in img1 can each be mapped onto a distinct 1-cell in img2 by the same shift. That is the maximum overlap.

Complexity analysis

MetricValue
TimeO(n^4) worst case, O(a * b) where a and b are the number of 1s
SpaceO(a * b) for the counter

Time: You iterate over every pair of 1-positions. In the worst case (all cells are 1), a = b = n^2, giving O(n^4). In practice, when the images are sparse, this is much faster than the brute force because you skip all the 0-cells entirely.

Space: The counter stores up to a * b entries. Each entry is a tuple of two integers plus a count.

The key difference from brute force is that sparse images run fast. If each image has only k ones where k is much smaller than n^2, the runtime is O(k^2) instead of O(n^4).

The building blocks

This problem combines three reusable techniques that appear across many LeetCode problems:

Coordinate extraction

Instead of working with the full 2D grid, extract only the cells you care about. This converts a grid problem into a list-of-points problem.

ones = [(r, c) for r in range(n) for c in range(n) if grid[r][c]]

This pattern shows up in any grid problem where most cells are irrelevant (zeroes, empty, water, etc.). By extracting coordinates, you skip empty cells entirely and reduce the problem to operations on a smaller set of points.

Translation vectors

Comparing two point sets by computing all pairwise difference vectors is a powerful geometric technique. Each vector (r2 - r1, c2 - c1) encodes a rigid translation. If many pairs share the same vector, those pairs are all consistent with the same global shift.

vectors = [(r2 - r1, c2 - c1) for r1, c1 in set1 for r2, c2 in set2]

You will see this idea in problems involving pattern matching on grids, detecting repeated structures, and convolution-style operations.

Hash map counting

Once you have a collection of items (here, translation vectors), finding the most frequent one is a classic hash map problem.

from collections import Counter
counts = Counter(items)
best = max(counts.values())

This is the same building block used in majority element, group anagrams, and dozens of other frequency-based problems. The Counter reduces a "find the mode" question to a single pass through the data.

Edge cases

  • No 1s in one image: if either image is all zeroes, the overlap is always 0 regardless of the translation. The early return handles this.
  • Identical images: the zero translation (0, 0) will appear a times (once for each 1-cell matching itself), giving an overlap equal to the number of 1s.
  • Single cell: a 1 x 1 matrix. If both are [[1]], the answer is 1. If either is [[0]], the answer is 0.
  • All 1s: every cell is 1 in both images. The zero translation gives n^2 overlapping cells, which is always the maximum.
  • No possible overlap: if the 1s in img1 and img2 are arranged so that no translation can align any pair, every vector appears exactly once and the answer is 1 (as long as both images have at least one 1).

From understanding to recall

You now understand the translation vector approach. The solution is short, but the insight behind it (that counting pairwise difference vectors is equivalent to trying all translations) is the part that needs to stick. Under interview pressure, it is easy to default to the brute force nested loops instead of seeing the hash map shortcut.

Spaced repetition locks this in. You practice extracting coordinates, computing pairwise vectors, and counting with a hash map. Do it today, again in two days, again in five. After a few reps, you see "image overlap" and the coordinate-pair-to-Counter pipeline just flows out. No fumbling with four nested loops or off-by-one translation bounds.

Coordinate extraction, translation vectors, and hash map counting are three of roughly 60 reusable building blocks that cover hundreds of LeetCode problems. Learning these individually and drilling them with spaced repetition is far more effective than grinding random problems and hoping the patterns stick.

Related posts

  • Rotate Image - Another matrix problem where thinking about coordinate transformations is the key to a clean solution
  • Set Matrix Zeroes - In-place matrix manipulation using coordinate-based reasoning about which rows and columns to modify
  • Game of Life - A grid problem where encoding state transitions and scanning neighbors uses similar 2D coordinate thinking

CodeBricks breaks Image Overlap into its coordinate extraction, translation vector, and hash map counting building blocks, then drills them independently with spaced repetition. You type the list comprehension for extracting 1-positions, the nested loop for pairwise vectors, and the Counter lookup from scratch until the pattern is automatic. When a coordinate-counting problem shows up in your interview, you do not think about it. You just write it.