Count Substrings That Differ by One Character
Given two strings s and t, find the number of ways to choose a non-empty substring of s and a non-empty substring of t such that the two substrings differ in exactly one position. This is LeetCode 1638: Count Substrings That Differ by One Character. It tests whether you can think about string comparisons from the perspective of the mismatch position rather than the substrings themselves.
The problem
You are given two strings s and t. Count the number of pairs of substrings (s[a..b], t[c..d]) where the two substrings have the same length and differ at exactly one position. The substrings must be contiguous.
For example:
s = "aba",t = "baba"returns 6.s = "ab",t = "bb"returns 3.
The lengths of s and t can be up to 100, so even an O(n^3) solution can pass. But the real insight is an O(n^2) approach that avoids checking every possible pair of substrings.
The brute force approach
The most direct way is to enumerate every possible starting position in both strings, then for each pair of same-length substrings, count the number of positions where they differ. If exactly one position differs, add one to the answer.
def count_substrings_brute(s, t):
count = 0
for i in range(len(s)):
for j in range(len(t)):
diff = 0
k = 0
while i + k < len(s) and j + k < len(t):
if s[i + k] != t[j + k]:
diff += 1
if diff == 1:
count += 1
if diff > 1:
break
k += 1
return count
This runs in O(m * n * min(m, n)) time. It works for the given constraints, but the approach does not scale and does not reveal the underlying structure of the problem.
The key insight
Instead of iterating over all pairs of substrings and checking how many differences they have, flip the perspective. Fix the position where the mismatch happens and count how far you can extend the matching characters on both sides.
For a given pair (i, j) where s[i] != t[j], you know this will be the single differing character. Now count how many consecutive matching characters extend to the left (call that left) and to the right (call that right). Every combination of a left extension and a right extension produces one valid pair of substrings. So the contribution from this mismatch point is left * right.
Think of it as anchoring the one allowed mismatch. Once you fix where the difference is, the number of valid substring pairs from that anchor is just the product of how far you can expand in each direction while all other characters match.
Step-by-step walkthrough
Let's trace through s = "abc" and t = "axcd" to see how fixing the mismatch and expanding works.
Step 1: Fix the mismatch at s[1]="b" and t[1]="x"
Start with just the differing pair (s[1], t[1]). This gives one valid substring pair: "b" vs "x". Count left = 1, right = 1.
Step 2: Expand left while characters match
s[0]="a" == t[0]="a", so we can expand one step to the left. Now "ab" vs "ax" is another valid pair. left = 2.
Step 3: Expand right while characters match
s[2]="c" == t[2]="c", so we can expand one step to the right. Now "abc" vs "axc" is valid too. right = 2.
Step 4: Count from this mismatch point
For each mismatch position, multiply the number of left expansion choices by the number of right expansion choices.
Step 5: Move to the next mismatch: s[0]="a" vs t[1]="x"
Different starting positions give different mismatch points. Here s[0] != t[1], left = 1, right = 1. Pairs = 1 * 1 = 1.
Step 6: Repeat for every (i, j) where s[i] != t[j]
Sum over all mismatch positions to get the total count.
For the full solution, you iterate over every pair (i, j) and, when s[i] != t[j], compute the left and right expansion counts and add their product to the answer.
The code
def count_substrings(s, t):
m, n = len(s), len(t)
count = 0
for i in range(m):
for j in range(n):
if s[i] != t[j]:
left = 1
right = 1
while i - left >= 0 and j - left >= 0 and s[i - left] == t[j - left]:
left += 1
while i + right < m and j + right < n and s[i + right] == t[j + right]:
right += 1
count += left * right
return count
The outer loops pick every pair of positions (i, j). When those characters differ, the inner while loops expand outward. The left variable starts at 1 (counting the mismatch pair itself as one choice) and increments for each matching character going left. Similarly for right. Their product gives the number of valid substring pairs anchored at this mismatch.
The left and right counters both start at 1, not 0. That is because the mismatch pair (s[i], t[j]) by itself, with no expansion in either direction, is already one valid substring pair. Each extra matching character in either direction adds one more expansion choice.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Brute force (enumerate all pairs) | O(m * n * min(m, n)) | O(1) |
| Fix mismatch + expand | O(m * n * min(m, n)) worst case | O(1) |
| DP with precomputed left/right arrays | O(m * n) | O(m * n) |
Where m = length of s, n = length of t.
The "fix mismatch + expand" approach has the same worst-case complexity as brute force, but in practice it is faster because the inner while loops terminate early. You can push this to true O(m * n) by precomputing a 2D array match[i][j] storing the number of consecutive matching characters starting at (i, j), then deriving the left and right expansion counts in constant time per mismatch. For the constraint sizes in this problem (up to 100), the expand approach runs comfortably.
The building blocks
Fixing the pivot position
Many substring and subarray counting problems become simpler when you fix one element of the structure and count around it. Here, fixing the single mismatch position turns a messy three-loop enumeration into a clean product of two expansion counts. You see the same "fix the center" technique in:
- Counting palindromic substrings by expanding from each center
- Subarray problems where you fix the maximum element and count valid subarrays around it
Left-right expansion counting
Once the pivot is fixed, the problem reduces to counting how far you can extend matching characters in both directions. This left-right expansion technique shows up in string matching, palindrome checking, and rolling hash problems. The key is that the number of valid configurations is the product of the choices on each side, not the sum.
Edge cases
- Single-character strings:
s = "a",t = "b". The only pair is("a", "b"), which differs at one position. Answer: 1. Ifs = "a",t = "a", no pair differs, so the answer is 0. - Identical strings: if
s == t, no substring pair of the same content can differ at exactly one position. But substrings at different starting indices can still produce valid pairs. For example,s = "ab",t = "ab"still returns 0 because every same-length pair at the same offset matches perfectly. - All characters the same:
s = "aaa",t = "aaa". Every character matches everywhere, so there is no mismatch to anchor. The answer is 0. - Completely different characters:
s = "abc",t = "xyz". Every pair(i, j)hass[i] != t[j], but expanding left or right immediately fails because neighbors also differ. Each mismatch contributes1 * 1 = 1, and there arem * nmismatches, so the answer ism * n.
From understanding to recall
You now understand why fixing the mismatch and expanding outward works, and how the product of left and right choices counts the valid substring pairs. But knowing the idea and writing the code from scratch under time pressure are two different skills. The tricky details are remembering that left and right start at 1, not 0, and that the outer loops cover every (i, j) while the inner loops only run when characters differ.
This is a pattern that benefits from repetition. The "fix the pivot, expand outward, multiply" technique shows up across string problems, and drilling it with spaced repetition until the code flows naturally from the idea is the most reliable way to internalize it for interviews.
Related posts
- Edit Distance - Another two-string problem where you compare characters position by position, but with a full 2D DP table.
- Longest Common Subsequence - The foundational two-string DP problem. Understanding how characters align between two strings is the same skill used here.
- Permutation in String - A sliding window approach to matching substrings with frequency comparison, a different angle on the substring matching family.