Find All Good Strings: Digit DP Meets KMP
You are given two strings s1 and s2 of length n and a string evil. Return the number of "good" strings of length n that are lexicographically between s1 and s2 (inclusive) and do not contain evil as a substring. Return the count modulo 10^9 + 7.
This is LeetCode 1397: Find All Good Strings. It fuses two techniques that rarely appear together: digit DP (for counting values in a bounded range) and the KMP failure function (for tracking partial substring matches). The result is a problem that tests whether you can compose two well-known building blocks into a single coherent solution.
Why this problem matters
This problem combines two powerful techniques: digit DP (counting values in a range with constraints) and KMP string matching (tracking partial matches efficiently). Seeing them work together deepens your understanding of both.
Digit DP on its own appears in problems where you count numbers or strings within a range that satisfy some property. KMP on its own appears in substring search problems. But in this problem, the "property" you are checking is the absence of a substring, and the most efficient way to track that property while building the string character by character is through the KMP automaton. You need both tools at once.
The key insight
Use digit DP with KMP state. At each character position, you track:
- Whether you are still "tight" against the lower bound s1
- Whether you are still "tight" against the upper bound s2
- How many characters of evil you have matched so far (KMP state)
If the KMP state ever reaches len(evil), that string contains evil and is invalid. Otherwise, after placing all n characters, it counts as a good string.
The tight flags work the same way as in any digit DP problem. When tight_lo is true, the character you place must be >= s1[pos]. When tight_hi is true, it must be <= s2[pos]. Once you pick a character strictly inside the bounds, the corresponding tight flag turns off and all future positions are unconstrained on that side.
The solution
def find_good_strings(n: int, s1: str, s2: str, evil: str) -> int:
MOD = 10**9 + 7
m = len(evil)
fail = [0] * m
k = 0
for i in range(1, m):
while k > 0 and evil[k] != evil[i]:
k = fail[k - 1]
if evil[k] == evil[i]:
k += 1
fail[i] = k
def kmp_next(state: int, ch: str) -> int:
while state > 0 and evil[state] != ch:
state = fail[state - 1]
if evil[state] == ch:
state += 1
return state
memo = {}
def dp(pos: int, tight_lo: bool, tight_hi: bool, kmp_state: int) -> int:
if kmp_state == m:
return 0
if pos == n:
return 1
key = (pos, tight_lo, tight_hi, kmp_state)
if key in memo:
return memo[key]
lo = ord(s1[pos]) if tight_lo else ord('a')
hi = ord(s2[pos]) if tight_hi else ord('z')
total = 0
for c in range(lo, hi + 1):
ch = chr(c)
new_kmp = kmp_next(kmp_state, ch)
new_tight_lo = tight_lo and (c == lo)
new_tight_hi = tight_hi and (c == hi)
total += dp(pos + 1, new_tight_lo, new_tight_hi, new_kmp)
total %= MOD
memo[key] = total
return total
return dp(0, True, True, 0)
The KMP failure function is precomputed once and used inside the DP transitions. At each position, calling kmp_next(state, ch) tells you the new match length after appending character ch. If the new state equals len(evil), you have formed the evil substring and that branch contributes zero.
Visual walkthrough
Step 1: Build the KMP failure function for evil
For evil = "b" (length 1), the failure array is simply [0]. For longer evil strings, the failure function tells us where to resume matching after a mismatch, avoiding redundant comparisons.
Step 2: Define the DP state
Each state tracks four things: how far along the string we are (position), whether we are still constrained by the lower bound, whether we are still constrained by the upper bound, and how many characters of evil we have matched.
Step 3: At each position, try all valid characters
The character range depends on the tight flags. If tight_lo is true, you cannot go below s1[pos]. If tight_hi is true, you cannot go above s2[pos]. For each character, update the KMP state using the failure function.
Step 4: Sum all valid complete paths
After filling all n positions, count every path where kmp_match never reached len(evil). Those are the good strings. Return the total modulo 10^9 + 7.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Digit DP + KMP | O(n * m * 26) | O(n * m) |
Where n = string length, m = evil length.
At each of the n positions, you have 2 possible tight_lo values, 2 possible tight_hi values, and m possible KMP states. That gives O(n * 4 * m) unique DP states. For each state, you try up to 26 characters. Each character transition runs the KMP next-state function in amortized O(1). The total time is O(n * m * 26), and the memoization table uses O(n * m * 4) space, which simplifies to O(n * m).
The building blocks
1. KMP failure function
The KMP failure function (also called the prefix function) preprocesses the pattern evil to build an array fail where fail[i] is the length of the longest proper prefix of evil[0..i] that is also a suffix. This lets you efficiently compute the next KMP state when you append a new character. Instead of restarting the match from scratch after a mismatch, you fall back to the longest partial match that still works.
2. Digit DP framework
Digit DP is a technique for counting numbers (or strings) in a range [lo, hi] that satisfy a property. You build the string one character at a time, left to right. At each position, two boolean flags track whether you are "tight" against the lower and upper bounds. When tight, the character at that position is constrained. When not tight, you can place any character from 'a' to 'z'.
3. State composition
The power of this problem comes from composing the KMP state into the digit DP state. Instead of the usual digit DP state (pos, tight_lo, tight_hi), you add a fourth dimension: kmp_state. This dimension tracks how many characters of evil you have matched so far. The two systems run in parallel. The digit DP handles the range constraints. The KMP automaton handles the substring avoidance constraint. Together, they count exactly the strings that are in range and do not contain evil.
Edge cases
- evil longer than n. If
len(evil) > n, then no string of lengthncan containevil. Every string in the range is good, and you just need to count strings betweens1ands2. The DP handles this automatically becausekmp_statecan never reachm. - s1 equals s2. There is exactly one candidate string. Check whether it contains
evil. The DP handles this because both tight flags stay true throughout, forcing every character to match exactly. - evil is a single character. The KMP failure array is
[0], and the KMP state is either 0 (no match) or 1 (full match). At each position, you skip the one forbidden character and place any of the remaining 25. - All characters in s1 and s2 are the same. The tight constraints limit your choices severely. If
s1 = "aaaa"ands2 = "aaaa", the only candidate is"aaaa"and you check it againstevil. - evil appears at the boundary. A string might match
evilstarting at the very last few positions. The KMP state tracks this correctly because it accumulates partial matches across the entire string.
From understanding to recall
You now understand how digit DP and KMP combine: the digit DP framework handles range counting while the KMP automaton tracks substring matches as an extra state dimension. But reproducing this under pressure requires remembering several details at once. You need the KMP failure function construction, the kmp_next transition, the digit DP tight-flag logic, and the early termination when kmp_state == m.
The core pattern here, augmenting digit DP with an automaton state, generalizes beyond substring avoidance. Any time you need to count strings in a range that avoid (or require) a pattern, you can plug a different automaton into the same framework. Mastering this composition is what separates knowing two techniques from being able to combine them on demand.
Digit DP with KMP is one of roughly 60 reusable building blocks in the CodeBricks system. You drill each block individually with spaced repetition until writing the solution from scratch is automatic. Once you can reproduce the KMP failure function and the digit DP skeleton without looking, combining them becomes a matter of plugging one into the other.
Related posts
- Regular Expression Matching - Another problem where you track automaton states during string processing
- Longest Happy Prefix - A direct application of the KMP prefix function
- Distinct Subsequences - Counting problems on strings using dynamic programming with multiple state dimensions