Skip to content
← All posts

Find All Good Strings: Digit DP Meets KMP

6 min read
leetcodeproblemhardstringsdynamic-programming

You are given two strings s1 and s2 of length n and a string evil. Return the number of "good" strings of length n that are lexicographically between s1 and s2 (inclusive) and do not contain evil as a substring. Return the count modulo 10^9 + 7.

This is LeetCode 1397: Find All Good Strings. It fuses two techniques that rarely appear together: digit DP (for counting values in a bounded range) and the KMP failure function (for tracking partial substring matches). The result is a problem that tests whether you can compose two well-known building blocks into a single coherent solution.

n=2, s1="aa", s2="da", evil="b"All strings from aa to da:aagoodabcontains bacgoodadgoodbacontains bbbcontains bbccontains bbdcontains bcagoodcbcontains bccgoodcdgooddagoodGood string (no evil substring)Contains evil, excludedAnswer: 7 good strings
Strings of length 2 between "aa" and "da" (inclusive). Strings containing "b" are crossed out. 7 good strings remain.

Why this problem matters

This problem combines two powerful techniques: digit DP (counting values in a range with constraints) and KMP string matching (tracking partial matches efficiently). Seeing them work together deepens your understanding of both.

Digit DP on its own appears in problems where you count numbers or strings within a range that satisfy some property. KMP on its own appears in substring search problems. But in this problem, the "property" you are checking is the absence of a substring, and the most efficient way to track that property while building the string character by character is through the KMP automaton. You need both tools at once.

The key insight

Use digit DP with KMP state. At each character position, you track:

  • Whether you are still "tight" against the lower bound s1
  • Whether you are still "tight" against the upper bound s2
  • How many characters of evil you have matched so far (KMP state)

If the KMP state ever reaches len(evil), that string contains evil and is invalid. Otherwise, after placing all n characters, it counts as a good string.

The tight flags work the same way as in any digit DP problem. When tight_lo is true, the character you place must be >= s1[pos]. When tight_hi is true, it must be <= s2[pos]. Once you pick a character strictly inside the bounds, the corresponding tight flag turns off and all future positions are unconstrained on that side.

The solution

def find_good_strings(n: int, s1: str, s2: str, evil: str) -> int:
    MOD = 10**9 + 7
    m = len(evil)

    fail = [0] * m
    k = 0
    for i in range(1, m):
        while k > 0 and evil[k] != evil[i]:
            k = fail[k - 1]
        if evil[k] == evil[i]:
            k += 1
        fail[i] = k

    def kmp_next(state: int, ch: str) -> int:
        while state > 0 and evil[state] != ch:
            state = fail[state - 1]
        if evil[state] == ch:
            state += 1
        return state

    memo = {}

    def dp(pos: int, tight_lo: bool, tight_hi: bool, kmp_state: int) -> int:
        if kmp_state == m:
            return 0
        if pos == n:
            return 1

        key = (pos, tight_lo, tight_hi, kmp_state)
        if key in memo:
            return memo[key]

        lo = ord(s1[pos]) if tight_lo else ord('a')
        hi = ord(s2[pos]) if tight_hi else ord('z')

        total = 0
        for c in range(lo, hi + 1):
            ch = chr(c)
            new_kmp = kmp_next(kmp_state, ch)
            new_tight_lo = tight_lo and (c == lo)
            new_tight_hi = tight_hi and (c == hi)
            total += dp(pos + 1, new_tight_lo, new_tight_hi, new_kmp)

        total %= MOD
        memo[key] = total
        return total

    return dp(0, True, True, 0)

The KMP failure function is precomputed once and used inside the DP transitions. At each position, calling kmp_next(state, ch) tells you the new match length after appending character ch. If the new state equals len(evil), you have formed the evil substring and that branch contributes zero.

Visual walkthrough

Step 1: Build the KMP failure function for evil

evil = "b"Index:0Fail:0char: "b"fail[0] = 0 (no proper prefix match)

For evil = "b" (length 1), the failure array is simply [0]. For longer evil strings, the failure function tells us where to resume matching after a mismatch, avoiding redundant comparisons.

Step 2: Define the DP state

State = (pos, tight_lo, tight_hi, kmp_match)pos: 0..n-1 (character position we are filling)tight_lo, tight_hi: are we clamped to s1/s2 bounds?kmp_match: 0..len(evil)-1 (partial match length into evil)

Each state tracks four things: how far along the string we are (position), whether we are still constrained by the lower bound, whether we are still constrained by the upper bound, and how many characters of evil we have matched.

Step 3: At each position, try all valid characters

Example: pos=0, s1="aa", s2="da", evil="b"tight_lo=true, tight_hi=trueValid chars: "a" to "d" (constrained by s1[0] and s2[0])akmp=0okbkmp=1FULL!ckmp=0okdkmp=0okWhen kmp_match reaches len(evil), the string contains evil. Stop that path.

The character range depends on the tight flags. If tight_lo is true, you cannot go below s1[pos]. If tight_hi is true, you cannot go above s2[pos]. For each character, update the KMP state using the failure function.

Step 4: Sum all valid complete paths

For n=2, s1='aa', s2='da', evil='b':Good strings: aa, ac, ad, ca, cc, cd, daRejected (contain 'b'): ab, ba, bb, bc, bd, cbAnswer: 7 (mod 10^9 + 7)

After filling all n positions, count every path where kmp_match never reached len(evil). Those are the good strings. Return the total modulo 10^9 + 7.

Complexity analysis

ApproachTimeSpace
Digit DP + KMPO(n * m * 26)O(n * m)

Where n = string length, m = evil length.

At each of the n positions, you have 2 possible tight_lo values, 2 possible tight_hi values, and m possible KMP states. That gives O(n * 4 * m) unique DP states. For each state, you try up to 26 characters. Each character transition runs the KMP next-state function in amortized O(1). The total time is O(n * m * 26), and the memoization table uses O(n * m * 4) space, which simplifies to O(n * m).

The building blocks

1. KMP failure function

The KMP failure function (also called the prefix function) preprocesses the pattern evil to build an array fail where fail[i] is the length of the longest proper prefix of evil[0..i] that is also a suffix. This lets you efficiently compute the next KMP state when you append a new character. Instead of restarting the match from scratch after a mismatch, you fall back to the longest partial match that still works.

2. Digit DP framework

Digit DP is a technique for counting numbers (or strings) in a range [lo, hi] that satisfy a property. You build the string one character at a time, left to right. At each position, two boolean flags track whether you are "tight" against the lower and upper bounds. When tight, the character at that position is constrained. When not tight, you can place any character from 'a' to 'z'.

3. State composition

The power of this problem comes from composing the KMP state into the digit DP state. Instead of the usual digit DP state (pos, tight_lo, tight_hi), you add a fourth dimension: kmp_state. This dimension tracks how many characters of evil you have matched so far. The two systems run in parallel. The digit DP handles the range constraints. The KMP automaton handles the substring avoidance constraint. Together, they count exactly the strings that are in range and do not contain evil.

Edge cases

  • evil longer than n. If len(evil) > n, then no string of length n can contain evil. Every string in the range is good, and you just need to count strings between s1 and s2. The DP handles this automatically because kmp_state can never reach m.
  • s1 equals s2. There is exactly one candidate string. Check whether it contains evil. The DP handles this because both tight flags stay true throughout, forcing every character to match exactly.
  • evil is a single character. The KMP failure array is [0], and the KMP state is either 0 (no match) or 1 (full match). At each position, you skip the one forbidden character and place any of the remaining 25.
  • All characters in s1 and s2 are the same. The tight constraints limit your choices severely. If s1 = "aaaa" and s2 = "aaaa", the only candidate is "aaaa" and you check it against evil.
  • evil appears at the boundary. A string might match evil starting at the very last few positions. The KMP state tracks this correctly because it accumulates partial matches across the entire string.

From understanding to recall

You now understand how digit DP and KMP combine: the digit DP framework handles range counting while the KMP automaton tracks substring matches as an extra state dimension. But reproducing this under pressure requires remembering several details at once. You need the KMP failure function construction, the kmp_next transition, the digit DP tight-flag logic, and the early termination when kmp_state == m.

The core pattern here, augmenting digit DP with an automaton state, generalizes beyond substring avoidance. Any time you need to count strings in a range that avoid (or require) a pattern, you can plug a different automaton into the same framework. Mastering this composition is what separates knowing two techniques from being able to combine them on demand.

Digit DP with KMP is one of roughly 60 reusable building blocks in the CodeBricks system. You drill each block individually with spaced repetition until writing the solution from scratch is automatic. Once you can reproduce the KMP failure function and the digit DP skeleton without looking, combining them becomes a matter of plugging one into the other.

Related posts