Skip to content
← All posts

Wildcard Matching: DP with ? and * Patterns

8 min read
leetcodeproblemhardstringsdynamic-programming

You are given an input string s and a pattern p. Implement wildcard pattern matching with support for two special characters:

  • '?' matches any single character.
  • '*' matches any sequence of characters, including the empty sequence.

The matching should cover the entire input string, not just a substring.

This is LeetCode 44: Wildcard Matching, a hard problem that tests your ability to think about string matching with dynamic programming. If you have solved Regular Expression Matching, you will notice structural similarities, but the '*' wildcard here behaves differently. In regex matching, '*' modifies the preceding character. In wildcard matching, '*' is standalone and absorbs any sequence by itself.

""*a*b""adcebTTFFFFTTTFFTFTFFTFTFFTFTFFTFTT* +1a==ab==bTrue (match)Star absorbs charsFinal answerCharacter match
2D DP table for s="adceb", p="*a*b". True cells are highlighted. Each '*' can absorb any sequence of characters. Answer: dp[5][4] = True.

Examples

  • s = "aa", p = "a" returns false. The pattern is a single "a", which does not cover the full string.
  • s = "aa", p = "*" returns true. "*" matches any sequence, including "aa".
  • s = "cb", p = "?a" returns false. '?' matches "c", but "a" does not match "b".
  • s = "adceb", p = "*a*b" returns true. The first '*' matches empty, "a" matches "a", the second '*' matches "dce", and "b" matches "b".

Why this problem matters

Wildcard matching shows up everywhere outside of interviews. Shell globbing (*.txt, test_??.py), file search tools, and configuration filters all use '?' and '*' with exactly the semantics described here. Understanding how to match these patterns efficiently gives you insight into how these tools work under the hood.

In interviews, this problem tests whether you can set up a 2D boolean DP table, handle base cases for the empty string, and reason about the '*' wildcard's dual nature (match zero characters or match one more). The '?' case is simple. The '*' case is what separates candidates who understand 2D string DP from those who are guessing.

The brute force idea

Start from the beginning of both strings. At each position:

  • If p[j] == '?', it matches any single character. Advance both pointers.
  • If p[j] == '*', you have two choices: skip the star (match zero characters) or consume one character from s and keep the star active (match one more character). This branching is what creates exponential time.
  • If p[j] is a regular character and it matches s[i], advance both pointers.
  • Otherwise, no match.
def is_match_recursive(s, p, i, j):
    if j == len(p):
        return i == len(s)
    if i == len(s):
        return all(c == "*" for c in p[j:])
    if p[j] == "*":
        return (is_match_recursive(s, p, i, j + 1) or
                is_match_recursive(s, p, i + 1, j))
    if p[j] == "?" or p[j] == s[i]:
        return is_match_recursive(s, p, i + 1, j + 1)
    return False

This is exponential in the worst case. A pattern like "*a*b*c*d" branches at every star, and the same (i, j) subproblems get recomputed many times. You need a table.

The recursive version captures the logic cleanly, but do not submit it without memoization. For strings where multiple stars appear in the pattern, the branching blows up. The bottom-up DP approach fills the table in O(m * n) and avoids recursion limits entirely.

Building the 2D DP table

Define dp[i][j] as whether s[0..i-1] matches p[0..j-1]. The table has (m + 1) rows and (n + 1) columns, where m = len(s) and n = len(p).

The recurrence has three cases:

  1. Character match (p[j-1] == s[i-1] or p[j-1] == '?'): take the diagonal value. dp[i][j] = dp[i-1][j-1].
  2. Star in the pattern (p[j-1] == '*'): two sub-cases:
    • Match zero characters: skip the star entirely. dp[i][j] = dp[i][j-1].
    • Match one more character: the star absorbs s[i-1]. dp[i][j] = dp[i-1][j].
    • dp[i][j] is True if either sub-case is True.
  3. No match: dp[i][j] = False.

Base cases:

  • dp[0][0] = True. Empty string matches empty pattern.
  • dp[0][j]: the empty string can only match a pattern of all stars. For each j where p[j-1] == '*', set dp[0][j] = dp[0][j-1]. The moment you hit a non-star character, the rest of the row stays False.
  • dp[i][0] = False for i > 0. A non-empty string never matches an empty pattern.

Let's trace through s = "adceb" and p = "*a*b":

Initialize: base cases for the empty string vs pattern

""*a*b""adcebTTFFFFFFFFFFFFFFFFFFFFFFFFFFFF

dp[0][0] = True (empty matches empty). dp[0][1] = True because "*" matches zero characters. dp[0][2] through dp[0][4] are False because "a", "a*", and "a*b" cannot match an empty string.

Row 1 (s="a"): the leading "*" absorbs "a", then "a" matches "a"

""*a*b""adcebTTFFFFTTTFFFFFFFFFFFFFFFFFFFFF* +1a==a* zero

dp[1][1]: p[0]="*", so check dp[0][1]=T (absorb one more). True. dp[1][2]: p[1]="a"==s[0]="a", diagonal dp[0][1]=T. True. dp[1][3]: p[2]="*", check dp[1][2]=T (absorb zero). True.

Rows 2-4 (s="ad", "adc", "adce"): the stars keep absorbing

""*a*b""adcebTTFFFFTTTFFTFTFFTFTFFTFTFFFFFF* +1* +1* +1

Column 1 (first "*") and column 3 (second "*") stay True by absorbing each new character. dp[2][2]=F because p[1]="a" does not match s[1]="d". The stars act as bridges.

Row 5 (s="adceb"): "b" matches the final "b" in the pattern. Done!

""*a*b""adcebTTFFFFTTTFFTFTFFTFTFFTFTFFTFTTb==b

dp[5][4]: p[3]="b" matches s[4]="b", so check diagonal dp[4][3]=True. The full string "adceb" matches "*a*b". Answer: True.

The answer is dp[5][4] = True. The first '*' matches empty, "a" matches "a", the second '*' absorbs "dce", and "b" matches "b".

def is_match(s, p):
    m, n = len(s), len(p)
    dp = [[False] * (n + 1) for _ in range(m + 1)]
    dp[0][0] = True

    for j in range(1, n + 1):
        if p[j - 1] == "*":
            dp[0][j] = dp[0][j - 1]

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if p[j - 1] == s[i - 1] or p[j - 1] == "?":
                dp[i][j] = dp[i - 1][j - 1]
            elif p[j - 1] == "*":
                dp[i][j] = dp[i][j - 1] or dp[i - 1][j]

    return dp[m][n]

Time: O(m * n). Every cell in the 2D table gets filled exactly once.

Space: O(m * n). The full table.

The key difference from Regular Expression Matching (LeetCode 10): in regex, '*' modifies the preceding character, so the zero-occurrence branch looks two columns back (dp[i][j-2]). In wildcard matching, '*' is standalone, so the zero-occurrence branch looks one column back (dp[i][j-1]). The one-more-occurrence branch is the same in both: dp[i-1][j].

Walkthrough of the key transitions

Let's trace the critical cells for s = "adceb", p = "*a*b":

  • dp[0][1] (empty string vs "*"): p[0] == '*', so check dp[0][0] = True. The star matches zero characters. dp[0][1] = True.
  • dp[1][1] ("a" vs "*"): p[0] == '*'. Zero chars: dp[1][0] = False. One more: dp[0][1] = True. So dp[1][1] = True. The star absorbed "a".
  • dp[1][2] ("a" vs "*a"): p[1] == 'a' and s[0] == 'a', characters match. Diagonal: dp[0][1] = True. So dp[1][2] = True.
  • dp[1][3] ("a" vs "*a*"): p[2] == '*'. Zero chars: dp[1][2] = True. So dp[1][3] = True. The second star matches zero characters here.
  • dp[2][3] ("ad" vs "*a*"): p[2] == '*'. Zero chars: dp[2][2] = False. One more: dp[1][3] = True. So dp[2][3] = True. The second star absorbed "d".
  • dp[3][3], dp[4][3] follow the same pattern. The second star keeps absorbing "c" and "e" by checking dp[i-1][3].
  • dp[5][4] ("adceb" vs "*a*b"): p[3] == 'b' and s[4] == 'b', characters match. Diagonal: dp[4][3] = True. So dp[5][4] = True.

Notice the column of True values at index 1 (the first star) and index 3 (the second star). Stars create vertical "highways" of True values, absorbing one character per row. The literal characters "a" and "b" act as checkpoints that must be crossed diagonally.

Complexity analysis

ApproachTimeSpace
Brute force recursionExponentialO(m + n) call stack
2D DP tableO(m * n)O(m * n)

Where m = length of s, n = length of p.

You can optimize space to O(n) using a single-row approach, since each row only depends on the current row and the row above. The logic is the same as in Edit Distance: save the diagonal value before it gets overwritten.

Edge cases to watch for

  • Empty string, empty pattern: dp[0][0] = True. Both are empty, so they match.
  • Empty string, pattern is all stars: e.g., s = "", p = "***". Each star propagates dp[0][j] = dp[0][j-1], so the entire base row stays True. Answer is True.
  • No wildcards at all: e.g., s = "abc", p = "abc". Every cell uses the diagonal character-match case. The DP reduces to character-by-character comparison.
  • Pattern is just "*": matches any string, including the empty string.
  • Consecutive stars: "***" behaves identically to a single "*". Each star sets dp[i][j] = dp[i][j-1] or dp[i-1][j], propagating True values through.
  • Question mark with empty string: s = "", p = "?" returns False. The '?' requires exactly one character.
  • All question marks: s = "abc", p = "???" returns True only if len(s) == len(p).

The building blocks

This problem is built on two core patterns:

2D string DP (two-string comparison): The same table structure you see in Edit Distance and Regular Expression Matching. Define dp[i][j] for prefixes of both inputs, fill the table row by row, and read the answer from dp[m][n]. The difference here is that cells store booleans, and the star branch has the simpler dp[i][j-1] / dp[i-1][j] form.

Wildcard absorption: The '*' operator introduces a "skip or extend" branching structure. The zero-character case (dp[i][j-1]) means the star matches nothing and you move on. The one-more-character case (dp[i-1][j]) means the star absorbs the current character and remains active for the next row. This creates the vertical "highway" of True values you see in the DP table. The same absorption pattern appears in any problem where a single operator can consume a variable number of elements.

Recognizing that the star decomposes into exactly two sub-cases (zero or one more) is the key insight. There is no need to loop over how many characters the star matches. The table propagates the information one character at a time.

From understanding to recall

You now know how to build the boolean DP table, handle the three cases (character match or '?', star, no match), and trace through the star absorption logic. But reading this once is not the same as writing it cold. In an interview, you need to set up the (m+1) x (n+1) table, get the base cases right for the leading stars, and write the star recurrence without confusing dp[i][j-1] (zero chars) with dp[i-1][j] (one more char).

The difference between this problem and Regular Expression Matching is subtle but critical. In regex, '*' looks back two columns (dp[i][j-2]) because it erases a two-character group. In wildcard matching, '*' looks back one column (dp[i][j-1]) because it stands alone. Mixing up which problem uses which recurrence is a common mistake under pressure, and the only cure is repeated practice.

The 2D string DP and wildcard absorption building blocks together cover the family of pattern matching problems. Locking in both wildcard matching and regex matching gives you two templates that handle any interview variant. These building blocks are part of the roughly 60 reusable patterns in the CodeBricks system. You drill each block with spaced repetition until writing the solution from scratch is automatic.

Related posts

  • Regular Expression Matching - The sister problem using '.' and '*' with preceding-element semantics. Same table structure, different star recurrence.
  • Edit Distance - The classic 2D string DP problem. Same table layout but with integer costs instead of booleans.