Skip to content
← All posts

Word Break: Dynamic Programming on Strings

7 min read
leetcodeproblemmediumdynamic-programming

You are given a string s and a list of dictionary words. Can you split the string into a sequence of one or more dictionary words, with no characters left over?

This is LeetCode 139: Word Break, and it is one of the most popular string segmentation DP problems out there. If you have solved Coin Change or Climbing Stairs, you already have the right mental model. Word Break applies the same "build up from smaller subproblems" idea, just on a string instead of a number line.

0l1e2e3t4c5o6d7eleetcode
"leetcode" can be segmented into "leet" + "code", both found in the dictionary.

Why this problem matters

Word Break is the bridge between numeric DP and string DP. It teaches you how to:

  • Think about substrings as subproblems: instead of "can I make amount 6?", you ask "can I segment the first 4 characters?"
  • Use a boolean DP table: not every DP problem tracks a minimum or maximum. Here you track reachability.
  • Try multiple split points: at each position, you consider every dictionary word that could end there.

Once you get comfortable with this pattern, problems like Palindrome Partitioning and Word Break II become much more approachable.

The brute force idea

At every position in the string, try every word in the dictionary. If the word matches the current substring, recurse on the rest. If any recursive path successfully reaches the end of the string, return True.

def word_break(s, word_dict):
    def helper(start):
        if start == len(s):
            return True
        for word in word_dict:
            end = start + len(word)
            if end <= len(s) and s[start:end] == word:
                if helper(end):
                    return True
        return False

    return helper(0)

This works for small inputs, but the recursion tree can branch wildly. Consider s = "aaaaaab" with word_dict = ["a", "aa", "aaa"]. Every prefix of a's matches, so the recursion tries an exponential number of paths before discovering that no path can handle the trailing "b". The time complexity can reach O(2^n) in the worst case.

The worst case for brute force happens when many overlapping prefixes match but no valid segmentation exists. The recursion explores every possible combination of splits before giving up. This is the classic signal that memoization or bottom-up DP will help.

Adding memoization (top-down DP)

The key observation: helper(start) only depends on the index start. If you have already figured out whether the substring from start to the end can be segmented, you do not need to recompute it.

def word_break(s, word_dict):
    memo = {}

    def helper(start):
        if start == len(s):
            return True
        if start in memo:
            return memo[start]

        for word in word_dict:
            end = start + len(word)
            if end <= len(s) and s[start:end] == word:
                if helper(end):
                    memo[start] = True
                    return True

        memo[start] = False
        return False

    return helper(0)

Now each starting index is solved at most once. Time: O(n * m * k) where n is the length of the string, m is the number of words in the dictionary, and k is the maximum word length (for substring comparison). Space: O(n) for the memo table plus call stack. Much better than exponential.

Building the DP table (bottom-up)

Instead of recursing from the beginning of the string forward, build up from position 0. Define dp[i] as: "can the substring s[0:i] be segmented into dictionary words?"

The recurrence:

  • dp[0] = True (the empty prefix can always be segmented)
  • For each position i from 1 to len(s), check every word in the dictionary. If word has length w, and s[i-w:i] equals word, and dp[i-w] is True, then set dp[i] = True
  • The answer is dp[len(s)]

Let's trace through s = "leetcode" and word_dict = ["leet", "code"]:

Base case: dp[0] = True

0T1F2F3F4F5F6F7F8F

An empty string can always be segmented. dp[0] = True.

dp[1] through dp[3]: No dictionary word ends here

0T1F2F3F4F5F6F7F8F

"l", "le", "lee" are not in the dictionary, and no shorter prefix works either. All stay False.

dp[4]: s[0:4] = "leet" is in the dictionary, and dp[0] is True!

0T1F2F3F4T5F6F7F8F"leet"

We found a word that ends at index 4. Since dp[0] is True, everything before it was valid. dp[4] = True.

dp[5] through dp[7]: No valid segmentation ends here

0T1F2F3F4T5F6F7F8F

"leetc", "leetco", "leetcod" cannot be segmented. For each position, no dictionary word ending there has a True dp value before it.

dp[8]: s[4:8] = "code" is in the dictionary, and dp[4] is True!

0T1F2F3F4T5F6F7F8T"code"

The substring from index 4 to 8 is "code", which is in the dictionary. dp[4] is True, so dp[8] = True. The string can be segmented!

The logic at each step is simple: for the current position i, look at every dictionary word. If a word fits (it ends exactly at position i and the characters match), check whether dp[i - len(word)] is True. If it is, that means everything before this word was already a valid segmentation, so now everything up through position i is valid too.

def word_break(s, word_dict):
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True

    for i in range(1, n + 1):
        for word in word_dict:
            w = len(word)
            if w <= i and dp[i - w] and s[i - w:i] == word:
                dp[i] = True
                break

    return dp[n]

Time: O(n * m * k) where n is string length, m is dictionary size, and k is max word length.

Space: O(n) for the DP array.

Notice the break after setting dp[i] = True. Once you know position i is reachable, you do not need to check more words for that position.

Compare this to Coin Change. In Coin Change, dp[i] = min(dp[i - coin] + 1) over all coins, and you always check every coin because you want the minimum. In Word Break, dp[i] is a boolean. Once any word makes it True, you can stop early. The DP table structure is the same, but the transition logic is simpler.

Optimization: use a set for faster lookups

If the dictionary is large, iterating through every word for every position gets expensive. Convert the dictionary to a set so you can check membership in O(k) time. Also, cap your inner loop at the length of the longest word.

def word_break(s, word_dict):
    word_set = set(word_dict)
    max_len = max(len(w) for w in word_dict)
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True

    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break

    return dp[n]

Instead of iterating over dictionary words, iterate over possible starting positions j for the last word. The inner loop now runs at most max_len times per position, which is often much smaller than the dictionary size.

Time: O(n * max_len * max_len) where the inner max_len accounts for substring hashing. In practice this is fast.

Space: O(n + total characters in dictionary) for the DP array and the set.

Complexity analysis

ApproachTimeSpace
Brute force recursionO(2^n) worst caseO(n) call stack
Memoized recursionO(n * m * k)O(n) memo + stack
Bottom-up DPO(n * m * k)O(n) array
Set-optimized DPO(n * L * L)O(n + dict)

Where n = string length, m = number of words, k = max word length, L = max word length.

Edge cases to watch for

  • Empty string: dp[0] = True, return True immediately.
  • No valid segmentation: e.g., s = "catsandog" with word_dict = ["cats", "dog", "sand", "and", "cat"]. The substrings "cats" and "and" leave "og" which is not a word. Return False.
  • Overlapping words: s = "cars" with word_dict = ["car", "ca", "rs"]. "car" leaves "s" (no match), but "ca" + "rs" works. The DP approach handles this because it checks every possible split.
  • Single character words: the inner loop just checks each character individually.
  • Repeated characters: s = "aaaaaaa" with word_dict = ["a", "aa"]. Many valid segmentations exist. DP only needs to find one.

The building blocks

Word Break is built on linear DP with constant state, the same fundamental pattern as Climbing Stairs and Coin Change. The structure is identical:

  • State: dp[i] = can s[0:i] be segmented?
  • Transition: dp[i] = True if any word in the dictionary ends at position i and dp[i - len(word)] is True
  • Base case: dp[0] = True

The parallel to Coin Change is almost exact. In Coin Change, you try every coin denomination at each amount. In Word Break, you try every dictionary word at each position. The "coins" are the words, and the "amount" is the string length.

This building block appears in related problems:

  • Word Break II (LeetCode 140): same DP structure, but you also need to reconstruct all valid segmentations. Add backtracking on top of the boolean DP.
  • Palindrome Partitioning (LeetCode 131): instead of dictionary words, the "valid pieces" are palindromic substrings. Same idea of checking whether a prefix is valid and whether what follows can also be split.
  • Concatenated Words (LeetCode 472): run Word Break on each word using the other words as the dictionary. Same DP inside a loop.

From understanding to recall

You have seen how Word Break is really just Coin Change wearing a string costume. The DP table, the transition logic, the base case: all structurally the same. But recognizing that in an interview, under pressure, with a blank editor, is a different skill than recognizing it while reading a blog post.

Spaced repetition is how you turn recognition into recall. You practice writing the DP recurrence from scratch at increasing intervals, and after a few reps you stop thinking about it. The linear DP with constant state pattern is one of roughly 60 reusable building blocks in the CodeBricks system. Drilling them individually is far more effective than grinding random problems.

Related posts

  • Coin Change - The classic unbounded knapsack DP, same building block with numeric amounts instead of string positions
  • Climbing Stairs - The gentlest intro to linear DP, same pattern with a fixed lookback of 1 or 2