Skip to content
← All posts

Replace Words: Prefix Matching with a Trie

7 min read
leetcodeproblemmediumstringstriehash-map

You are given a list of root words (a dictionary) and a sentence. For every word in the sentence, if any root is a prefix of that word, replace the word with the shortest such root. If multiple roots match, pick the shortest one. If no root matches, leave the word as is.

This is LeetCode 648: Replace Words, and it is a clean application of the trie data structure. If you have solved Implement Trie, you already have all the pieces. Replace Words just adds one twist: instead of checking whether a full word exists, you stop as soon as you hit the first end-of-word marker while walking a word through the trie. That gives you the shortest matching root.

cbraaatttrootcbraaatttend of root wordInput sentencethecattlewasrattledbythebatteryOutput sentencethecatwasratbythebat
A trie built from roots ["cat", "bat", "rat"]. Words like "cattle", "rattled", and "battery" get replaced by their shortest matching root.

Why this problem matters

A brute force approach would check every root against every word in the sentence. If there are d roots and w words, each of average length k, that is O(d * w * k) string comparisons. It works, but it does not scale.

A trie reduces the lookup to O(k) per word. You insert all roots into the trie once, then for each word in the sentence, you walk through the trie character by character. The moment you reach a node marked as end-of-word, you have found the shortest matching root. You do not need to check any other roots because the trie walk naturally finds the shortest prefix first.

This problem also reinforces a key trie concept: the difference between "a prefix exists in the trie" and "a complete root word ends here." The is_end marker is what tells you a path represents an actual root, not just a partial prefix of a longer root. This same idea shows up in Design Add and Search Words and Word Break.

The approach

The algorithm has two phases. First, build a trie from all root words. Second, split the sentence into words and look up each word in the trie.

For each word, walk the trie one character at a time. If you reach a node where is_end is True, return the prefix up to that point. If you reach a character that has no matching child in the trie, the word has no matching root, so return the original word.

class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

def replace_words(dictionary, sentence):
    root = TrieNode()

    for word in dictionary:
        node = root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_end = True

    def find_root(word):
        node = root
        for i, ch in enumerate(word):
            if node.is_end:
                return word[:i]
            if ch not in node.children:
                return word
            node = node.children[ch]
        if node.is_end:
            return word
        return word

    return " ".join(find_root(w) for w in sentence.split())

The key detail in find_root is checking node.is_end before moving to the next character. This ensures you return the shortest root. If the dictionary contains both "ca" and "cat", walking the word "cattle" will stop at "ca" because that end marker comes first.

Step-by-step walkthrough

Let's trace through dictionary = ["cat", "bat", "rat"] and sentence = "the cattle was rattled by the battery".

First, we build a trie from the three roots. Each root creates a path of three nodes (one per character), with the final node marked as end-of-word.

Then we process each word in the sentence:

  • "the": Start at root, look for child "t". No child exists. Return "the" unchanged.
  • "cattle": Start at root, follow "c", then "a", then "t". At "t", is_end is True. Return "cat".
  • "was": Start at root, look for child "w". No child exists. Return "was" unchanged.
  • "rattled": Start at root, follow "r", then "a", then "t". At "t", is_end is True. Return "rat".
  • "by": Start at root, look for child "b", follow it. Look for "y" in children of "b". No child "y" exists. Return "by" unchanged.
  • "the": Same as before. Return "the" unchanged.
  • "battery": Start at root, follow "b", then "a", then "t". At "t", is_end is True. Return "bat".

Final result: "the cat was rat by the bat"

Step 1: Build the trie from root words ["cat", "bat", "rat"]

cbraaatttrootcbraaattt

Insert each root word into the trie character by character. Mark the last node of each root as end-of-word.

Step 2: Process "cattle" -- walk c -> a -> t (end-of-word found!)

cbraaatttrootcbraaatis_end!tt

Walk "cattle" through the trie: c, a, t. At "t" we hit an end-of-word marker after 3 characters. Replace "cattle" with "cat".

Step 3: Process "rattled" -- walk r -> a -> t (end-of-word found!)

cbraaatttrootcbraaatttis_end!

Walk "rattled" through the trie: r, a, t. At "t" we hit an end-of-word marker after 3 characters. Replace "rattled" with "rat".

Step 4: Process "the" -- no child for "t" at root. Keep the original word.

cbraaatttno "t" childcbraaattt

The first character of "the" is "t". The root has no child for "t" (only c, b, r). No root matches, so keep "the" unchanged.

Step 5: Process "battery" -- walk b -> a -> t (end-of-word found!)

cbraaatttrootcbraaattis_end!t

Walk "battery" through the trie: b, a, t. At "t" we hit an end-of-word marker after 3 characters. Replace "battery" with "bat".

Step 6: Reassemble the sentence with replacements applied.

Input:"the cattle was rattled by the battery"
Output:"the cat was rat by the bat"

Three words had matching roots: "cattle" became "cat", "rattled" became "rat", and "battery" became "bat". The other words had no matching prefix in the trie and stayed unchanged.

Complexity analysis

Let d be the total number of characters across all root words, n be the number of words in the sentence, and k be the average word length.

Complexity
TimeO(d + n * k)
SpaceO(d + n * k)

Time: O(d) to build the trie by inserting all roots. Then O(n * k) to process each word in the sentence, where each word takes at most O(k) to walk through the trie. The total is O(d + n * k).

Space: O(d) for the trie nodes. O(n * k) for storing the output sentence. If you count the input sentence split, that is also O(n * k).

Edge cases to watch for

  • Root is the entire word: if the dictionary contains "cat" and the sentence has "cat", the output should be "cat" (unchanged, since the root is an exact match).
  • Multiple roots with shared prefixes: if the dictionary contains both "ca" and "cat", the word "cattle" should be replaced with "ca" (the shortest root). The trie walk finds "ca" first because is_end is checked before advancing.
  • No matching root: words that do not start with any root prefix are returned unchanged.
  • Single character roots: a root like "a" will replace any word starting with "a". Make sure the trie handles single-node paths correctly.
  • Empty dictionary: if the dictionary is empty, every word stays unchanged. The trie root has no children.
  • Duplicate roots: inserting the same root twice does not break anything. The second insert walks existing nodes and sets is_end = True again.

The building blocks

This problem is built from two reusable pieces that CodeBricks drills independently:

1. Trie insertion (character-by-character traversal)

The core loop that inserts a word into the trie:

node = root
for ch in word:
    if ch not in node.children:
        node.children[ch] = TrieNode()
    node = node.children[ch]
node.is_end = True

This is the same insertion pattern from Implement Trie. You walk the tree one character at a time, creating nodes where they do not exist, and mark the end. Every trie problem uses this exact loop for building the data structure.

2. Shortest prefix search (early termination on end marker)

The lookup function that finds the shortest matching root:

node = root
for i, ch in enumerate(word):
    if node.is_end:
        return word[:i]
    if ch not in node.children:
        return word
    node = node.children[ch]

The critical difference from a normal trie search is the early return. A standard search walks the entire word and then checks is_end. Here, you check is_end at every step. The first time you hit an end marker, you have found the shortest root, and you stop immediately.

This "shortest prefix match" pattern also shows up in IP routing tables (longest prefix match is the inverse) and autocomplete systems where you want the most specific or most general match.

From understanding to recall

You have read through the Replace Words solution. You see how the trie is built from root words and how each sentence word is walked through the trie to find its shortest matching prefix. The logic is simple: insert all roots, then for each word, walk the trie and stop at the first end marker.

But can you write the full solution from scratch in five minutes? The trie insertion loop, the find_root function with its early termination, the sentence splitting and rejoining? These are small details, but under interview pressure, small details are what trip you up.

Spaced repetition closes that gap. You practice writing the trie insertion and the shortest-prefix lookup from memory at increasing intervals. After a few reps, the pattern is automatic. You see a prefix matching problem and the code flows out without hesitation.

The trie insertion pattern and the early-termination prefix search are two of roughly 60 reusable building blocks that cover hundreds of LeetCode problems. Learning them individually and drilling them with spaced repetition is far more effective than grinding random problems and hoping the patterns stick.

Related posts

  • Implement Trie - The foundation. Build the trie data structure from scratch with insert, search, and startsWith.
  • Design Add and Search Words - Extends the basic trie with wildcard DFS. Same trie structure, different search logic.
  • Longest Common Prefix - Another prefix matching problem that can also be solved with a trie, comparing characters column by column.

CodeBricks breaks the Replace Words problem into its trie-insertion and shortest-prefix-search building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a prefix matching problem shows up in your interview, you do not think about it. You just write it.