Skip to content
← All posts

Prefix and Suffix Search: Combined Key Trie

6 min read
leetcodeproblemhardtriedesign

You are given a list of words and need to build a data structure that answers one question: given a prefix and a suffix, which word in the list matches both? If multiple words qualify, return the one with the largest index. This is LeetCode 745: Prefix and Suffix Search, and it is one of the hardest trie problems because a standard prefix trie only handles half the constraint. You need to match both ends of the word simultaneously.

e#appleep#approotesuffix#apprefixpleidx 0eidx 2psuffix#appidx 1# separatorend of keye#applee#apep#app
Combined key trie for words ["apple", "app", "ape"]. Each key is suffix + "#" + word. The highlighted path shows how suffix="e" and prefix="ap" share the path "e#ap", branching into "apple" (index 0) and "ape" (index 2).

Why this problem matters

Most trie problems ask you to match a prefix or look up an exact word. This problem raises the bar by requiring you to match a prefix and a suffix at the same time. That makes it fundamentally different from Implement Trie or Design Add and Search Words.

The naive approach would build a prefix trie, collect all words matching the prefix, then filter by suffix. But that filtering step can scan many words. The real challenge is designing a single lookup that handles both constraints in one pass.

This problem teaches a transferable skill: encoding multiple constraints into a single key so that a standard data structure can handle them. You will see the same idea in database composite indexes, cache key design, and search engine indexing.

The approach

The key insight is to combine the suffix and prefix into a single trie key. For each word, generate every possible suffix of that word, concatenate it with "#" as a separator, then append the full word. Insert each combined key into a single trie.

For example, the word "apple" at index 0 produces these keys:

  • "e#apple" (suffix "e")
  • "le#apple" (suffix "le")
  • "ple#apple" (suffix "ple")
  • "pple#apple" (suffix "pple")
  • "apple#apple" (suffix "apple")

The "#" character acts as a separator that never appears in the input words. When you query with prefix "ap" and suffix "e", you look up "e#ap" as a prefix in the trie. Any stored key that starts with "e#ap" represents a word ending with "e" and starting with "ap".

The final trick: at every node during insertion, store the largest word index that passed through it. Then a query only needs to walk the trie along the combined key and read the stored index at the final node. No subtree traversal needed.

Clean solution

class TrieNode:
    def __init__(self):
        self.children = {}
        self.weight = -1

class WordFilter:
    def __init__(self, words):
        self.root = TrieNode()
        for idx, word in enumerate(words):
            for i in range(len(word)):
                key = word[i:] + "#" + word
                node = self.root
                node.weight = idx
                for ch in key:
                    if ch not in node.children:
                        node.children[ch] = TrieNode()
                    node = node.children[ch]
                    node.weight = idx

    def f(self, pref, suff):
        node = self.root
        key = suff + "#" + pref
        for ch in key:
            if ch not in node.children:
                return -1
            node = node.children[ch]
        return node.weight

Step-by-step walkthrough

Step 1: For word "apple" (index 0), generate every suffix + "#" + word combination.

esuffix 1#applelesuffix 2#appleplesuffix 3#apple

Each suffix of "apple" is paired with "#apple". This produces: "e#apple", "le#apple", "ple#apple", "pple#apple", "apple#apple". The suffix comes first, then the separator, then the full word.

Step 2: Insert each combined key into a trie. Store the word index (0) at every node along each path.

elp###roote###lp

All five keys for "apple" share structure in the trie. "e#apple" and "le#apple" share nothing at the top level, but their "#apple" suffixes merge deeper. Each node stores weight = 0.

Step 3: Process word "ape" (index 1). Generate: "e#ape", "pe#ape", "ape#ape". Insert all into the trie.

e#aaroote#aapple pathaape path

"e#ape" and "e#apple" share the prefix "e#a". After the "a" node they diverge: one continues "pple" and the other continues "pe". The "e#ape" end node stores index 1, and shared nodes update to weight = 1.

Step 4: Query f("ap", "e"). Build the lookup key: suffix + "#" + prefix = "e#ap".

esuffix#separatoraprefix startpprefix end

To find words matching prefix "ap" and suffix "e", concatenate: suffix + "#" + prefix = "e#ap". Search the trie for any key starting with "e#ap".

Step 5: Walk the trie along "e" -> "#" -> "a" -> "p". Read the stored weight at the final node.

e#apperoote#apweight = 1pidx 0eidx 1

We walk "e#ap" and reach the "p" node. Below it are "apple" (index 0) and "ape" (index 1). During insertion, we stored the max index at every node along each path. The "p" node holds weight = 1, so we return 1 immediately.

The critical insight is in the final step. By storing the maximum index at every node during insertion, the query never needs to explore the subtree below the matched prefix. It reads the answer directly from the node where the walk ends. This is what makes each query O(m) instead of potentially scanning many branches.

Constructor: for each word at index idx, iterate over every suffix of that word. Concatenate suffix + "#" + word to form the combined key. Insert that key into the trie character by character. At every node along the path, update weight to idx. Because we process words from index 0 to n-1, later words naturally overwrite earlier ones, so weight always holds the largest index.

Query: build the lookup key as suff + "#" + pref. Walk the trie along this key. If any character is missing, return -1. Otherwise, the weight at the final node is the answer.

Pay attention to the key construction order. During insertion, the key is suffix + "#" + word (the full word). During query, the key is suffix + "#" + prefix (just the prefix). The query key is a prefix of the insertion key, which is exactly what makes trie prefix matching work.

Complexity analysis

ApproachTimeSpace
Brute force per queryO(n * m) per queryO(n * m)
Combined key trie, initO(n * m^2)O(n * m^2)
Combined key trie, queryO(m)O(1)

Here n is the number of words and m is the maximum word length.

Construction costs O(n * m^2) because each word of length m generates m suffixes, and each combined key has length up to 2m + 1. Inserting one key is O(m), and there are m keys per word, giving O(m^2) per word. Across n words, that totals O(n * m^2).

Space is O(n * m^2) for the trie nodes. Each word contributes up to m keys, each of length up to 2m + 1.

Query is O(m). The lookup key suff + "#" + pref has length at most 2m + 1, and walking the trie is one step per character.

The trade-off is clear: spend extra time and space during construction so that every query is fast. This is a good deal when the number of queries is large relative to the number of words.

The building blocks

This problem combines two reusable patterns:

Trie insertion and prefix lookup. The core loop from Implement Trie powers every trie problem:

node = self.root
for ch in key:
    if ch not in node.children:
        node.children[ch] = TrieNode()
    node = node.children[ch]

The combined key approach does not change this loop. It only changes what string you feed into it.

Encoding multiple constraints into one key. The clever part is the key transformation: suffix + "#" + word. This encodes both the suffix constraint and the prefix constraint into a single string that a standard trie can handle. The same encoding idea shows up in composite database indexes, compound cache keys, and rolling hash problems. The pattern: when you need to satisfy two constraints at once, see if you can merge them into a single lookup key.

Edge cases to watch for

  • Prefix or suffix equals the entire word: f("apple", "apple") should work. The lookup key "apple#apple" is one of the keys we insert.
  • Single character words: f("a", "a") for word "a". The combined key "a#a" is short but valid.
  • No match: f("xyz", "e") returns -1 when no word starts with "xyz".
  • Multiple words match: when both "apple" and "ape" match, return the one with the larger index. The weight field handles this automatically because later insertions overwrite earlier ones.
  • Duplicate words: if the same word appears multiple times, the largest index wins. Processing words in order and always updating weight handles duplicates correctly.
  • Empty prefix or suffix: the problem guarantees these are non-empty, but the approach would still work with empty strings since "" + "#" + word is just "#word".

From understanding to recall

You have walked through the combined key trie approach. You see how generating suffix + "#" + word transforms a dual-constraint search into a single trie prefix lookup. You understand why storing the max index at every node makes queries O(m). But can you write the entire WordFilter class from scratch in five minutes?

That is the real test. The solution has subtle details: generating every suffix of every word, concatenating with the right separator, updating weight at every node along the insertion path (not just the end), and building the query key as suff + "#" + pref (not pref + "#" + suff). These are easy to mix up under pressure.

Spaced repetition closes that gap. You practice writing the key generation loop and the trie insertion from memory at increasing intervals. After a few reps, the pattern becomes automatic. You see a prefix-suffix constraint problem and the combined key trick is the first thing that comes to mind.

Related posts