Prefix and Suffix Search: Trie with Combined Keys
You are given a list of words and need to design a data structure that supports one operation: given a prefix and a suffix, find the word in the list that has that prefix and that suffix. If multiple words match, return the one with the largest index. This is LeetCode 745: Prefix and Suffix Search, and it is one of the trickiest trie problems because neither a prefix trie nor a suffix trie alone solves it. You need both at the same time.
Why this problem matters
Most trie problems ask you to match a prefix or search for an exact word. This problem asks you to match a prefix and a suffix simultaneously. That makes it fundamentally different from Implement Trie or Design Add and Search Words.
A naive approach would build a prefix trie, find all words matching the prefix, then filter by suffix. But that filtering step can be expensive. The real challenge is designing a single lookup that handles both constraints in one pass.
This problem is a great test of your ability to transform a hard constraint into a clever data structure trick. The combined key approach is elegant and reusable. You will see similar "encode two constraints into one key" ideas in database indexing, search engines, and cache key design.
The brute force approach
The simplest idea: for each query, scan every word and check if it starts with the given prefix and ends with the given suffix. Return the largest matching index.
class WordFilter:
def __init__(self, words):
self.words = words
def f(self, pref, suff):
for i in range(len(self.words) - 1, -1, -1):
if self.words[i].startswith(pref) and self.words[i].endswith(suff):
return i
return -1
This works, but each query takes O(n * m) time where n is the number of words and m is the maximum word length. If there are many queries, this adds up fast. We need a way to precompute the answers.
The combined key trie approach
Here is the key insight: instead of maintaining separate prefix and suffix structures, combine them into a single trie key. For each word, generate every possible suffix of that word, concatenate it with "#" and then the full word, and insert that combined key into a trie.
For example, if the word is "apple" at index 0, you insert these keys:
"e#apple"(suffix "e")"le#apple"(suffix "le")"ple#apple"(suffix "ple")"pple#apple"(suffix "pple")"apple#apple"(suffix "apple")
The "#" character acts as a separator that never appears in the words themselves. When you query with prefix "ap" and suffix "e", you look up "e#ap" as a prefix in the trie. Any key that starts with "e#ap" represents a word that ends with "e" and starts with "ap". That is exactly what we want.
The trick to making queries fast: at each node during insertion, store the largest word index that passed through that node. Then a query only needs to walk the trie along the combined key and read the index stored at the last node. No need to traverse the subtree.
The full solution
class TrieNode:
def __init__(self):
self.children = {}
self.weight = -1
class WordFilter:
def __init__(self, words):
self.root = TrieNode()
for idx, word in enumerate(words):
for i in range(len(word)):
key = word[i:] + "#" + word
node = self.root
node.weight = idx
for ch in key:
if ch not in node.children:
node.children[ch] = TrieNode()
node = node.children[ch]
node.weight = idx
def f(self, pref, suff):
node = self.root
key = suff + "#" + pref
for ch in key:
if ch not in node.children:
return -1
node = node.children[ch]
return node.weight
Let's break this down.
__init__: For each word at index idx, iterate over every suffix of that word. Concatenate suffix + "#" + word to form the combined key. Insert that key into the trie character by character. At every node along the path, update weight to idx. Because we process words in order from index 0 to n-1, later words naturally overwrite earlier ones. So weight always holds the largest index, which is exactly what the problem asks for.
f(pref, suff): Build the lookup key suff + "#" + pref. Walk the trie along this key. If any character is missing, no word matches both constraints, so return -1. If you reach the end, the weight at that node is the answer.
Visual walkthrough
Let's trace through how the combined key trie is built and queried for words ["apple", "ape"].
Step 1: For word "apple" (index 0), generate every suffix + "#" + word combination.
Each suffix of "apple" is paired with "#apple". This produces: "e#apple", "le#apple", "ple#apple", "pple#apple", "apple#apple". The suffix comes first, then the separator, then the full word.
Step 2: Insert each combined key into a trie. Store the word index (0) at every end node.
All five keys for "apple" share structure in the trie. "e#apple" and "le#apple" share nothing at the top level, but their "#apple" suffixes merge deeper. Each end node stores index 0.
Step 3: Process word "ape" (index 1). Generate: "e#ape", "pe#ape", "ape#ape". Insert all into the trie.
"e#ape" and "e#apple" share the prefix "e#a". After the "a" node they diverge: one continues "pple" and the other continues "pe". The "e#ape" end node stores index 1.
Step 4: Query f("ap", "e"). Build the lookup key: suffix + "#" + prefix = "e#ap".
To find words matching prefix "ap" and suffix "e", we concatenate: suffix + "#" + prefix = "e#ap". We will search the trie for any key that starts with "e#ap".
Step 5: Walk the trie along "e" -> "#" -> "a" -> "p". Every end node below stores a valid answer.
We walk "e#ap" and reach the "p" node. Below it are "apple" (index 0) and "ape" (index 1). The trick: during insertion we store the max index at every node along the path. So the "p" node already holds 1, and we return it immediately without traversing further.
The critical insight is in Step 5. By storing the maximum index at every node during insertion, the query never needs to explore the subtree below the matched prefix. It reads the answer directly from the node where the walk ends. This is what makes the query O(m) instead of potentially scanning many branches.
Complexity analysis
Let n be the number of words and m be the maximum word length.
| Operation | Time | Space |
|---|---|---|
__init__ | O(n * m^2) | O(n * m^2) |
f(pref, suff) | O(m) | O(1) |
Construction is O(n * m^2) because each word of length m generates m suffixes, and each combined key has length up to 2m + 1. Inserting one key is O(m), and there are m keys per word, so each word costs O(m^2). Across n words, that is O(n * m^2).
Space is O(n * m^2) for the trie. Each word contributes up to m keys, each of length up to 2m + 1. In the worst case with no shared prefixes, the trie stores all of these characters.
Query is O(m). The lookup key suff + "#" + pref has length at most 2m + 1. Walking the trie is one step per character.
The trade-off is clear: we spend extra time and space upfront during construction so that every query is fast. This is a good deal when the number of queries is large relative to the number of words.
The building blocks
This problem is built from two reusable pieces that CodeBricks drills independently:
1. Trie insertion and prefix lookup
The core trie pattern from Implement Trie:
node = self.root
for ch in key:
if ch not in node.children:
node.children[ch] = TrieNode()
node = node.children[ch]
This is the same character-by-character traversal that powers every trie problem. Insert walks the tree and creates nodes. Lookup walks the tree and returns -1 on a missing child. The combined key approach does not change this loop at all. It only changes what string you feed into it.
2. Encoding multiple constraints into one key
The clever part of this problem is the key transformation: suffix + "#" + word. This encodes both the suffix constraint and the prefix constraint into a single string that a standard trie can handle. The same encoding idea appears in many contexts:
- Composite database indexes: a compound index on (last_name, first_name) lets you query by last name alone or by both, similar to how the combined key lets you match suffix alone or suffix-plus-prefix.
- Cache keys: encoding multiple query parameters into a single cache key string.
- Rolling hash problems: combining two hashes into one value to match two patterns simultaneously.
The pattern is: when you need to satisfy two constraints at once, see if you can merge them into a single lookup key that a standard data structure can handle.
Edge cases
Before submitting, verify these:
- Prefix or suffix is the entire word:
f("apple", "apple")should work. The lookup key is"apple#apple", which is one of the keys we insert. - Single character words:
f("a", "a")for word"a". The combined key is"a#a", which is short but valid. - No match:
f("xyz", "e")should return -1 when no word starts with "xyz". - Multiple words match: when both "apple" and "ape" match, return the one with the larger index. The
weightfield handles this automatically because later insertions overwrite earlier ones. - Duplicate words: if the same word appears multiple times in the list, the largest index wins. Since we process words in order and always update
weight, duplicates are handled correctly. - Empty prefix or suffix: the problem guarantees prefix and suffix are non-empty, but the combined key approach would still work with empty strings since
"" + "#" + wordis just"#word".
From understanding to recall
You have read through the combined key trie approach. You see how generating suffix + "#" + word transforms a dual-constraint search into a single trie prefix lookup. You understand why storing the max index at every node makes queries O(m). But can you write the entire WordFilter class from scratch in five minutes?
That is the real test. The solution has subtle details: generating every suffix of every word, concatenating with the right separator, updating weight at every node along the insertion path (not just the end), and building the query key as suff + "#" + pref (not pref + "#" + suff). These are easy to mix up under pressure.
Spaced repetition closes that gap. You practice writing the key generation loop and the trie insertion from memory at increasing intervals. After a few reps, the pattern is automatic. You see a prefix-suffix constraint problem and the combined key trick is the first thing that comes to mind.
The trie insertion pattern and the constraint-encoding pattern are two of roughly 60 reusable building blocks that cover hundreds of LeetCode problems. Learning them individually and drilling them with spaced repetition is far more effective than grinding random problems and hoping the patterns stick.
Related posts
- Implement Trie - The foundation for this problem. Master the basic trie insert, search, and startsWith before attempting the combined key approach.
- Design Add and Search Words - Another trie design problem that extends the basic trie with wildcard DFS. Good practice for trie variations.
- Longest Common Prefix - A simpler prefix problem that helps build intuition for why tries are useful for prefix-based queries.
CodeBricks breaks the prefix and suffix search problem into its trie-insertion and constraint-encoding building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a trie design problem shows up in your interview, you do not think about it. You just write it.