Stream of Characters: Reverse Trie for Suffix Matching
You are given a list of words and a stream of characters arriving one at a time. After each new character, you need to answer: does any suffix of the characters queried so far match one of the words? For example, if the words are ["cd", "f", "kl"] and the stream so far is ['a', 'b', 'c', 'd'], the answer is True because the suffix "cd" matches a word.
This is LeetCode 1032: Stream of Characters, and it is one of the cleanest examples of a problem where reversing the trie insertion order unlocks an elegant solution.
Why this problem matters
This problem sits at the intersection of two important patterns: trie-based string matching and streaming data processing. In real systems, you often need to match patterns against a growing stream of input, whether that is detecting keywords in a log stream, matching suffixes in network packet inspection, or triggering actions when specific sequences appear in user input.
The naive approach of checking every word against every suffix of the stream on each query is far too slow. The trie gives you a way to check all words simultaneously in a single pass, and the reverse insertion trick makes suffix matching as natural as prefix matching.
If you have solved Implement Trie and Design Add and Search Words, this problem adds one more twist: instead of searching from the beginning of a fixed string, you search from the end of a growing buffer, walking backwards.
The key insight
A standard trie is built for prefix matching. You insert words forward and search from the start. But this problem asks about suffixes, not prefixes. The suffix "cd" at the end of the stream ['a', 'b', 'c', 'd'] is what you need to detect.
The trick: insert every word into the trie in reverse order. The word "cd" becomes "dc" in the trie. Then, when a new character arrives, you append it to a buffer and walk the trie starting from the most recent character, moving backwards through the buffer.
Why does this work? Walking backwards through the buffer is equivalent to reading the suffix from right to left. And the reversed trie stores words from right to left. So a backwards walk through the buffer traces a forward path through the reversed trie. If that path hits an end-of-word node, some suffix of the stream matches a word.
You only need to walk at most M characters backwards, where M is the length of the longest word. That keeps each query bounded.
The solution
class TrieNode:
def __init__(self):
self.children = {}
self.is_end = False
class StreamChecker:
def __init__(self, words):
self.root = TrieNode()
self.buffer = []
self.max_len = 0
for word in words:
self.max_len = max(self.max_len, len(word))
node = self.root
for ch in reversed(word):
if ch not in node.children:
node.children[ch] = TrieNode()
node = node.children[ch]
node.is_end = True
def query(self, letter):
self.buffer.append(letter)
node = self.root
for i in range(len(self.buffer) - 1, max(len(self.buffer) - self.max_len - 1, -1), -1):
ch = self.buffer[i]
if ch not in node.children:
return False
node = node.children[ch]
if node.is_end:
return True
return False
The constructor inserts each word in reverse. The word "cd" is inserted as "d" then "c", so the trie path from root goes root -> d -> c (end). We also track max_len, the length of the longest word, so we know how far back to search.
The query method appends the new character to the buffer, then walks backwards from the last character. At each step, it checks if the current buffer character exists as a child in the trie. If it reaches an end-of-word node, a suffix matches and it returns True. If it falls off the trie (no matching child), no suffix starting at this position or earlier can match, so it returns False.
The reverse trie approach converts suffix matching into prefix matching. A standard trie answers "does this string start with any word?" The reversed trie answers "does this string end with any word?" This reversal trick applies to any problem where you need suffix-based lookups over a dictionary.
Visual walkthrough
Step 1: query('a') -- buffer = [a]. Check "a" at root.
Root has no child "a". No suffix matches. Return False.
Step 2: query('b') -- buffer = [a, b]. Check "b" at root.
Root has no child "b". No suffix matches. Return False.
Step 3: query('c') -- buffer = [a, b, c]. Check "c" at root.
Root has no child "c". No suffix matches. Return False.
Step 4: query('d') -- buffer = [a, b, c, d]. Check "d" then "c" in reverse trie.
Start at buffer end: "d" matches root child. Move back: "c" matches and is end-of-word. The suffix "cd" is a word. Return True.
Step 5: query('f') -- buffer = [a, b, c, d, f]. Check "f" at root.
Start at buffer end: "f" matches root child and is end-of-word. The suffix "f" is a word. Return True.
The critical moment is Step 4. When 'd' arrives, the query walks backwards: 'd' matches the root's child, then 'c' matches and is an end-of-word node. The suffix "cd" was found. Notice that the earlier characters 'a' and 'b' are never checked because the match was found after just two steps.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Reverse trie | O(M) per query, O(N) build | O(N) |
Build time is O(N) where N is the total number of characters across all words. Each character is inserted into the trie exactly once.
Query time is O(M) where M is the length of the longest word. Each query walks at most M characters backwards through the buffer. In practice, many queries terminate early when the trie has no matching child.
Space is O(N) for the trie itself. The buffer grows with each query, but you can cap it at length M since older characters can never participate in a match (no word is longer than M). That optimization keeps buffer space at O(M).
The building blocks
1. Trie insertion (reversed)
The insertion loop is almost identical to a standard trie insert, except you iterate over the word in reverse:
for word in words:
node = self.root
for ch in reversed(word):
if ch not in node.children:
node.children[ch] = TrieNode()
node = node.children[ch]
node.is_end = True
This is the same character-by-character traversal from Implement Trie, just with reversed(word) instead of word. The trie structure, the TrieNode class, and the end-of-word marker are all unchanged. If you can write a standard trie insert, you can write this one.
2. Suffix matching via buffer traversal
The query method walks the buffer backwards while walking the trie forwards:
node = self.root
for i in range(len(self.buffer) - 1, max(len(self.buffer) - self.max_len - 1, -1), -1):
ch = self.buffer[i]
if ch not in node.children:
return False
node = node.children[ch]
if node.is_end:
return True
return False
This is the matching counterpart to the reversed insert. You start at the end of the buffer (the most recent character) and walk backwards. Each character you read corresponds to one step deeper into the reversed trie. If you reach an end-of-word node, a complete word has been matched as a suffix of the stream. If the trie has no child for the current character, no word can match, so you stop early.
Edge cases
- Single character words: If
"f"is a word, querying'f'should immediately returnTrue. The reversed insertion of a single character is just that character. - Overlapping suffixes: If words are
["cd", "bcd"]and the stream is['b', 'c', 'd'], the query for'd'should returnTrue. The walk finds "cd" first (at depth 2) before needing to check "bcd". The short-circuit onis_endhandles this. - No match at all: If no suffix ever matches, every query returns
False. The trie walk fails at the root because the most recent character has no matching child. - Buffer longer than longest word: Only the last
Mcharacters matter. Characters further back cannot be part of any match. - Duplicate words: Inserting the same word twice is harmless. The second insert walks existing nodes and sets
is_end = Trueagain. - All queries match: If one of the words is a single character that appears in every query, every call returns
Trueafter just one step.
From understanding to recall
You have seen how the reverse trie transforms suffix matching into a trie walk. The core ideas are simple: insert words backwards, keep a buffer, and walk the trie from the most recent character backwards on each query. But can you write the entire StreamChecker class from scratch, including the reversed insertion and the backwards buffer walk, without looking at the code?
That is the gap between understanding and fluency. The reversed range call, the max_len optimization, the early termination on is_end, these are small details that trip you up under time pressure if you have not drilled them.
Spaced repetition bridges that gap. You practice writing the reversed trie insert and the backward buffer walk from memory at increasing intervals. After a few rounds, the pattern becomes automatic. You see a streaming suffix match problem and the code flows out without hesitation.
Related posts
- Implement Trie - The foundation for all trie problems. If you have not built a basic trie from scratch, start here before tackling the reverse trie.
- Design Add and Search Words - Another trie design problem that adds DFS branching for wildcard characters. Same trie skeleton, different search twist.
- Word Break - Uses dictionary lookups during dynamic programming. A trie can replace the hash set for these lookups, connecting trie traversal with DP.
CodeBricks breaks the stream of characters problem into its reversed-trie-insert and backward-buffer-walk building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a streaming trie problem shows up in your interview, you do not think about it. You just write it.