Skip to content
← All posts

Implement Trie: The Prefix Tree

10 min read
leetcodeproblemmediumdesign

You are given a blank slate and asked to build a data structure that can insert words, search for exact words, and check whether any word starts with a given prefix. All three operations should run in O(m) time, where m is the length of the input string.

This is LeetCode 208: Implement Trie (Prefix Tree), and it is one of the most important design problems for interviews. The trie (pronounced "try") comes up directly in autocomplete systems, spell checkers, and IP routing tables. It also appears indirectly as the backbone of harder LeetCode problems like Word Search II.

abpapttdlerootabpapttdleend of wordshared prefix
A trie holding "app", "apple", "apt", "bat", and "bad". The green dots mark end-of-word nodes. Notice how "app" and "apple" share the prefix "app", and "bat" and "bad" share "ba".

Why this problem matters

A hash set can tell you whether a word exists in O(1) average time. But it cannot efficiently answer "are there any words that start with this prefix?" You would have to iterate over every word in the set and check each one. That is O(n * m) where n is the number of words.

A trie answers prefix queries in O(m) time regardless of how many words are stored. It does this by sharing common prefixes between words. The words "app", "apple", and "apt" all share the prefix "ap", so those characters are stored only once in the trie. Each character gets its own node, and children branch off from their parent character.

The implement trie LeetCode problem tests whether you can translate this idea into working code. The API is simple: insert(word), search(word), and startsWith(prefix). The challenge is getting the TrieNode structure right and understanding the difference between search and startsWith.

The TrieNode class

Every trie starts with a TrieNode. Each node has two things:

  1. A dictionary mapping characters to child nodes
  2. A boolean flag indicating whether this node marks the end of a complete word
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

That is the entire node. No value field, no parent pointer. The character itself is not stored in the node. Instead, it is represented by the key in the parent's children dictionary. This is a common point of confusion. The node at root.children["a"] represents the character "a", but the node itself does not store "a" anywhere.

Some implementations use a fixed-size array of 26 entries (one per lowercase letter) instead of a dictionary. That trades memory for slightly faster lookups. For interviews, the dictionary approach is cleaner and handles arbitrary character sets.

The key insight

All three operations (insert, search, startsWith) share the same core loop: walk down the trie one character at a time, following the path dictated by the input string. The only difference is what you do when the walk ends:

  • insert: create missing nodes along the way, then mark the final node as end-of-word
  • search: if you reach the end and the node is marked as end-of-word, return True
  • startsWith: if you reach the end at all (no missing nodes), return True

The difference between search and startsWith is just that one line: checking is_end. If you insert "apple" into the trie, search("app") returns False because the node for the second "p" is not marked as end-of-word. But startsWith("app") returns True because the path exists.

Visual walkthrough

Let's trace through inserting "cat", "car", and "card" into an empty trie. Watch how shared prefixes reuse existing nodes instead of creating duplicates.

Step 1: Insert "cat"

cat

Create three new nodes: c, a, t. Mark "t" as end of word.

Step 2: Insert "car"

catr

Walk the existing "c" and "a" nodes. Only "r" is new. The prefix "ca" is shared.

Step 3: Insert "card"

catrd

Walk "c", "a", "r" (all exist). Only "d" is new. "car" is still a valid word because its end marker stays.

The key observation: when we insert "car", we walk the existing "c" and "a" nodes without creating anything new. We only create the "r" node. When we insert "card", we walk "c", "a", and "r" (all exist), then create just "d". The prefix tree grows only where new characters are needed.

The full solution

class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_end = True

    def search(self, word: str) -> bool:
        node = self.root
        for ch in word:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return node.is_end

    def startsWith(self, prefix: str) -> bool:
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True

Let's break down each method.

insert(word): Start at the root. For each character in the word, check if the current node already has a child for that character. If not, create a new TrieNode and add it. Either way, move to the child node. After processing every character, mark the current node as the end of a word by setting is_end = True.

search(word): Start at the root. For each character, try to follow the path down through children. If any character is missing, the word was never inserted, so return False. If you successfully walk all characters, check whether the final node is marked as end-of-word. This is what distinguishes "app" from "apple" when only "apple" was inserted.

startsWith(prefix): Identical to search, except at the end you return True unconditionally. If the path exists, some word with that prefix was inserted. You do not care whether the final node is an end-of-word node.

Notice the duplication between search and startsWith. A common refactor is to extract a helper method that walks the trie and returns the final node (or None if the path does not exist). Then search checks node.is_end and startsWith checks node is not None. This is cleaner but not required for the interview.

Complexity analysis

Let m be the length of the word or prefix being processed.

OperationTimeSpace
insertO(m)O(m) worst case (all new nodes)
searchO(m)O(1)
startsWithO(m)O(1)

Time is O(m) for all three operations. Each one walks through at most m characters, doing O(1) work per character (dictionary lookup or insertion).

Space for the entire trie is O(n * m) in the worst case, where n is the number of words and m is the average word length. This happens when no words share any prefixes. In practice, natural language has massive prefix overlap, so the actual space is much less.

Edge cases

Before submitting, verify these:

  • Empty string insert and search: inserting "" should mark the root node as end-of-word. search("") should return True after that, and startsWith("") should always return True.
  • Prefix is a word: insert "apple", then search "app". Should return False unless "app" was also explicitly inserted. The is_end flag handles this.
  • Word is a prefix of another: insert "app" and "apple". Both search("app") and search("apple") should return True. Inserting "apple" walks through the "app" nodes but does not remove the end marker on "app".
  • Single character words: insert "a", search "a". Make sure your loop handles single iterations correctly.
  • Repeated inserts: inserting the same word twice should not break anything. The second insert just walks existing nodes and sets is_end = True again (which is already True).

The building blocks

This problem is built from two reusable pieces that CodeBricks drills independently:

1. Character-by-character tree traversal

The core loop that powers all three trie operations:

node = self.root
for ch in word:
    if ch not in node.children:
        # handle missing child (create or return False)
        ...
    node = node.children[ch]

This is the fundamental trie traversal pattern. You start at the root and walk one character at a time. Every trie problem uses exactly this loop. The only thing that changes is what you do when a child is missing and what you do after the loop ends.

This same traversal pattern also appears in Word Search II (LeetCode 212), where you walk a trie in parallel with a grid DFS. It shows up in autocomplete systems where you walk the trie to the prefix node, then collect all words below it. Once you can write this loop from memory, you have the skeleton for every trie problem.

2. End-of-word markers

The is_end boolean is what separates "the prefix exists" from "this exact word was inserted." Without it, you cannot distinguish between a word and a prefix of a longer word. This concept is simple but forgetting it is the most common bug in trie implementations.

The end marker pattern also shows up in:

  • Trie-based word dictionaries where you need to know if a path represents a complete word
  • Word Break (LeetCode 139) where you check if substrings are complete dictionary words
  • Replace Words (LeetCode 648) where you need the shortest prefix that is a complete word

The most common bug: using search logic for startsWith or vice versa. Always double-check whether you need to verify is_end. If the problem asks "does this exact word exist," check it. If it asks "does any word start with this," skip it.

Why a trie instead of a hash set?

For pure word lookup, a hash set is simpler and just as fast. The trie wins when you need prefix operations. Here is a quick comparison:

OperationHash SetTrie
Insert wordO(m)O(m)
Search exact wordO(m)O(m)
Check prefixO(n * m)O(m)
Autocomplete from prefixO(n * m)O(m + k)

In the table above, n is the number of stored words and k is the number of results. The trie turns a linear scan into a tree walk, which is the whole point.

The implement trie LeetCode problem specifically requires startsWith, which forces the trie approach. But even beyond this problem, the prefix tree data structure is worth knowing because it is the foundation for more advanced structures like suffix tries and radix trees.

Common mistakes

1. Forgetting is_end. If you skip the end-of-word marker, search returns True for any prefix of any inserted word. Insert "apple", search "app" returns True incorrectly.

2. Creating nodes during search. The insert method creates missing children. The search and startsWith methods must not. If you accidentally reuse the insert logic for search, you will create phantom nodes and corrupt the trie.

3. Mutating the root. Make sure you use a temporary variable (node) to walk the trie. If you accidentally write self.root = self.root.children[ch], you lose the entire trie above that point.

4. Using a list instead of a dictionary for children. A list of 26 entries works for lowercase English letters but breaks if the problem includes uppercase, digits, or other characters. The dictionary approach is safer for interviews.

When to use a trie

Look for these signals:

  • The problem involves prefix matching or prefix queries
  • You need to store a dictionary of words and do lookups by prefix
  • The problem mentions autocomplete, spell check, or word suggestions
  • You are combining word lookup with another traversal (like grid DFS in Word Search II)
  • You need to find the longest or shortest prefix that matches

Problems that use the trie pattern:

  • Word Search II (LeetCode 212): Trie plus grid backtracking to find all words from a dictionary in a grid. This is the natural next step after implementing a basic trie.
  • Replace Words (LeetCode 648): Walk each word through a trie to find its shortest dictionary root.
  • Design Add and Search Words (LeetCode 211): Trie with wildcard search using DFS.
  • Word Break (LeetCode 139): Can be solved with a trie to check dictionary membership during DP.

From understanding to recall

You have read through the trie implementation. You see how the TrieNode class works, how the three methods share the same traversal loop, and why is_end matters. But can you write the entire Trie class from scratch in five minutes under interview pressure?

That is the real test. The trie has subtle details that are easy to mix up: initializing children as a dictionary (not a list), creating nodes only during insert, checking is_end only during search. These are not hard concepts. They are recall challenges.

Spaced repetition closes that gap. You practice writing the TrieNode class and the insert/search/startsWith methods from memory at increasing intervals. After a few reps, the pattern is automatic. You see a prefix tree problem and the code flows out without hesitation.

The character-by-character traversal pattern and the end-of-word marker pattern are two of roughly 60 reusable building blocks that cover hundreds of LeetCode problems. Learning them individually and drilling them with spaced repetition is far more effective than grinding random problems and hoping the patterns stick.

Related posts

  • Word Search - Grid backtracking that pairs naturally with tries in the harder Word Search II variant
  • Trees: Binary Trees, BSTs, and Traversals - Understanding tree structures in general helps you see a trie as a specialized tree with branching factor up to 26
  • LRU Cache - Another medium-difficulty design problem that asks you to build a data structure from scratch

CodeBricks breaks the implement trie problem into its character-by-character traversal and end-of-word marker building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a prefix tree problem shows up in your interview, you do not think about it. You just write it.