Maximum Genetic Difference Query: Trie + DFS
LeetCode Maximum Genetic Difference Query (Problem 1938) asks you to answer XOR queries on tree paths. You are given a rooted tree and a list of queries. Each query says: "at this node, what is the maximum XOR between some value and any node on the path from the root to this node?" The trick is that the tree can have up to 10^5 nodes and 3 * 10^4 queries, so brute force is out of the question. You need a data structure that evolves as you traverse the tree.
The problem
You are given an array parents where parents[i] is the parent of node i. The root is the node where parents[i] == -1. You are also given a 2D array queries where queries[j] = [node_j, val_j]. For each query, find the maximum genetic difference, defined as val_j XOR x where x is any node label on the path from the root to node_j (inclusive). Return the answer for each query.
# Example
parents = [-1, 0, 0, 2]
queries = [[1, 2], [3, 5]]
# Tree:
# 0 (root)
# / \
# 1 2
# |
# 3
# Query (node=1, val=2): path = [0, 1]
# 2 XOR 0 = 2, 2 XOR 1 = 3 -> max = 3
# Query (node=3, val=5): path = [0, 2, 3]
# 5 XOR 0 = 5, 5 XOR 2 = 7, 5 XOR 3 = 6 -> max = 7
# Output: [3, 7]
Why brute force fails
The naive approach is simple: for each query (node, val), walk from the node up to the root, XOR val with every node on the path, and take the maximum. If the tree is a straight chain of n nodes, the path length is O(n). With q queries, the total work is O(n * q), which hits 3 * 10^9 for the given constraints. That is far too slow.
Even if you precompute the path for each node, you still need to XOR val with every node on that path. The problem is not finding the path. The problem is answering the XOR query efficiently over an arbitrary set of numbers that changes as you move through the tree.
The key insight: trie on the DFS path
The key observation combines two ideas. First, a binary trie can answer "what is the maximum XOR of this value with any number in the trie?" in O(L) time, where L is the number of bits (typically 18 or 20 for this problem's constraints). Second, a DFS traversal visits every root-to-node path exactly once. As you enter a node during DFS, the current call stack holds exactly the path from the root to that node.
The algorithm works like this:
- Build a bitwise trie that supports insert with reference counting and delete (decrement count).
- Run a DFS from the root. When you enter a node, insert that node's label into the trie.
- While at a node, answer all queries associated with that node by querying the trie for maximum XOR with the query's
val. - When the DFS backtracks out of a node, remove that node's label from the trie.
At any point during the DFS, the trie contains exactly the node labels on the current root-to-node path. That is precisely the set of numbers each query needs to search over. By grouping queries by their target node, you answer each query at exactly the right moment.
Step-by-step walkthrough
Here is the DFS in action on the small example above, with queries (node=1, val=2) and (node=3, val=5).
Step 1: DFS enters node 0. Insert 0 (000) into trie.
Trie now contains {0}. No queries at node 0, so continue DFS to children.
Step 2: DFS enters node 1. Insert 1 (001) into trie. Answer query (node=1, val=2).
Trie has {0, 1}. Path to node 1 is [0, 1]. Query max XOR of 2 with path nodes: 2^0=2, 2^1=3. The trie finds 3 greedily. Answer = 3.
Step 3: Backtrack from node 1 (remove 1 from trie). DFS enters node 2. Insert 2 (010).
Removing node 1's bits from the trie restores it to just {0}. Then inserting 2 (010) adds a new branch. Trie now has {0, 2}.
Step 4: DFS enters node 3. Insert 3 (011). Answer query (node=3, val=5).
Trie has {0, 2, 3}. The path to node 3 is [0, 2, 3]. For val=5 (101), the trie greedily picks node 2 (010), giving 5^2=7. Answer = 7.
Step 5: DFS backtracks. Remove 3 from trie, then remove 2, then remove 0. Done.
As DFS leaves each node, its bits are removed from the trie (decrement counters). All queries have been answered: query(1, 2)=3, query(3, 5)=7.
Notice two things. First, the trie never holds more than O(depth) entries at once. Second, each node is inserted and removed exactly once, so the total trie operations across the entire DFS are O(n * L).
The code
def max_genetic_difference(parents, queries):
n = len(parents)
BITS = 18
root = -1
children = [[] for _ in range(n)]
for i, p in enumerate(parents):
if p == -1:
root = i
else:
children[p].append(i)
query_map = [[] for _ in range(n)]
for j, (node, val) in enumerate(queries):
query_map[node].append((val, j))
ans = [0] * len(queries)
trie = {}
def insert(num):
node = trie
for bit in range(BITS, -1, -1):
b = (num >> bit) & 1
if b not in node:
node[b] = {"cnt": 0}
node[b]["cnt"] += 1
node = node[b]
def remove(num):
node = trie
for bit in range(BITS, -1, -1):
b = (num >> bit) & 1
node[b]["cnt"] -= 1
if node[b]["cnt"] == 0:
del node[b]
return
node = node[b]
def query(val):
node = trie
result = 0
for bit in range(BITS, -1, -1):
b = (val >> bit) & 1
opposite = 1 - b
if opposite in node:
result |= 1 << bit
node = node[opposite]
else:
node = node[b]
return result
stack = [(root, False)]
while stack:
u, leaving = stack.pop()
if leaving:
remove(u)
continue
insert(u)
stack.append((u, True))
for qval, qidx in query_map[u]:
ans[qidx] = query(qval)
for child in children[u]:
stack.append((child, False))
return ans
- Build the tree. Parse the
parentsarray into an adjacency list (children). Identify the root node. - Group queries by node. For each query
(node, val), store(val, original_index)inquery_map[node]. This lets you answer queries when the DFS reaches that node. - Trie with reference counts. Each trie node carries a
cntfield. Insert increments the count at each level. Remove decrements it and prunes subtrees that reach zero. This avoids leaving dead paths in the trie. - Iterative DFS with a "leaving" flag. Each stack entry is
(node, leaving). On the first visit (leaving=False), insert the node into the trie, push a cleanup entry(node, True), answer queries, and push children. When the cleanup entry is popped (leaving=True), remove the node from the trie. This simulates recursive DFS with backtracking. - Greedy trie query. For each query value, walk the trie from the MSB to LSB. At each level, prefer the opposite bit to maximize the XOR result. If the opposite bit does not exist, follow the same bit.
Complexity analysis
Let n be the number of nodes, q the number of queries, and L the number of bits (about 18 for node values up to 2 * 10^5).
| Approach | Time | Space |
|---|---|---|
| Brute force (walk path per query) | O(n * q) | O(n) |
| Trie + DFS | O((n + q) * L) | O(n * L) |
Each node is inserted into and removed from the trie once during the DFS, costing O(L) per operation, for a total of O(n * L). Each query performs one trie traversal costing O(L), for a total of O(q * L). The trie holds at most O(depth * L) nodes at any time, but across the entire run it allocates at most O(n * L) nodes total.
Building blocks
1. Bitwise trie for maximum XOR
This is the same pattern from Maximum XOR of Two Numbers in an Array. Build a trie where each level represents one bit position, MSB first. To find the maximum XOR with a given value, walk the trie greedily: at each level, try to go opposite to the current bit of the query value. Going opposite sets that bit to 1 in the XOR result, which is always better.
def query(trie, val, bits=18):
node = trie
result = 0
for bit in range(bits, -1, -1):
b = (val >> bit) & 1
opp = 1 - b
if opp in node:
result |= 1 << bit
node = node[opp]
else:
node = node[b]
return result
2. DFS with backtracking (online/offline on tree paths)
The critical addition here is that the set of numbers in the trie changes as you traverse the tree. When you enter a node, insert it. When you leave, remove it. This pattern shows up in any problem where you need a data structure that reflects the current root-to-node path during a DFS.
stack = [(root, False)]
while stack:
u, leaving = stack.pop()
if leaving:
remove_from_structure(u)
continue
add_to_structure(u)
stack.append((u, True))
process(u)
for child in children[u]:
stack.append((child, False))
The (node, True) cleanup entry on the stack guarantees that the removal happens after all descendants are processed, which is equivalent to the "after recursive call returns" step in a recursive DFS.
Edge cases
- Single node tree: the root is the only node. The path is just
[root], and every query answer isval XOR root. - Query at the root: the path contains only the root. The trie holds a single number.
- val equals a node on the path:
val XOR val = 0, which might not be the maximum. The trie still checks all path nodes, so the answer could come from a different node. - Deep linear chain (skewed tree): the trie holds up to n entries at the deepest node, but each insert and remove is still O(L). The stack depth is O(n), which is fine for iterative DFS.
- Multiple queries at the same node: all are answered together when the DFS visits that node, with the same trie state.
From understanding to recall
You now understand the two-part structure: a bitwise trie for XOR maximization and an iterative DFS that maintains the trie to match the current root-to-node path. The reference-counted insert/remove keeps the trie in sync with the DFS state, and grouping queries by node ensures each query is answered at exactly the right moment.
But understanding is not enough for interview performance. This problem combines three moving parts (trie operations, DFS traversal with backtracking, and query grouping) that need to work together correctly. Under time pressure, it is easy to forget the cleanup entry on the stack, mix up the bit direction, or lose track of the reference count pruning.
Spaced repetition builds that recall. You write the solution from scratch, revisit it a few days later, and again after a week. Each repetition reinforces the connection between "XOR queries on tree paths" and "DFS with a live trie." After a few rounds, the pattern is automatic.
Related posts
- Implement Trie - The foundational trie data structure. If you have not built a trie from scratch, start here.
- Maximum XOR of Two Numbers in an Array - The core bitwise trie XOR maximization technique that this problem builds on.
- Design Add and Search Words - Another trie problem that combines the data structure with DFS backtracking for wildcard matching.
CodeBricks breaks the maximum genetic difference query problem into its bitwise trie and DFS-with-backtracking building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a tree path XOR problem shows up in your interview, you do not think about it. You just write it.