Skip to content
← All posts

Maximum Genetic Difference Query: Trie + DFS

8 min read
leetcodeproblemhardtriebit-manipulation

LeetCode Maximum Genetic Difference Query (Problem 1938) asks you to answer XOR queries on tree paths. You are given a rooted tree and a list of queries. Each query says: "at this node, what is the maximum XOR between some value and any node on the path from the root to this node?" The trick is that the tree can have up to 10^5 nodes and 3 * 10^4 queries, so brute force is out of the question. You need a data structure that evolves as you traverse the tree.

The problem

You are given an array parents where parents[i] is the parent of node i. The root is the node where parents[i] == -1. You are also given a 2D array queries where queries[j] = [node_j, val_j]. For each query, find the maximum genetic difference, defined as val_j XOR x where x is any node label on the path from the root to node_j (inclusive). Return the answer for each query.

# Example
parents = [-1, 0, 0, 2]
queries = [[1, 2], [3, 5]]

# Tree:
#       0 (root)
#      / \
#     1   2
#         |
#         3

# Query (node=1, val=2): path = [0, 1]
#   2 XOR 0 = 2, 2 XOR 1 = 3 -> max = 3

# Query (node=3, val=5): path = [0, 2, 3]
#   5 XOR 0 = 5, 5 XOR 2 = 7, 5 XOR 3 = 6 -> max = 7

# Output: [3, 7]
parents = [-1, 0, 0, 2]0root123query nodequery: node=3, val=5bit 2bit 1bit 05^0101= 55^2111= 75^3110= 6max genetic difference = 7(5 XOR 2, from node 2 on the path)
For query (node=3, val=5), the path from root to node 3 is [0, 2, 3]. XOR val with each node on the path and take the maximum.

Why brute force fails

The naive approach is simple: for each query (node, val), walk from the node up to the root, XOR val with every node on the path, and take the maximum. If the tree is a straight chain of n nodes, the path length is O(n). With q queries, the total work is O(n * q), which hits 3 * 10^9 for the given constraints. That is far too slow.

Even if you precompute the path for each node, you still need to XOR val with every node on that path. The problem is not finding the path. The problem is answering the XOR query efficiently over an arbitrary set of numbers that changes as you move through the tree.

The key insight: trie on the DFS path

The key observation combines two ideas. First, a binary trie can answer "what is the maximum XOR of this value with any number in the trie?" in O(L) time, where L is the number of bits (typically 18 or 20 for this problem's constraints). Second, a DFS traversal visits every root-to-node path exactly once. As you enter a node during DFS, the current call stack holds exactly the path from the root to that node.

The algorithm works like this:

  1. Build a bitwise trie that supports insert with reference counting and delete (decrement count).
  2. Run a DFS from the root. When you enter a node, insert that node's label into the trie.
  3. While at a node, answer all queries associated with that node by querying the trie for maximum XOR with the query's val.
  4. When the DFS backtracks out of a node, remove that node's label from the trie.

At any point during the DFS, the trie contains exactly the node labels on the current root-to-node path. That is precisely the set of numbers each query needs to search over. By grouping queries by their target node, you answer each query at exactly the right moment.

Step-by-step walkthrough

Here is the DFS in action on the small example above, with queries (node=1, val=2) and (node=3, val=5).

Step 1: DFS enters node 0. Insert 0 (000) into trie.

0visiting123bitwise trie (3-bit)*000000= 0

Trie now contains {0}. No queries at node 0, so continue DFS to children.

Step 2: DFS enters node 1. Insert 1 (001) into trie. Answer query (node=1, val=2).

01query: val=223trie contains: 0 (000), 1 (001)*000000= 011= 1val = 2 (010)greedy XORpicks node 12 ^ 1 = 3

Trie has {0, 1}. Path to node 1 is [0, 1]. Query max XOR of 2 with path nodes: 2^0=2, 2^1=3. The trie finds 3 greedily. Answer = 3.

Step 3: Backtrack from node 1 (remove 1 from trie). DFS enters node 2. Insert 2 (010).

012visiting3trie contains: 0 (000), 2 (010)*000000= 01100= 2node 1 removednode 2 insertedno queries here

Removing node 1's bits from the trie restores it to just {0}. Then inserting 2 (010) adds a new branch. Trie now has {0, 2}.

Step 4: DFS enters node 3. Insert 3 (011). Answer query (node=3, val=5).

0123query: val=5trie: 0 (000), 2 (010), 3 (011)*00001100= 000= 211= 3val = 5 (101)bit 2: val=1, go 0bit 1: val=0, go 1bit 0: val=1, go 0picks 010 = 25 ^ 2 = 7

Trie has {0, 2, 3}. The path to node 3 is [0, 2, 3]. For val=5 (101), the trie greedily picks node 2 (010), giving 5^2=7. Answer = 7.

Step 5: DFS backtracks. Remove 3 from trie, then remove 2, then remove 0. Done.

01ans=323ans=7Results:query(1, val=2)= 3query(3, val=5)= 7Each node was inserted on DFS enterand removed on DFS backtrack.Trie always holds exactly the root-to-current path.

As DFS leaves each node, its bits are removed from the trie (decrement counters). All queries have been answered: query(1, 2)=3, query(3, 5)=7.

Notice two things. First, the trie never holds more than O(depth) entries at once. Second, each node is inserted and removed exactly once, so the total trie operations across the entire DFS are O(n * L).

The code

def max_genetic_difference(parents, queries):
    n = len(parents)
    BITS = 18
    root = -1
    children = [[] for _ in range(n)]

    for i, p in enumerate(parents):
        if p == -1:
            root = i
        else:
            children[p].append(i)

    query_map = [[] for _ in range(n)]
    for j, (node, val) in enumerate(queries):
        query_map[node].append((val, j))

    ans = [0] * len(queries)
    trie = {}

    def insert(num):
        node = trie
        for bit in range(BITS, -1, -1):
            b = (num >> bit) & 1
            if b not in node:
                node[b] = {"cnt": 0}
            node[b]["cnt"] += 1
            node = node[b]

    def remove(num):
        node = trie
        for bit in range(BITS, -1, -1):
            b = (num >> bit) & 1
            node[b]["cnt"] -= 1
            if node[b]["cnt"] == 0:
                del node[b]
                return
            node = node[b]

    def query(val):
        node = trie
        result = 0
        for bit in range(BITS, -1, -1):
            b = (val >> bit) & 1
            opposite = 1 - b
            if opposite in node:
                result |= 1 << bit
                node = node[opposite]
            else:
                node = node[b]
        return result

    stack = [(root, False)]
    while stack:
        u, leaving = stack.pop()
        if leaving:
            remove(u)
            continue
        insert(u)
        stack.append((u, True))
        for qval, qidx in query_map[u]:
            ans[qidx] = query(qval)
        for child in children[u]:
            stack.append((child, False))

    return ans
  1. Build the tree. Parse the parents array into an adjacency list (children). Identify the root node.
  2. Group queries by node. For each query (node, val), store (val, original_index) in query_map[node]. This lets you answer queries when the DFS reaches that node.
  3. Trie with reference counts. Each trie node carries a cnt field. Insert increments the count at each level. Remove decrements it and prunes subtrees that reach zero. This avoids leaving dead paths in the trie.
  4. Iterative DFS with a "leaving" flag. Each stack entry is (node, leaving). On the first visit (leaving=False), insert the node into the trie, push a cleanup entry (node, True), answer queries, and push children. When the cleanup entry is popped (leaving=True), remove the node from the trie. This simulates recursive DFS with backtracking.
  5. Greedy trie query. For each query value, walk the trie from the MSB to LSB. At each level, prefer the opposite bit to maximize the XOR result. If the opposite bit does not exist, follow the same bit.

Complexity analysis

Let n be the number of nodes, q the number of queries, and L the number of bits (about 18 for node values up to 2 * 10^5).

ApproachTimeSpace
Brute force (walk path per query)O(n * q)O(n)
Trie + DFSO((n + q) * L)O(n * L)

Each node is inserted into and removed from the trie once during the DFS, costing O(L) per operation, for a total of O(n * L). Each query performs one trie traversal costing O(L), for a total of O(q * L). The trie holds at most O(depth * L) nodes at any time, but across the entire run it allocates at most O(n * L) nodes total.

Building blocks

1. Bitwise trie for maximum XOR

This is the same pattern from Maximum XOR of Two Numbers in an Array. Build a trie where each level represents one bit position, MSB first. To find the maximum XOR with a given value, walk the trie greedily: at each level, try to go opposite to the current bit of the query value. Going opposite sets that bit to 1 in the XOR result, which is always better.

def query(trie, val, bits=18):
    node = trie
    result = 0
    for bit in range(bits, -1, -1):
        b = (val >> bit) & 1
        opp = 1 - b
        if opp in node:
            result |= 1 << bit
            node = node[opp]
        else:
            node = node[b]
    return result

2. DFS with backtracking (online/offline on tree paths)

The critical addition here is that the set of numbers in the trie changes as you traverse the tree. When you enter a node, insert it. When you leave, remove it. This pattern shows up in any problem where you need a data structure that reflects the current root-to-node path during a DFS.

stack = [(root, False)]
while stack:
    u, leaving = stack.pop()
    if leaving:
        remove_from_structure(u)
        continue
    add_to_structure(u)
    stack.append((u, True))
    process(u)
    for child in children[u]:
        stack.append((child, False))

The (node, True) cleanup entry on the stack guarantees that the removal happens after all descendants are processed, which is equivalent to the "after recursive call returns" step in a recursive DFS.

Edge cases

  • Single node tree: the root is the only node. The path is just [root], and every query answer is val XOR root.
  • Query at the root: the path contains only the root. The trie holds a single number.
  • val equals a node on the path: val XOR val = 0, which might not be the maximum. The trie still checks all path nodes, so the answer could come from a different node.
  • Deep linear chain (skewed tree): the trie holds up to n entries at the deepest node, but each insert and remove is still O(L). The stack depth is O(n), which is fine for iterative DFS.
  • Multiple queries at the same node: all are answered together when the DFS visits that node, with the same trie state.

From understanding to recall

You now understand the two-part structure: a bitwise trie for XOR maximization and an iterative DFS that maintains the trie to match the current root-to-node path. The reference-counted insert/remove keeps the trie in sync with the DFS state, and grouping queries by node ensures each query is answered at exactly the right moment.

But understanding is not enough for interview performance. This problem combines three moving parts (trie operations, DFS traversal with backtracking, and query grouping) that need to work together correctly. Under time pressure, it is easy to forget the cleanup entry on the stack, mix up the bit direction, or lose track of the reference count pruning.

Spaced repetition builds that recall. You write the solution from scratch, revisit it a few days later, and again after a week. Each repetition reinforces the connection between "XOR queries on tree paths" and "DFS with a live trie." After a few rounds, the pattern is automatic.

Related posts

CodeBricks breaks the maximum genetic difference query problem into its bitwise trie and DFS-with-backtracking building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a tree path XOR problem shows up in your interview, you do not think about it. You just write it.