Merge BSTs to Create Single BST: Grafting Trees Together
You are given a list of binary search trees. Each tree is small, and leaf nodes in one tree might have the same value as the root of another tree. Your job is to merge all the trees into a single valid BST by grafting trees at matching leaf-root pairs. If no valid merged BST exists, return None.
This is LeetCode 1932: Merge BSTs to Create Single BST. It is rated hard, but the core idea is clean once you see it: use a hash map to find where trees connect, graft them together, and then validate the result.
The approach: leaf-root grafting with BST validation
The key insight is that a leaf node with value v in one tree can be replaced by another tree whose root also has value v. This is the "graft point." Your algorithm needs to do three things: find these graft points efficiently, perform the grafting, and verify that the final tree is a valid BST.
Here is the plan:
- Build a root map. Create a dictionary mapping each tree's root value to the tree itself. This gives you O(1) lookup when you encounter a leaf whose value matches a root.
- Count how many times each root value appears as a leaf. The tree whose root value is never a leaf in any other tree is your starting tree. If there is not exactly one such tree, the answer is
None. - Traverse the starting tree. At each leaf node, check if its value is a key in the root map. If so, replace that leaf with the corresponding tree (graft it in). Remove the entry from the root map so it is not used twice.
- Validate. After all grafts, run a standard BST validation (inorder with range bounds) on the merged tree. Also confirm the root map is empty, meaning every tree was used.
The validation step is essential. Grafting trees together does not guarantee a valid BST. Two individually valid BSTs can produce an invalid result when merged if a grafted subtree has values outside the allowed range at that position.
The code
def can_merge(trees):
root_map = {}
for tree in trees:
root_map[tree.val] = tree
leaf_count = {}
for tree in trees:
count_leaves(tree, leaf_count)
main_root = None
for tree in trees:
if tree.val not in leaf_count:
if main_root is not None:
return None
main_root = tree
if main_root is None:
return None
root_map.pop(main_root.val)
graft(main_root, root_map)
if root_map:
return None
if not is_valid_bst(main_root, float('-inf'), float('inf')):
return None
return main_root
def count_leaves(node, leaf_count):
if node is None:
return
if node.left is None and node.right is None:
leaf_count[node.val] = leaf_count.get(node.val, 0) + 1
return
count_leaves(node.left, leaf_count)
count_leaves(node.right, leaf_count)
def graft(node, root_map):
if node is None:
return
if node.left and node.left.left is None and node.left.right is None:
if node.left.val in root_map:
node.left = root_map.pop(node.left.val)
if node.right and node.right.left is None and node.right.right is None:
if node.right.val in root_map:
node.right = root_map.pop(node.right.val)
graft(node.left, root_map)
graft(node.right, root_map)
def is_valid_bst(node, low, high):
if node is None:
return True
if node.val <= low or node.val >= high:
return False
return (is_valid_bst(node.left, low, node.val) and
is_valid_bst(node.right, node.val, high))
can_mergebuilds theroot_mapfrom root values to trees, then counts leaf values across all trees.- It identifies the main root: the one tree whose root value does not appear as a leaf anywhere. If zero or more than one such tree exists, return
None. - The main root is removed from
root_map, thengraftwalks the tree. At each leaf, if the leaf's value exists inroot_map, the leaf is replaced by the full subtree from that map entry. - After grafting, if
root_mapis not empty, some trees were never used, so returnNone. - Finally,
is_valid_bstdoes a standard range-check traversal. If any node violates its bounds, the merge is invalid.
Visual walkthrough
Here is the grafting process step by step, using three small BSTs. Watch how leaf nodes get replaced by entire subtrees, and how the final validation confirms the BST property.
Step 1: Build a map from root values to their trees.
Store each tree's root value in a hash map so we can quickly look up which tree to graft at a given leaf.
Step 2: Find the main tree whose root is not a leaf in another tree.
Root 3 does not appear as a leaf in any other tree, so Tree 1 is the starting tree. Roots 5 and 2 each appear as a leaf in Tree 1.
Step 3: Replace each matching leaf with the corresponding tree.
Leaf node 2 is replaced by Tree 3 (rooted at 2). Leaf node 5 is replaced by Tree 2 (rooted at 5, with children 4 and 7).
Step 4: Validate the merged tree is a valid BST.
Run an inorder traversal with range bounds. Every node falls within its allowed range, confirming this is a valid BST.
The walkthrough shows the four stages: building the lookup map, identifying the main tree, grafting subtrees at leaf-root matches, and running BST validation with range bounds on the result.
Complexity analysis
| Metric | Value |
|---|---|
| Time | O(n) |
| Space | O(n) |
Here n is the total number of nodes across all trees. Building the root map is O(t) where t is the number of trees. Counting leaves, grafting, and BST validation each visit every node once, giving O(n) total. The hash map and recursion stack together use O(n) space.
Building blocks
Leaf-root matching with a hash map
The core trick in this problem is using a hash map to connect separate data structures at matching values. You build a dictionary keyed by root values, then walk through another structure looking for matches. This "index one structure, scan another" pattern appears whenever you need to find connections between two collections.
- Accounts Merge (LeetCode 721): use a hash map to link emails to their owner, then union-find or DFS to merge connected accounts.
- Clone Graph (LeetCode 133): use a hash map from original nodes to cloned nodes, then walk the original graph and wire up the clones using the map.
BST validation with range bounds
After grafting, you need to confirm the result is a valid BST. The range-checking approach passes (low, high) bounds down the recursion. When you go left, the upper bound tightens to the current node's value. When you go right, the lower bound tightens. This "narrow the constraints as you recurse" technique is a fundamental BST tool.
- Validate Binary Search Tree (LeetCode 98): the direct application of range-bound validation on a single tree.
- Recover Binary Search Tree (LeetCode 99): use inorder traversal to find two nodes that violate BST order, relying on the same sorted-order property that range checking verifies.
Edge cases
Single tree in the input: if there is only one tree and it is already a valid BST, return it as-is. No grafting needed.
Duplicate root values across trees: if two trees share the same root value, the root map will only store one of them. The other tree can never be grafted in, so root_map will be non-empty at the end and the answer is None.
Grafting creates an invalid BST: two individually valid BSTs can merge into an invalid result. For example, grafting a tree [4, 3, 6] at a position where the valid range is (5, inf) produces an invalid node 4. The final validation catches this.
Cycle or unused trees: if not every tree gets consumed during grafting, some trees form a disconnected component. The check if root_map after grafting catches this and returns None.
From understanding to recall
You now see how this problem breaks down into a few clean steps: build a map, find the main tree, graft at matching leaves, and validate. Each of those steps uses a pattern you have probably seen before. The hash map lookup is the same indexing pattern from two-sum. The BST validation is the same range-checking pattern from Validate BST.
The challenge in an interview is not understanding the approach. It is remembering all the pieces and assembling them under pressure. You need to recall that you must count leaf values to find the main root, that you must remove entries from the map during grafting to avoid reuse, and that you must validate after merging. Missing any one of these details leads to a wrong answer.
Spaced repetition locks in these details. You practice writing the full solution from scratch at increasing intervals. Each repetition reinforces the sequence of steps and the edge cases. After a few rounds, you stop pausing to remember "do I check the map is empty?" because the check is automatic. That is the difference between understanding a problem once and being able to solve it reliably.
Related posts
- Validate Binary Search Tree - The range-bound validation technique used in the final step of this problem
- Binary Search Tree Iterator - Another problem that requires deep understanding of BST traversal order and structure