Lowest Common Ancestor: Post-Order Tree Logic
You are given a binary tree and two nodes, p and q. Find their lowest common ancestor (LCA), which is the deepest node in the tree that has both p and q as descendants. A node is allowed to be a descendant of itself.
This is LeetCode 236: Lowest Common Ancestor of a Binary Tree, and it is one of those problems that seems tricky until you see the recursive solution. Then it clicks, and you wonder how it could possibly be so short.
Why this problem matters
The lowest common ancestor shows up everywhere. In version control, finding the common ancestor of two branches is literally this problem. In compilers, the LCA of two nodes in a syntax tree tells you their innermost shared scope. In biology, the LCA of two species in a phylogenetic tree is their most recent common ancestor.
On LeetCode, this problem is also a stepping stone. Once you understand how to find the LCA in a general binary tree, you can solve the BST variant (LeetCode 235) in your sleep, and you will have a deeper understanding of post-order traversal that helps with problems like diameter of binary tree, path sum, and balanced binary tree.
The key insight
Here is the question you need to ask: "How do I know if a node is the LCA?"
A node is the LCA of p and q if one of two things is true:
- The node is p or q, and the other target is somewhere in its subtree. For example, if the node is p and q is one of its descendants, then this node is the LCA.
- One target is in its left subtree and the other is in its right subtree. This means the node sits right at the "fork" where the paths to p and q diverge. That fork is the deepest node that has both as descendants.
The trick is that you do not need to know which case you are in ahead of time. The recursion figures it out for you. You search both subtrees, collect the results, and the answer reveals itself at the right node.
This is what makes post-order traversal so powerful for this problem. You go all the way down first, then make your decision on the way back up. By the time you are deciding what to return at any given node, you already have the complete picture from both subtrees.
Why post-order works
Most people learning tree recursion are familiar with pre-order (do something, then recurse) and maybe in-order. Post-order is the one that trips people up because the "work" happens after the recursive calls, not before.
But post-order is exactly what you need when the answer at a node depends on information from its children. You cannot decide if a node is the LCA until you know whether p and q were found below it. So you have to recurse first, get answers from both children, and then aggregate.
This "recurse first, decide later" pattern is called post-order aggregation. It shows up whenever you need to build an answer from the bottom of the tree upward. Maximum depth uses it (you need child depths before computing yours). Balanced binary tree uses it (you need child heights before checking the balance condition). And lowest common ancestor uses it too.
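To make the pattern concrete before applying it to LCA, here is maximum depth (mentioned above) written as post-order aggregation. This is a minimal sketch; the `TreeNode` class is an illustrative stand-in for whatever node type the problem gives you.

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def max_depth(node):
    if node is None:
        return 0                     # base value for the aggregation
    left = max_depth(node.left)      # recurse first...
    right = max_depth(node.right)
    return 1 + max(left, right)      # ...then aggregate on the way back up

# A three-node tree: one root with two children, so the depth is 2.
root = TreeNode(1, TreeNode(2), TreeNode(3))
print(max_depth(root))  # 2
```

The shape is identical to the LCA solution below: two recursive calls, then one decision that combines their results.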
The recursive solution
```python
def lowest_common_ancestor(root, p, q):
    if root is None:
        return None
    if root == p or root == q:
        return root
    left = lowest_common_ancestor(root.left, p, q)
    right = lowest_common_ancestor(root.right, p, q)
    if left and right:
        return root
    return left if left else right
```
That is the entire solution. Let's break down each piece:
- `if root is None: return None` is the base case. If we have gone past a leaf, there is nothing to find here.
- `if root == p or root == q: return root` is the early return. If we find one of our targets, we return it immediately. We do not need to search deeper because the problem guarantees both p and q exist in the tree.
- `left = lowest_common_ancestor(root.left, p, q)` searches the entire left subtree.
- `right = lowest_common_ancestor(root.right, p, q)` searches the entire right subtree.
- `if left and right: return root` is the key line. If both subtrees returned a non-null result, that means p is in one subtree and q is in the other. The current node is their lowest common ancestor.
- `return left if left else right` handles the case where only one subtree found something. We just pass that result up. Either both targets are in the same subtree (and the LCA was already found deeper down), or only one has been found so far.
The early return when we find p or q is what makes this work. We do not recurse past a target node. Why? Because if the current node is p, and q is somewhere below it, then this node is already the LCA. If q is NOT below it, q must be in a sibling subtree, and the LCA will be found higher up. Either way, returning immediately is correct.
Visual walkthrough
Let's trace the recursion through the tree [3, 5, 1, 6, 2, 0, 8, null, null, 7, 4] with p = 5 and q = 1. Watch how the recursion goes down first, and the LCA decision happens on the way back up.
Step 1: Call lca(3, p=5, q=1). Recurse left into node 5.
Root is not p or q, so we recurse into left and right subtrees. Start with the left.
Step 2: lca(5). Node 5 equals p. Return 5 immediately.
We found one of our targets. Return this node right away. No need to search deeper into 5's subtree.
Step 3: Back at node 3. Left returned 5. Now recurse right into node 1.
Left subtree found p. Now we check the right subtree for q.
Step 4: lca(1). Node 1 equals q. Return 1 immediately.
Found the other target. Return this node back up to the parent.
Step 5: Back at root. Left = 5, Right = 1. Both non-null. Node 3 is the LCA!
Both sides returned non-null. p is in one subtree, q is in the other. This node is their lowest common ancestor.
Notice the flow. We recurse left and find p (node 5) immediately. We recurse right and find q (node 1) immediately. Back at the root, both sides returned non-null, so the root (node 3) is the LCA.
Now think about the case where p = 5 and q = 4 (where q is a descendant of p). When we reach node 5, we match p and return immediately without going deeper. Node 4 is never visited. Back at the root, left returns 5 and right returns None. Since only one side has a result, we pass it up. The answer is 5, which is correct because 5 is an ancestor of 4 (and a node counts as its own descendant).
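Both traced scenarios are easy to check by hand. Here is a runnable version, assuming a minimal `TreeNode` stand-in for the class LeetCode provides, with the tree `[3, 5, 1, 6, 2, 0, 8, null, null, 7, 4]` built by hand.

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def lowest_common_ancestor(root, p, q):
    if root is None:
        return None
    if root == p or root == q:
        return root
    left = lowest_common_ancestor(root.left, p, q)
    right = lowest_common_ancestor(root.right, p, q)
    if left and right:
        return root
    return left if left else right

# The tree [3, 5, 1, 6, 2, 0, 8, null, null, 7, 4] from the walkthrough:
#         3
#       /   \
#      5     1
#     / \   / \
#    6   2 0   8
#       / \
#      7   4
n7, n4 = TreeNode(7), TreeNode(4)
n6, n2 = TreeNode(6), TreeNode(2, n7, n4)
n5 = TreeNode(5, n6, n2)
n0, n8 = TreeNode(0), TreeNode(8)
n1 = TreeNode(1, n0, n8)
root = TreeNode(3, n5, n1)

print(lowest_common_ancestor(root, n5, n1).val)  # 3: fork at the root
print(lowest_common_ancestor(root, n5, n4).val)  # 5: p is an ancestor of q
```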
Why the early return is safe
This is the part that confuses people the most. "If we return at node 5 without searching deeper, how do we know q = 4 is actually in the tree?"
The problem guarantees that both p and q exist in the tree. Given that guarantee, if we find p at some node and q is NOT in p's subtree, then q must be somewhere else in the tree. The LCA would be an ancestor of p, and that ancestor will eventually see a non-null return from p's side plus a non-null return from q's side. The right answer bubbles up no matter what.
If q IS in p's subtree, then p itself is the LCA. We returned p, and that is the correct answer.
Either way, the early return works. The guarantee that both nodes exist is doing the heavy lifting.
What if the problem did not guarantee that both nodes exist? Then you would need to search the entire tree (no early return) and check that both p and q were actually found. The solution would be more complex. The guarantee lets us write this elegant short-circuit version.
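One way to write that fuller variant, sketched here with hypothetical names (`lca_if_both_exist` is not part of the LeetCode problem): do a full post-order traversal, track whether each target was actually seen, and record the first node (which post-order visits deepest-first) whose subtree contains both.

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def lca_if_both_exist(root, p, q):
    ans = [None]  # first (deepest) node whose subtree holds both targets

    def contains(node):
        # Post-order: report (has_p, has_q) for the subtree at node.
        if node is None:
            return False, False
        lp, lq = contains(node.left)
        rp, rq = contains(node.right)
        has_p = lp or rp or node is p
        has_q = lq or rq or node is q
        if has_p and has_q and ans[0] is None:
            ans[0] = node  # post-order reaches the deepest candidate first
        return has_p, has_q

    contains(root)
    return ans[0]  # None if either target is missing from the tree

#   3
#  / \
# 5   1
n5, n1 = TreeNode(5), TreeNode(1)
root = TreeNode(3, n5, n1)

print(lca_if_both_exist(root, n5, n1).val)       # 3
print(lca_if_both_exist(root, n5, TreeNode(9)))  # None: q is not in the tree
```

Note that there is no early return here: every node is visited, which is the price of not having the existence guarantee.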
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Post-order DFS | O(n) | O(h) call stack |
Time is O(n) in the worst case. We might visit every node in the tree. In practice, we often short-circuit early (when we find p or q), but the worst case is when the two targets are deep leaves far apart.
Space is O(h) where h is the height of the tree, due to the recursive call stack. For a balanced tree, h = log n. For a completely skewed tree (basically a linked list), h = n.
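The O(h) cost is easy to observe in Python, where the call stack is bounded by the interpreter's recursion limit. A small sketch (pinning the limit explicitly so the behavior is reproducible) shows the recursive solution failing on a skewed tree far smaller than anything that would trouble a balanced one:

```python
import sys

sys.setrecursionlimit(1000)  # CPython's usual default, pinned for reproducibility

class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def lowest_common_ancestor(root, p, q):
    if root is None:
        return None
    if root == p or root == q:
        return root
    left = lowest_common_ancestor(root.left, p, q)
    right = lowest_common_ancestor(root.right, p, q)
    if left and right:
        return root
    return left if left else right

# Build a completely skewed (linked-list-shaped) tree of 5000 nodes.
root = TreeNode(0)
node = root
for i in range(1, 5000):
    node.right = TreeNode(i)
    node = node.right
deep = node  # the deepest node

try:
    lowest_common_ancestor(root, deep, deep)
except RecursionError:
    print("recursion depth exceeded")  # the O(h) stack bites when h = n
```

On a balanced tree of the same size, h would be around 13 and the recursion would be nowhere near the limit.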
There is no need for extra data structures. No hash maps, no parent pointers, no queues. Just pure recursion.
Edge cases to watch for
- p is an ancestor of q (or vice versa): The answer is the ancestor node itself. The early return handles this.
- p and q are the same node: The first match returns that node. The LCA of a node with itself is itself.
- p and q are siblings: The parent node gets non-null from both sides and returns itself.
- Tree has only two nodes: Root and one child. One of them is p, the other is q, and the root is the LCA.
- p and q are both deep leaves in opposite subtrees: The recursion goes all the way down both sides before finding the fork. This is the worst case for time.
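These edge cases are quick to verify. A sketch reusing the recursive solution from above on a tiny hand-built tree (the node names are illustrative):

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def lowest_common_ancestor(root, p, q):
    if root is None:
        return None
    if root == p or root == q:
        return root
    left = lowest_common_ancestor(root.left, p, q)
    right = lowest_common_ancestor(root.right, p, q)
    if left and right:
        return root
    return left if left else right

#       1
#      / \
#     2   3
#    /
#   4
n4 = TreeNode(4)
n2, n3 = TreeNode(2, n4), TreeNode(3)
root = TreeNode(1, n2, n3)

assert lowest_common_ancestor(root, n2, n4) is n2    # p is an ancestor of q
assert lowest_common_ancestor(root, n4, n4) is n4    # p and q are the same node
assert lowest_common_ancestor(root, n2, n3) is root  # siblings: parent wins
two = TreeNode(1, TreeNode(2))
assert lowest_common_ancestor(two, two, two.left) is two  # two-node tree
print("all edge cases pass")
```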
The building blocks
This problem is built on two fundamental building blocks:
1. Post-order aggregation (post-order-aggregation). The decision at each node depends on the results from both children. You recurse first, then aggregate. The skeleton looks like this:
```python
def solve(node):
    if node is None:
        return BASE_VALUE
    left = solve(node.left)
    right = solve(node.right)
    return AGGREGATE(node, left, right)
```
For lowest common ancestor, the aggregation logic is: if both sides returned non-null, this node is the answer. If only one side returned non-null, pass it up. This "aggregate from children" pattern is the core of post-order traversal.
2. Recursive null base case (recursive-null-base). The base case `if node is None: return None` terminates the recursion at leaves. This is the same pattern you see in maximum depth, invert binary tree, and every other recursive tree solution. The value returned (`None` here, `0` for depth, `True` for validation) changes per problem, but the structure is always the same.
The combination of these two bricks gives you a powerful template. Any problem where the answer at a node requires information from its subtrees uses post-order aggregation on top of the recursive null base case.
From understanding to recall
You just read a short recursive solution and traced through two different scenarios. You understand why the early return works and how post-order aggregation bubbles the LCA up to the right node. It all makes sense right now.
But when you see this problem in an interview next month, will you remember the exact structure? Will you remember that you return root when both sides are non-null? Will you remember the early return when you find p or q? Or will you second-guess yourself and try to build something more complicated?
Understanding is not recall. You understood the solution. Recall means you can write it cold, from memory, under time pressure.
Spaced repetition builds recall. You practice typing the solution from scratch at increasing intervals. First after a day, then three days, then a week. Each repetition strengthens the neural pathway until the post-order aggregation pattern and the null base case are automatic. You do not think about them. You just write them.
These two building blocks (post-order-aggregation and recursive-null-base) appear across dozens of tree problems. Drilling them individually is far more effective than grinding random LeetCode problems and hoping the patterns stick.
Related posts
- Maximum Depth of Binary Tree - Uses the same post-order aggregation pattern to compute depth from child results
- Invert Binary Tree - Another recursive tree solution built on the null base case pattern, but with pre-order processing instead of post-order
- Validate BST - A tree problem that passes constraints downward (pre-order) rather than aggregating upward (post-order), showing the contrast between the two approaches