Minimum Time to Collect All Apples in a Tree
Given an undirected tree with n nodes rooted at node 0, where each node may or may not have an apple, find the minimum time in seconds you need to collect all apples and return to the root. You start at node 0, and each edge takes 1 second to traverse in either direction.
This is LeetCode 1443: Minimum Time to Collect All Apples in a Tree, and it is one of the cleanest problems for learning how post-order DFS on a tree can aggregate costs bottom-up. The trick is recognizing which edges are necessary and which can be skipped entirely.
The problem
You are given n (the number of nodes), an edges array describing the undirected tree, and a hasApple boolean array. Starting from root node 0, you need to visit every node that has an apple and return to the root. Each edge traversal costs 1 second, so walking down an edge and back costs 2 seconds total. You want to minimize the total time.
The approach
The key observation is that you only need to traverse an edge if the subtree below it contains at least one apple. If a child's entire subtree has no apples, you skip it completely. If it does have apples (either the child itself has one, or some descendant does), you must pay 2 seconds for that edge: 1 second going down, 1 second coming back.
This naturally leads to a post-order DFS. You process all children of a node first, collecting their costs. For each child, if the child's subtree cost is greater than 0 (meaning there are apples deeper down) or the child itself has an apple, you add the child's subtree cost plus 2 (for the edge to that child). Otherwise, you add nothing.
Think of it this way: each edge in the tree either contributes 2 seconds or 0 seconds to the answer. An edge contributes 2 seconds if and only if the subtree below it contains at least one apple. DFS tells you whether each subtree has apples by bubbling costs up from leaves to root.
Step-by-step walkthrough
Let's trace through the example: n = 7, edges = [[0,1],[0,2],[1,4],[1,5],[2,3],[2,6]], hasApple = [false, false, true, false, true, true, false].
Step 1: Start DFS at root (node 0)
Begin post-order DFS. We process children first, then decide whether to include the edge to each child.
Step 2: DFS into node 1, visit leaf 4 (has apple)
Node 4 is a leaf with an apple. DFS returns cost = 0 from node 4. Since the subtree has an apple, add 2 for the edge 1-4. Running cost at node 1: 2.
Step 3: Visit leaf 5 (has apple)
Node 5 also has an apple. Add 2 for the edge 1-5. Running cost at node 1: 2 + 2 = 4.
Step 4: Node 1 returns cost 4 to root
Node 1's subtree cost is 4 (greater than 0), so the subtree has apples. Root adds 2 for the edge 0-1. Running cost at root: 4 + 2 = 6.
Step 5: DFS into node 2, visit children 3 and 6
Node 3 has no apple and no children with apples. Node 6 also has no apple. Both return 0. Node 2 itself has an apple, so its subtree cost is 0 but hasApple[2] = true.
Step 6: Node 2 returns cost 0. Root adds 2 for edge 0-2. Answer: 8
Node 2 has an apple (or its subtree does), so root must traverse edge 0-2. Cost = 6 + 0 + 2 = 8. The answer is 8 seconds.
The code
from collections import defaultdict
def min_time(n: int, edges: list[list[int]], hasApple: list[bool]) -> int:
graph = defaultdict(list)
for u, v in edges:
graph[u].append(v)
graph[v].append(u)
def dfs(node: int, parent: int) -> int:
total = 0
for child in graph[node]:
if child == parent:
continue
child_cost = dfs(child, node)
if child_cost > 0 or hasApple[child]:
total += child_cost + 2
return total
return dfs(0, -1)
The code builds an adjacency list from the edge list, then runs DFS from the root. For each child of the current node (skipping the parent to avoid going back up), it recursively computes the cost of that child's subtree. If the child's cost is positive or the child has an apple, the current node adds child_cost + 2 to its own total. The + 2 accounts for traveling down to the child and back.
The parent parameter prevents the DFS from revisiting the node it came from. Since this is an undirected tree represented as an adjacency list, both directions of each edge appear in the graph. Without the parent check, the DFS would loop forever.
The base case is implicit: leaf nodes have no children (other than their parent), so the loop body never executes and they return 0. If a leaf has an apple, the parent detects it via hasApple[child] and adds 2.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| DFS | O(n) | O(n) |
Time is O(n) because each node is visited exactly once during the DFS traversal. Building the adjacency list takes O(n) as well, since a tree with n nodes has n - 1 edges.
Space is O(n) for the adjacency list and the recursion stack. In the worst case (a skewed tree), the recursion depth is n.
The building blocks
Post-order DFS for bottom-up aggregation
The core pattern here is post-order traversal: process children before the current node. This lets you aggregate information from the leaves up to the root. You see this pattern in many tree problems. In Diameter of Binary Tree, you compute depths bottom-up. In Binary Tree Maximum Path Sum, you compute max gains bottom-up. Here, you compute "does this subtree need visiting" bottom-up.
The template is always the same:
def dfs(node, parent):
result = base_value
for child in graph[node]:
if child == parent:
continue
child_result = dfs(child, node)
# combine child_result into result
return result
Conditional edge inclusion
The second building block is the decision at each edge: include it or skip it. You only include an edge if the subtree below contains something worth visiting. This "prune unnecessary branches" idea appears in many tree problems. For example, in Sum of Distances in Tree, you also reason about subtree properties to compute per-node answers. The general principle is that tree edges act as gates, and you open a gate only when the subtree beyond it has relevant nodes.
Edge cases
- No apples at all: every
hasApplevalue is false. DFS returns 0 from every subtree. The answer is 0. - Apple only at root:
hasApple[0]is true, but no other node has one. You are already at the root, so the answer is 0. You never need to traverse any edge. - Every node has an apple: you must traverse every edge in the tree, each costing 2 seconds. The answer is
2 * (n - 1). - Linear tree (skewed): the tree is a straight chain. If only the deepest node has an apple, the cost is
2 * depthbecause you walk all the way down and all the way back.
From understanding to recall
You have seen the pattern: post-order DFS, bubble up costs, add 2 for each edge whose subtree contains an apple. The logic is clean and the code is short. But in an interview, the details matter. Do you check child_cost > 0 or child_cost >= 0? Do you add 2 before or after the recursive call? Do you need a visited set or is a parent parameter enough?
These are not conceptual gaps. They are recall gaps. Spaced repetition is how you close them. You practice writing the DFS, the conditional edge inclusion, and the parent-tracking pattern from memory at increasing intervals. After a few rounds, you see "minimum cost to visit certain nodes in a tree" and the code flows out without hesitation.
Related posts
- Diameter of Binary Tree - Another post-order DFS problem where you aggregate depths bottom-up and track a global answer
- Sum of Distances in Tree - A harder tree problem that also reasons about subtree sizes and edge contributions
- Course Schedule - Graph traversal with DFS on an adjacency list, using the same "build graph, run DFS" template