Maximum Length of Pair Chain: Greedy Interval Selection
LeetCode 646: Maximum Length of Pair Chain gives you an array of pairs where each pair [a, b] satisfies a < b. You can build a chain of pairs where for consecutive pairs [c, d] and [e, f], d < e (the second element of one pair is strictly less than the first element of the next). Your goal is to find the longest such chain. This is a classic interval scheduling problem disguised as a pair selection puzzle, and the greedy solution handles it cleanly.
Why this problem matters
At first glance, this looks like it might need dynamic programming. You are selecting a subset of items that satisfy an ordering constraint, which sounds like the Longest Increasing Subsequence pattern. And indeed, a DP solution works here in O(n^2) time. But the greedy approach does it in O(n log n), and understanding why reveals something deeper about interval problems.
The key observation is that each pair [a, b] is really an interval on the number line. The chain condition (d < e) means consecutive intervals cannot overlap. So "find the longest chain" is the same as "find the maximum number of non-overlapping intervals." That is exactly the activity selection problem, which has a well-known greedy solution: sort by end value and greedily pick pairs that do not conflict.
If you have already solved Non-overlapping Intervals or Minimum Number of Arrows to Burst Balloons, you will recognize the same core technique here. The only difference is what you are counting: those problems count removals or arrows, while this one counts the maximum chain length directly.
The approach
Sort the pairs by their second element. Then scan left to right, maintaining the end value of the last pair you added to the chain. For each new pair, if its first element is strictly greater than the current end, add it to the chain and update the end. Otherwise, skip it.
Why sort by the second element? Because the greedy strategy is "always pick the pair that finishes earliest." A pair that ends at a smaller value leaves more room for future pairs to fit into the chain. Sorting by the second element ensures we always encounter the best candidate first.
def findLongestChain(pairs):
pairs.sort(key=lambda x: x[1])
chain_len = 0
end = float('-inf')
for a, b in pairs:
if a > end:
chain_len += 1
end = b
return chain_len
Notice the condition is a > end (strictly greater than), not a >= end. The problem requires d < e for consecutive chain pairs, so the first element of the next pair must be strictly larger than the second element of the previous pair.
Step-by-step walkthrough
Let's trace through the algorithm on [[1,2],[7,8],[4,5],[2,3]]. After sorting by the second element, we get [[1,2],[2,3],[4,5],[7,8]]. Now we scan left to right, picking pairs whose first element is strictly greater than the end of the last selected pair.
Step 0: Sort pairs by second element. Sorted: [[1,2],[2,3],[4,5],[7,8]]
Sorting by the second element of each pair ensures we always consider the pair that "ends" earliest first.
Step 1: Pick [1,2]. Set end = 2. Chain length = 1.
The first pair always goes into the chain. We track 2 as the end boundary.
Step 2: Check [2,3]. Its first element (2) is not strictly greater than end (2). Skip it.
For pair [a,b] to follow pair [c,d] in the chain, we need d < a (strictly less than). Here 2 is not greater than 2, so this pair cannot extend the chain.
Step 3: Check [4,5]. Its first element (4) is strictly greater than end (2). Pick it. Set end = 5. Chain length = 2.
4 > 2, so [4,5] can follow [1,2] in the chain. Update end to 5.
Step 4: Check [7,8]. Its first element (7) is strictly greater than end (5). Pick it. Set end = 8. Chain length = 3.
7 > 5, so [7,8] extends the chain. Final chain: [1,2] then [4,5] then [7,8]. Answer: 3.
The algorithm builds a chain of length 3: [1,2], [4,5], [7,8]. The pair [2,3] gets skipped because its first element (2) is not strictly greater than the end boundary (2). This is the longest possible chain for this input.
Notice how the end boundary acts as a gatekeeper. Any pair whose first element does not clear the boundary cannot follow the last selected pair. Since we sorted by the second element, the boundary is always as small as possible, giving future pairs the best chance of fitting into the chain.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| DP (sort + pairwise comparison) | O(n^2) | O(n) |
| Sort by end + greedy scan | O(n log n) | O(1) |
The sort takes O(n log n). The greedy scan is O(n). Space is O(1) beyond the input (or O(n) if the sort is not in-place). Python's Timsort uses O(n) auxiliary space, but the algorithm itself only needs a counter and a single variable for the end boundary.
Edge cases to watch for
- Single pair. The chain length is always 1. There is nothing to compare against.
- All pairs overlap. Something like
[[1,5],[2,6],[3,7]]sorted by end gives[[1,5],[2,6],[3,7]]. Only the first pair gets selected because every subsequent pair starts before the current end. Answer: 1. - All pairs chainable. If every pair's first element is strictly greater than the previous pair's second element (after sorting), the entire array forms a valid chain. Answer: n.
- Pairs with the same second element.
[[1,3],[2,3],[4,5]]sorted by end gives[[1,3],[2,3],[4,5]](or[[2,3],[1,3],[4,5]]). Either way, the greedy picks one of the pairs ending at 3, then picks[4,5]. The tie-breaking order does not matter because only one pair ending at 3 can be in the chain. - Negative values. The algorithm handles negative numbers without modification. Initializing
endto negative infinity covers this. - Strict inequality matters.
[1,2]followed by[2,3]is not valid because2 < 2is false. You need the next pair's first element to be strictly greater, not equal.
The building blocks
Greedy interval scheduling
The core pattern is activity selection: sort by end value, then greedily pick items that do not conflict with the last selected item. This pattern appears whenever you need to maximize the count of non-overlapping activities, whether those are meetings, tasks, intervals, or pairs. The proof of correctness uses an exchange argument: if an optimal solution skips the earliest-finishing pair, you can swap it in without reducing the chain length.
Connection to dynamic programming
This problem can also be solved with DP. Sort by the first element, then for each pair compute the longest chain ending at that pair by checking all previous pairs. This gives O(n^2) time, similar to Longest Increasing Subsequence. The greedy approach is faster because it exploits the structure of the problem: you never need to reconsider past decisions when you always pick the earliest-ending valid pair.
From understanding to recall
You have seen the greedy interval selection pattern applied to pair chains: sort by the second element, scan left to right, and pick pairs whose first element strictly exceeds the current end boundary. The critical detail is the strict inequality. Unlike Non-overlapping Intervals where start >= end means no overlap, this problem requires start > end because the chain condition is d < e, not d <= e. That one-character difference between > and >= is exactly the kind of thing that trips people up under interview pressure.
Practicing this problem with spaced repetition helps you internalize both the algorithm and the subtle constraint. When you see it again in a few weeks, you should be able to write the solution, explain why sorting by end value works, and handle the strict-vs-non-strict comparison without second-guessing yourself.
Related posts
- Non-overlapping Intervals - The classic interval scheduling problem: sort by end time and remove minimum overlapping intervals
- Minimum Number of Arrows to Burst Balloons - Another greedy interval problem where you count the minimum number of points to cover all intervals
- Longest Increasing Subsequence - The DP approach to this problem mirrors LIS, making it a useful comparison for understanding when greedy beats DP