Non-overlapping Intervals: Greedy Interval Scheduling
LeetCode 435, Non-overlapping Intervals, gives you an array of intervals and asks: what is the minimum number of intervals you need to remove so that the remaining ones do not overlap? This is the classic interval scheduling problem from algorithm design courses, and the greedy solution is one of the cleanest examples of why greedy works.
Why this problem matters
On the surface this looks like it could require dynamic programming or backtracking. You need to find an optimal subset of intervals to keep, which sounds like a combinatorial optimization problem. But the greedy approach solves it in O(n log n) with a few lines of code.
The key insight is that this is equivalent to the activity selection problem: maximize the number of non-overlapping intervals you can keep, and the removals are whatever is left over. Activity selection has a well-known greedy solution: always pick the interval that ends earliest. That choice leaves the most room for future intervals.
This problem also connects directly to the other interval problems you may have already seen. If you have solved Merge Intervals or Insert Interval, you already know how to detect overlaps. The twist here is that instead of merging overlapping intervals, you count and discard them.
The approach
Sort the intervals by end time. Then scan through them, keeping track of the end of the last interval you decided to keep. For each new interval, if its start is before the current end, it overlaps and must be removed. If its start is at or after the current end, keep it and update the end boundary.
Why sort by end time instead of start time? Because the greedy strategy is "always keep the interval that finishes earliest." An interval that finishes early leaves the most room for subsequent intervals. Sorting by end time ensures we always consider the best candidate first.
def eraseOverlapIntervals(intervals):
intervals.sort(key=lambda x: x[1])
removals = 0
end = float('-inf')
for start, finish in intervals:
if start < end:
removals += 1
else:
end = finish
return removals
Step-by-step walkthrough
Let's trace through the algorithm on [[1,2],[2,3],[3,4],[1,3]]. After sorting by end time, we get [[1,2],[2,3],[1,3],[3,4]]. Notice that [1,3] and [2,3] share the same end time, so either order between them is fine.
Step 0: Sort by end time. Sorted: [[1,2],[2,3],[1,3],[3,4]]
Sorting by end time is the key to the greedy approach. It lets us always pick the interval that finishes earliest.
Step 1: Take [1,2]. Set end = 2.
Keep [1,2]. It is the first interval, so there is nothing to conflict with. removals = 0.
Step 2: Check [2,3]. Start (2) >= end (2), so no overlap. Keep it. Set end = 3.
Keep [2,3]. Its start is not before the current end boundary, so it fits. removals = 0.
Step 3: Check [1,3]. Start (1) < end (3), so it overlaps. Remove it.
Remove [1,3]. It starts at 1, which is before our end boundary of 3. removals = 1.
Step 4: Check [3,4]. Start (3) >= end (3), so no overlap. Keep it. Set end = 4.
Keep [3,4]. Final answer: 1 removal. The remaining intervals [[1,2],[2,3],[3,4]] do not overlap.
The algorithm keeps three intervals and removes one. The removed interval [1,3] was the "greediest" in terms of occupying space. It spans from 1 to 3, blocking both [1,2] and [2,3]. By removing it, the other three intervals fit together without any overlaps.
Notice how the end boundary acts as a fence. Any interval whose start falls before the fence overlaps with something we already committed to keeping. Since we sorted by end time, the fence always represents the earliest possible end, which gives future intervals the best chance of fitting.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Brute force (try all subsets) | O(2^n) | O(n) |
| Sort by end time + greedy scan | O(n log n) | O(1) |
The sort takes O(n log n). The scan is O(n). Space is O(1) beyond the input (or O(n) if the sort is not in-place, depending on the language). Python's Timsort uses O(n) auxiliary space, but the algorithm itself only needs a counter and a single variable for the end boundary.
Edge cases to watch for
- No intervals or one interval. Zero removals needed. There is nothing to conflict with.
- All intervals overlap. Something like
[[1,4],[2,5],[3,6]]sorted by end gives[[1,4],[2,5],[3,6]]. You keep the first and remove the rest. Answer: n - 1 in the worst case, but only if every subsequent interval starts before the previous one ends. - Intervals that share an endpoint.
[1,2]and[2,3]do not overlap sincestart >= end(2 is not less than 2). The problem defines overlapping as one interval starting strictly before another ends. - Identical intervals.
[[1,3],[1,3],[1,3]]requires removing all but one. The greedy scan keeps the first and removes the duplicates. - Negative coordinates. The algorithm works without modification since it only compares values. Initializing
endto negative infinity handles this.
The building blocks
Greedy interval scheduling
The core pattern here is the activity selection greedy: sort by end time, then greedily keep intervals that do not conflict with the last kept interval. This pattern appears whenever you need to maximize the number of non-overlapping activities, whether those are meetings, tasks, or ranges on a number line. The proof of correctness relies on an exchange argument: if the optimal solution skips the earliest-finishing interval, you can swap it in without reducing the count.
Sort-first overlap detection
Sorting intervals transforms the overlap detection problem from O(n^2) pairwise comparisons into a linear scan. Once sorted, you only need to compare each interval against a single boundary value. This same principle drives Merge Intervals (sort by start, sweep and merge) and Insert Interval (input is already sorted, sweep in three phases). The difference is what you do when you find an overlap: merge it, insert around it, or count and skip it.
From understanding to recall
You have seen the greedy interval scheduling pattern: sort by end time, scan left to right, keep intervals that start at or after the current end boundary, and count the rest as removals. The logic is compact, but the "why" behind sorting by end time (not start time) is the detail that trips people up during interviews. If you sort by start time, you do not know which of two overlapping intervals to keep without looking ahead. Sorting by end time eliminates that ambiguity because the earliest-finishing interval is always the best greedy choice.
Drilling this problem with spaced repetition locks in both the algorithm and the reasoning. When you see it again in two weeks, you should be able to explain why end-time sorting works, write the solution, and handle the edge cases without hesitation.
Related posts
- Merge Intervals - The foundational interval problem: sort by start time and merge overlapping ranges
- Insert Interval - Insert a new interval into a sorted list, merging overlaps in a single pass
- Meeting Rooms II - Count maximum concurrent overlaps using a sweep line with a heap