The Skyline Problem: Sweep Line with a Max Heap
The Skyline Problem (LeetCode #218) asks you to compute the outline of a city skyline formed by overlapping rectangular buildings. You are given a list of buildings, each defined by [left, right, height], and you need to return a list of "key points" where the skyline changes height.
This is a classic sweep line problem. Instead of reasoning about every pair of buildings or every pixel along the x-axis, you process events from left to right and use a max heap to track which buildings are currently active. Whenever the tallest active building changes, you emit a key point.
The problem
Given buildings = [[2,9,10],[3,7,15],[5,12,12],[15,20,10],[19,24,8]], the expected output is [[2,10],[3,15],[7,12],[12,0],[15,10],[20,8],[24,0]].
Each key point in the output represents a position where the skyline height changes. The bold line traces the outline you would see looking at these buildings from a distance. Your job is to compute the coordinates of every corner along that outline.
The approach: sweep line with a max heap
The core idea is to convert the 2D rectangle problem into a 1D event processing problem:
- Create events. For each building
[left, right, height], create two events: a "start" event atx = leftwith the building's height, and an "end" event atx = right. - Sort events by x-coordinate. When two events share the same x, process starts before ends (to avoid briefly dropping to a lower height), and process taller starts first.
- Use a max heap to track heights of all currently active buildings. When a building starts, push its height. When a building ends, remove its height.
- Emit key points whenever the current maximum height changes compared to the previous maximum.
The trick with Python's heapq (which is a min heap) is to negate heights so the smallest negative value corresponds to the tallest building.
The solution
import heapq
def getSkyline(buildings: list[list[int]]) -> list[list[int]]:
events = []
for left, right, height in buildings:
events.append((left, -height, right))
events.append((right, 0, 0))
events.sort()
result = [[0, 0]]
heap = [(0, float('inf'))]
for x, neg_h, right in events:
while heap[0][1] <= x:
heapq.heappop(heap)
if neg_h:
heapq.heappush(heap, (neg_h, right))
if result[-1][1] != -heap[0][0]:
result.append([x, -heap[0][0]])
return result[1:]
Let's break down how this works. Each building generates two events: (left, -height, right) for the start and (right, 0, 0) for the end. Sorting these tuples lexicographically gives us the correct processing order. Start events have negative heights, so they sort before end events at the same x (since -height is negative and 0 is not).
The heap stores (neg_height, right_edge) pairs. Before processing each event, we lazily remove any buildings from the top of the heap whose right edge has passed. This "lazy deletion" avoids the cost of searching for and removing specific entries, which would be expensive in a heap.
After updating the heap, we check if the current maximum height (the negation of heap[0][0]) differs from the last emitted height. If it does, we have a new key point.
Visual walkthrough
Let's trace through the algorithm on buildings = [[2,9,10],[3,7,15],[5,12,12],[15,20,10],[19,24,8]]. Watch how the max heap tracks active buildings and emits key points when the maximum height changes.
Step 1: x=2, building [2,9,10] starts. Push height 10 onto heap.
Previous max was 0, now 10. Height changed, so emit key point (2, 10).
Step 2: x=3, building [3,7,15] starts. Push height 15 onto heap.
Max height jumps from 10 to 15. Emit key point (3, 15).
Step 3: x=5, building [5,12,12] starts. Push height 12. x=7, building [3,7,15] ends.
At x=5, push 12, but max is still 15 (no emit). At x=7, building [3,7,15] expires. Max drops to 12. Emit (7, 12).
Step 4: x=9, building [2,9,10] ends. x=12, building [5,12,12] ends.
At x=9, building [2,9,10] expires. Max stays 12 (no emit). At x=12, last active building ends. Max drops to 0. Emit (12, 0).
Step 5: x=15 start [15,20,10], x=19 start [19,24,8], x=20 end, x=24 end.
At x=15, push 10, emit (15,10). At x=19, push 8, max still 10. At x=20, building [15,20,10] ends, max drops to 8, emit (20,8). At x=24, last building ends, emit (24,0).
The critical insight is that you only emit a key point when the maximum height changes. Multiple buildings can start or end without affecting the skyline if a taller building is already active. For example, when building [5,12,12] starts at x=5, the tallest active building is still [3,7,15] with height 15, so no key point is emitted.
Why lazy deletion works
In a standard heap, removing an arbitrary element is O(n) because you need to find it first. Lazy deletion sidesteps this entirely. Instead of removing a building when it ends, we leave it in the heap and skip it later when it bubbles to the top but its right edge has already passed.
The while heap[0][1] <= x loop handles this. It checks whether the building at the top of the heap has already expired (its right edge is at or before the current x-coordinate). If so, pop it and check the next one. This is amortized O(1) per event because each building is pushed and popped exactly once across the entire algorithm.
Lazy deletion is a common pattern in heap-based sweep line problems. You will see it again in problems like Merge K Sorted Lists and Task Scheduler variants.
Complexity analysis
| Metric | Value | Explanation |
|---|---|---|
| Time | O(n log n) | Sorting 2n events takes O(n log n). Each event involves at most one heap push and one heap pop, each O(log n). Total: O(n log n). |
| Space | O(n) | The heap and event list each hold at most 2n entries. The result list holds at most 2n key points. |
The building blocks
The Skyline Problem combines three powerful techniques that appear across many algorithm problems.
Sweep line technique
Convert a 2D problem into a sequence of 1D events sorted by position. Process events left to right while maintaining some state (here, a heap). This same approach powers interval problems like Merge Intervals and Meeting Rooms II, though those use simpler counters or min heaps instead of max heaps.
Max heap for active tracking
The heap answers the question "what is the tallest active building right now?" in O(1), with O(log n) updates. Any problem that asks about the current maximum (or minimum) of a changing set is a candidate for this pattern.
Event-driven processing
Instead of iterating over every possible x-coordinate, you only process positions where something interesting happens (a building starts or ends). This is the key to efficiency: you go from potentially millions of x-positions to just 2n events.
Edge cases
Single building. One building [left, right, height] produces exactly two key points: [left, height] and [right, 0]. The algorithm handles this naturally.
Adjacent buildings. Buildings like [1,5,10] and [5,8,10] with the same height and touching edges. The skyline should not dip to 0 between them. The sorting and height-change check handle this: at x=5, the end of the first and start of the second are processed, and since the max height remains 10, no extra key point is emitted.
Nested buildings. A small building completely inside a taller one does not affect the skyline at all. The max heap correctly ignores it because the taller building's height stays on top.
Buildings with the same height. Multiple buildings at the same height that overlap or touch. The algorithm only emits a key point when the maximum changes, so identical heights produce no spurious points.
From understanding to recall
The Skyline Problem is the kind of problem that feels approachable once you see the solution but is difficult to reconstruct from scratch. The event encoding ((left, -height, right) vs (right, 0, 0)), the lazy deletion loop, the negation trick for Python's min heap. These details are easy to mix up.
The way to make it stick is not to memorize the code but to internalize the reasoning: "events at building edges, max heap for active heights, emit when max changes." Once that mental model is solid, the implementation details follow naturally.
Spaced repetition is the fastest path to that kind of fluency. Write the solution from scratch, verify it, and come back in a few days to do it again. After three or four reps, sweep line plus max heap becomes a tool you can reach for confidently.
Related posts
- Merge Intervals - Sort intervals by start, then sweep and merge overlapping ranges
- Meeting Rooms II - Min heap scheduling to find the peak number of concurrent meetings
- Largest Rectangle in Histogram - Monotonic stack to find the largest rectangle under a histogram