Skip to content
← All posts

Sliding Window Median: Two Heaps with Lazy Deletion

9 min read
leetcodeproblemhardarraysheapsliding-window

LeetCode 480 Sliding Window Median takes the sliding window concept to the next level. You are given an array of integers and a window size k. Slide the window from left to right and return the median of each window. If k is odd, the median is the middle element. If k is even, it is the average of the two middle elements. Finding the max in a sliding window was hard enough, but maintaining the median requires a fundamentally different data structure.

The problem

Given an array nums and an integer k, return an array of the median of every contiguous subarray of length k.

Input:  nums = [1, 3, -1, -3, 5, 3, 6, 7], k = 3
Output: [1.0, -1.0, -1.0, 3.0, 5.0, 6.0]

The first window is [1, 3, -1]. Sorted, that is [-1, 1, 3], and the median is 1. The second window is [3, -1, -3]. Sorted: [-3, -1, 3], median is -1. And so on.

1031-12-3354356677window (k=3), median = -1max-heap (lower)-1-3topmin-heap (upper)5median = top of max-heap = -1
nums = [1, 3, -1, -3, 5, 3, 6, 7], k=3. Window covers indices 2-4 with values [-1, -3, 5]. The max-heap holds the lower half, the min-heap holds the upper half. The median comes from the top of the max-heap.

Why this problem matters

Sliding Window Maximum (LeetCode 239) lets you get away with a monotonic deque because you only need the extreme value. The median is not an extreme value. It is the middle value, which means you need to maintain the sorted order of the window at all times. A brute-force approach of sorting each window takes O(nk log k), and that is too slow for large inputs.

The trick is to split the window into two halves: a max-heap for the lower half and a min-heap for the upper half. The top of the max-heap is the largest element in the lower half, and the top of the min-heap is the smallest element in the upper half. When the heaps are balanced, the median sits right at the boundary between them.

This is the same idea behind LeetCode 295 (Find Median from Data Stream), but with a twist: elements also leave the window. In a stream you only add elements. Here you must also remove the element that slides out. Standard heaps do not support efficient arbitrary removal, so you need lazy deletion: mark elements as removed but only actually delete them when they appear at the top of a heap.

The SortedList approach from Python's sortedcontainers library gives a cleaner solution with O(log k) insertion and deletion. But the two-heap approach is more universal and the one interviewers expect you to know.

The two-heap approach

Here is the strategy:

  1. Max-heap (lower half): Stores the smaller half of the window. The top is the largest of the small elements.

  2. Min-heap (upper half): Stores the larger half of the window. The top is the smallest of the large elements.

  3. Balance rule: The max-heap has either the same number of elements as the min-heap (when k is even) or one more (when k is odd). This way, the median is either the top of the max-heap (odd k) or the average of both tops (even k).

  4. Lazy deletion: When an element leaves the window, increment a counter in a hash map. Do not remove it from the heap immediately. When you try to read the top of a heap and the top element has a pending deletion, pop it and decrement the counter. This keeps each operation amortized O(log k).

The SortedList solution

The cleanest Python implementation uses SortedList from sortedcontainers, which gives O(log k) insert and O(log k) delete by index:

from sortedcontainers import SortedList

def median_sliding_window(nums, k):
    window = SortedList()
    result = []

    for i in range(len(nums)):
        window.add(nums[i])

        if len(window) > k:
            window.remove(nums[i - k])

        if len(window) == k:
            if k % 2 == 1:
                result.append(float(window[k // 2]))
            else:
                result.append((window[k // 2 - 1] + window[k // 2]) / 2.0)

    return result

This is compact and correct. For each new element, you add it to the sorted list. If the window exceeds size k, you remove the element that just fell out. Once the window is exactly k elements, you read the median directly by index.

The SortedList.add and SortedList.remove operations both run in O(log k) time. Reading by index is O(log k) as well. Total time: O(n log k).

The two-heap solution

For interviews where you cannot use sortedcontainers, here is the two-heap approach with lazy deletion:

import heapq
from collections import defaultdict

def median_sliding_window(nums, k):
    small = []
    large = []
    delayed = defaultdict(int)
    small_size = 0
    large_size = 0
    result = []

    def add_num(val):
        nonlocal small_size, large_size
        if not small or val <= -small[0]:
            heapq.heappush(small, -val)
            small_size += 1
        else:
            heapq.heappush(large, val)
            large_size += 1
        rebalance()

    def remove_num(val):
        nonlocal small_size, large_size
        delayed[val] += 1
        if val <= -small[0]:
            small_size -= 1
        else:
            large_size -= 1
        rebalance()

    def rebalance():
        nonlocal small_size, large_size
        if small_size > large_size + 1:
            val = -heapq.heappop(small)
            heapq.heappush(large, val)
            small_size -= 1
            large_size += 1
            prune(small)
        elif large_size > small_size:
            val = heapq.heappop(large)
            heapq.heappush(small, -val)
            large_size -= 1
            small_size += 1
            prune(large)

    def prune(heap):
        while heap:
            val = -heap[0] if heap is small else heap[0]
            if delayed[val] > 0:
                delayed[val] -= 1
                heapq.heappop(heap)
            else:
                break

    def get_median():
        if k % 2 == 1:
            return float(-small[0])
        return (-small[0] + large[0]) / 2.0

    for i in range(k):
        add_num(nums[i])
    result.append(get_median())

    for i in range(k, len(nums)):
        add_num(nums[i])
        remove_num(nums[i - k])
        prune(small)
        prune(large)
        result.append(get_median())

    return result

The small heap is a max-heap (negated values for Python's min-heap). The large heap is a min-heap. The delayed dictionary tracks how many times each value needs to be lazily removed. When you call remove_num, you do not actually pop from the heap. You just mark the value for future removal and adjust the logical size. The prune function cleans up the heap top whenever a delayed value surfaces.

The prune step is what makes lazy deletion work. You only prune the top of the heap, and only when you need to read from it or after a rebalance. This keeps the amortized cost low.

Visual walkthrough

Let's trace through nums = [1, 3, -1, -3, 5, 3, 6, 7] with k = 3. At each step, the window slides one position to the right. The max-heap holds the lower half, the min-heap holds the upper half, and the median comes from the top of the max-heap (since k is odd).

Step 1: Window [1, 3, -1]. Insert all three. Sorted: [-1, 1, 3]. Median = 1.

13-1-35367--1-----sorted window: [-1, 1, 3]max-heap1-1min-heap3

First window formed. Max-heap has [-1, 1] (top = 1). Min-heap has [3]. Median is top of max-heap = 1.

Step 2: Slide right. Remove 1, add -3. Window [3, -1, -3]. Sorted: [-3, -1, 3]. Median = -1.

13-1-35367--1-1----sorted window: [-3, -1, 3]max-heap-1-3min-heap3

Remove 1 from max-heap, add -3 to max-heap. Rebalance. Max-heap top = -1. Median = -1.

Step 3: Slide right. Remove 3, add 5. Window [-1, -3, 5]. Sorted: [-3, -1, 5]. Median = -1.

13-1-35367--1-1-1---sorted window: [-3, -1, 5]max-heap-1-3min-heap5

Remove 3 from min-heap, add 5 to min-heap. Heaps stay balanced. Median = -1.

Step 4: Slide right. Remove -1, add 3. Window [-3, 5, 3]. Sorted: [-3, 3, 5]. Median = 3.

13-1-35367--1-1-13--sorted window: [-3, 3, 5]max-heap3-3min-heap5

Remove -1 from max-heap, add 3. Rebalance so max-heap top = 3. Median = 3.

Step 5: Slide right. Remove -3, add 6. Window [5, 3, 6]. Sorted: [3, 5, 6]. Median = 5.

13-1-35367--1-1-135-sorted window: [3, 5, 6]max-heap53min-heap6

Remove -3 from max-heap, add 6 to min-heap. Rebalance. Max-heap top = 5. Median = 5.

Step 6: Slide right. Remove 5, add 7. Window [3, 6, 7]. Sorted: [3, 6, 7]. Median = 6. Done!

13-1-35367--1-1-1356sorted window: [3, 6, 7]max-heap63min-heap7

Remove 5 from max-heap, add 7 to min-heap. Rebalance. Median = 6. Final result: [1, -1, -1, 3, 5, 6].

A few things to notice:

  • The heaps stay balanced. After every insertion and removal, the max-heap has exactly one more element than the min-heap (since k = 3 is odd). The rebalance step ensures this invariant holds.
  • Lazy deletion defers work. When an element leaves the window, it might not sit at the top of its heap. Instead of searching for it, you mark it and clean up later. This is the key to keeping operations at O(log k).
  • The median is always at the boundary. The top of the max-heap is the largest value in the lower half. For odd k, that is exactly the median. For even k, you average it with the top of the min-heap.

Complexity analysis

ApproachTimeSpace
Brute force (sort each window)O(nk log k)O(k)
SortedListO(n log k)O(k)
Two heaps with lazy deletionO(n log k)O(n)

Time: O(n log k). Each of the n elements is inserted into a heap (O(log k)) and removed once (O(log k) amortized with lazy deletion). The SortedList approach has the same time complexity.

Space: O(k) for SortedList, O(n) for two heaps. The SortedList only holds k elements at a time. The two-heap approach may accumulate up to O(n) delayed entries in the worst case, though in practice most entries are cleaned up quickly.

Edge cases

Before submitting, verify your solution handles these:

  • k equals 1. Every element is its own median. The output is just the input converted to floats.
  • k equals n. There is only one window. Sort the entire array and return the single median.
  • k is even. The median is the average of the two middle elements. Make sure you return a float, not an integer.
  • All elements the same like [5, 5, 5, 5] with k = 2. Every window median is 5.0. Duplicates must be handled correctly in both heaps.
  • Negative numbers. The max-heap negation trick must handle negatives correctly. Negating a negative gives a positive, and that is fine for heap ordering.
  • Large values. When averaging two elements for even k, the sum might overflow in some languages. In Python this is not an issue, but in C++ or Java you should use a + (b - a) / 2.0 instead of (a + b) / 2.0.
  • Integer vs float output. LeetCode expects float output. Make sure you cast appropriately.

The building blocks

This problem combines two reusable patterns:

1. Two-heap median maintenance

The core pattern for tracking the median of a dynamic collection:

if val <= max_heap_top:
    push to max-heap
else:
    push to min-heap
rebalance so sizes differ by at most 1

This is the same pattern from Find Median from Data Stream (LeetCode 295). Once you can maintain a running median with two heaps, you can apply it to any window-based or stream-based median problem.

2. Lazy deletion for sliding windows

The technique for removing elements from heaps without searching:

delayed[val] += 1
adjust logical size
prune heap top when needed

Lazy deletion is useful any time you need to remove arbitrary elements from a heap but only care about the top. You see this pattern in problems involving scheduled events, priority queues with cancellation, and any sliding window over a heap-based structure.

A common mistake is forgetting to prune both heaps after a removal. If you only prune the heap that lost an element, the other heap might have a stale top after rebalancing. Always prune both tops before reading the median.

From understanding to recall

You now understand how two heaps maintain the median and how lazy deletion handles element removal. But the details are tricky. The negation for the max-heap, the rebalance logic, the prune function that cleans up stale tops: these are exactly the pieces that slip away under interview pressure.

The SortedList version is easier to remember, but interviewers often want to see the heap approach because it demonstrates deeper understanding of data structures. You need to know when to prune, how to track logical sizes separately from physical heap sizes, and how the rebalance keeps the heaps within one element of each other.

Spaced repetition locks in these details. You write the two-heap median from scratch at increasing intervals. Each time, you rebuild the add, remove, rebalance, and prune functions from memory. After a few repetitions, the pattern becomes automatic and you can apply it to any problem that needs a running median with removals.

Related posts

  • Sliding Window Maximum - The classic hard sliding window problem using a monotonic deque. A great comparison for how different window queries need different data structures.
  • Find Median from Data Stream - The foundation for the two-heap technique. This problem is add-only (no removals), making it the perfect stepping stone before tackling sliding window median.
  • Minimum Window Substring - Another sliding window problem with a different flavor. Uses frequency counting instead of heaps, and the window size varies rather than being fixed.

CodeBricks breaks the two-heap median pattern into its reusable pieces and drills them individually. You practice the heap insertion with rebalancing, the lazy deletion with pruning, and the median extraction separately. Then you combine them under timed conditions so that when you see Sliding Window Median in an interview, every piece falls into place automatically.