Skip to content
← All posts

Finding MK Average: Sorted Window Design Pattern

6 min read
leetcodeproblemharddesignheapsorting

Finding MK Average is LeetCode 1825, rated Hard. You need to design an MKAverage class that is initialized with two integers, m and k. It supports two operations:

  • addElement(num) adds num to a stream of numbers.
  • calculateMKAverage() looks at the last m elements in the stream, sorts them, removes the smallest k and the largest k, and returns the floor of the mean of the remaining m - 2*k elements. If there are fewer than m elements in the stream, it returns -1.

For example, with m = 3 and k = 1, after adding [3, 1, 10], the sorted window is [1, 3, 10]. You trim the bottom 1 element (1) and the top 1 element (10), leaving [3]. The MK average is 3 // 1 = 3.

MKAverage with m = 3, k = 1Stream:3110sort last m = 3 elementsSorted:1310bottom k=1middletop k=1Remove bottom 1 and top 1Middle elements: [3]MK Average = 3 // (3 - 2*1) = 3Trimmed (removed)Middle (averaged)Stream element
With m=3 and k=1, the last 3 elements are sorted, the bottom 1 and top 1 are trimmed, and the mean of the remaining middle element is the MK Average.

Why this problem matters

Trimmed means are everywhere in real systems. Rating platforms remove outlier scores before computing averages. Sensor networks discard noisy readings from the extremes. Financial systems strip the highest and lowest bids before pricing. This problem captures that exact pattern: maintain a sliding window of recent data, trim the extremes, and compute the average of what remains.

The challenge is doing it efficiently. A brute-force approach that sorts the window from scratch on every query is too slow when the stream is long. You need a data structure that supports fast insertion, deletion, and sum queries over sorted partitions.

The key insight

Instead of re-sorting the entire window every time, maintain the last m elements in a sorted data structure. When a new element arrives, insert it into the sorted structure and remove the element that just fell out of the window. To compute the MK average, you need the sum of the middle m - 2*k elements.

The cleanest approach uses a SortedList (from Python's sortedcontainers library). It supports O(log n) insertion and removal, and you can index into it directly. You track the total sum of all m elements in the window. When you need the MK average, subtract the sum of the bottom k elements and the top k elements from the total, then divide by m - 2*k.

This avoids sorting from scratch on each call. The sorted list maintains order incrementally, and the running total lets you compute the middle sum in O(k) time per query.

The running total is the trick that makes this practical. Instead of summing the middle elements from scratch every time, you maintain the total of all m elements and subtract the bottom k and top k sums. This turns the average calculation into a subtraction problem.

The solution

from sortedcontainers import SortedList
from collections import deque

class MKAverage:
    def __init__(self, m: int, k: int):
        self.m = m
        self.k = k
        self.queue = deque()
        self.sl = SortedList()
        self.total = 0

    def addElement(self, num: int) -> None:
        self.queue.append(num)
        self.sl.add(num)
        self.total += num

        if len(self.queue) > self.m:
            old = self.queue.popleft()
            self.sl.remove(old)
            self.total -= old

    def calculateMKAverage(self) -> int:
        if len(self.queue) < self.m:
            return -1

        bottom_sum = sum(self.sl[i] for i in range(self.k))
        top_sum = sum(self.sl[self.m - self.k + i] for i in range(self.k))
        middle_sum = self.total - bottom_sum - top_sum
        return middle_sum // (self.m - 2 * self.k)

Detailed explanation

The queue (a deque) tracks the order elements arrived. It tells you which element to evict when the window is full. The sl (a SortedList) maintains those same elements in sorted order so you can index into the smallest and largest values efficiently.

When addElement is called, the new number goes into both the deque and the sorted list. If the deque exceeds size m, the oldest element is popped from the front of the deque and removed from the sorted list. The running total is updated in both directions: add the new element, subtract the evicted one.

When calculateMKAverage is called, you first check if there are enough elements. If not, return -1. Otherwise, the sorted list contains exactly m elements in sorted order. The bottom k elements are at indices 0 through k - 1, and the top k are at indices m - k through m - 1. You compute their sums, subtract both from the total, and floor-divide by m - 2*k.

The floor division is important. The problem asks for the integer part of the mean, rounding toward zero. Python's // operator does floor division, which for positive numbers gives the same result as truncation.

Visual walkthrough

Here is a step-by-step trace with m = 3 and k = 1. Watch how the sorted window slides and the trimmed average updates.

Step 1: addElement(3) -- stream = [3]

3

Only 1 element so far. We need at least m=3 elements. calculateMKAverage returns -1.

Step 2: addElement(1) -- stream = [3, 1]

31

Only 2 elements. Still fewer than m=3. calculateMKAverage returns -1.

Step 3: addElement(10) -- stream = [3, 1, 10]

1310avg = 3

Now we have m=3 elements. Sorted: [1, 3, 10]. Remove bottom k=1 (value 1) and top k=1 (value 10). Middle = [3]. Average = 3 // 1 = 3.

Step 4: addElement(5) -- stream = [3, 1, 10, 5]

1510avg = 5

Window slides to last 3: [1, 10, 5]. Sorted: [1, 5, 10]. Remove bottom 1 and top 10. Middle = [5]. Average = 5 // 1 = 5.

Step 5: addElement(5) -- stream = [3, 1, 10, 5, 5]

5510avg = 5

Window slides to last 3: [10, 5, 5]. Sorted: [5, 5, 10]. Remove bottom 5 and top 10. Middle = [5]. Average = 5 // 1 = 5.

Step 6: addElement(5) -- stream = [..., 5, 5, 5]

555avg = 5

Window is [5, 5, 5]. Sorted: [5, 5, 5]. Remove bottom 5 and top 5. Middle = [5]. Average = 5 // 1 = 5.

Notice how each step only needs to insert one element and possibly remove one from the sorted list. The window never exceeds size m, so operations stay fast regardless of how many total elements have been added to the stream.

Complexity analysis

OperationTimeSpace
addElementO(log m)O(m)
calculateMKAverageO(k)O(1) extra
Overall space-O(m)

addElement. Inserting into and removing from the SortedList each take O(log m) time. The deque operations are O(1). Updating the running total is O(1). So each call to addElement costs O(log m).

calculateMKAverage. You iterate over k elements at each end to compute the bottom and top sums. This is O(k) per call. If k is large relative to m, you could further optimize this with a Fenwick tree or segmented sums, but for most practical inputs this is fast enough.

Space. The deque and sorted list each store at most m elements, so total space is O(m).

Building blocks

Sorted containers for sliding windows

The SortedList from Python's sortedcontainers library is the workhorse here. It gives you O(log n) insertion and removal with O(1) indexed access. This combination is rare and powerful: you get the benefits of a balanced BST with the convenience of array indexing. Whenever a problem requires maintaining a sorted collection that changes incrementally, a SortedList is the right tool. You will see this pattern in problems involving running medians, order statistics, and ranked data.

Running sums with incremental updates

Instead of recomputing the sum of the middle elements from scratch, you maintain a running total and adjust it as elements enter and leave the window. This technique of maintaining aggregate statistics incrementally rather than recomputing them is a fundamental optimization. It applies to sliding window sums, moving averages, and any scenario where the data changes by small increments but the aggregate is needed frequently.

Edge cases

  • Fewer than m elements. If the stream has not yet received m values, calculateMKAverage returns -1. The deque length check handles this.
  • All elements are equal. If every element in the window is the same value, the bottom k and top k have the same value as the middle. The trimmed average equals that value.
  • k = 0. No trimming occurs. The MK average is the floor mean of all m elements in the window. The code handles this naturally since bottom_sum and top_sum are both 0.
  • m - 2*k = 1. Only one element remains after trimming. The average is just that element. No division rounding issues arise.
  • Large values. Elements can be up to 10^5 and m can be up to 10^5. The running sum can grow large, but Python handles arbitrary precision integers, so overflow is not a concern.

From understanding to recall

You have seen how a SortedList combined with a running total turns a seemingly expensive operation (sort, trim, average) into an efficient incremental update. The data structure design is the core challenge here: choosing the right container so that each operation stays within its time budget.

But knowing the approach once is not the same as being able to reconstruct it under pressure. The details matter: remembering to use a deque for eviction order, a sorted list for rank queries, and a running total to avoid recomputation. These are distinct building blocks that combine to form the solution.

Spaced repetition locks in each piece separately. CodeBricks breaks this problem into its components, sorted containers, sliding window management, and incremental sum tracking, and drills them at intervals tuned to your forgetting curve. After a few review cycles, the design comes together naturally when you encounter similar data structure problems.

Related posts