Skip to content
← All posts

Allocate Mailboxes: DP Meets the Median Trick

5 min read
leetcodeproblemharddynamic-programmingmathsorting

You are given an array houses containing the positions of houses along a street, and an integer k. You need to place k mailboxes along the street so that the total distance between each house and its nearest mailbox is minimized. Return that minimum total distance.

This is LeetCode 1478: Allocate Mailboxes, a hard problem that combines two powerful ideas: the mathematical fact that the median minimizes total absolute deviation, and interval DP for partitioning an array into groups. Once you see how these two pieces fit together, the solution writes itself.

For houses = [1, 4, 8, 10, 20] and k = 3, the answer is 5. You place mailboxes at positions 4, 8, and 20, giving groups [1, 4], [8, 10], and [20] with a total distance of |1-4| + |4-4| + |8-8| + |10-8| + |20-20| = 3 + 0 + 0 + 2 + 0 = 5.

051015201481020housesmailboxestotal distance = |1-4| + |10-8| = 5
houses = [1, 4, 8, 10, 20], k = 3. Three mailboxes placed at positions 4, 8, and 20 minimize the total distance to 5.

Why this problem matters

Allocate Mailboxes is one of the cleanest examples of a problem where a mathematical insight (median minimizes absolute distance) powers a DP optimization. Without the median observation, you would need to search over all possible mailbox positions, making the problem intractable. With it, you reduce the inner computation to a simple precomputable cost function, and the rest is standard interval partitioning DP.

This pattern of "precompute a cost function, then partition with DP" shows up across many hard problems. If you can recognize when a subproblem has a closed-form or efficiently computable solution, you can often turn an impossible search into a clean DP.

The key insight

If you have a single mailbox serving a group of houses, where should you place it? You want to minimize the sum of absolute distances from each house to the mailbox. This is a classic result from statistics: the point that minimizes total absolute deviation is the median.

For an odd number of houses, the median is the middle element. For an even number, any point between the two middle elements works (including either middle element itself). This means you never need to search over mailbox positions. For any contiguous group of sorted houses, the optimal single-mailbox cost is just the sum of distances to the median.

The second insight is that after sorting the houses, the optimal assignment always groups contiguous houses together. If house A is between houses B and C on the number line, it never makes sense to assign A to a different mailbox than B and C while assigning B and C to the same one. So the problem becomes: partition the sorted houses array into exactly k contiguous groups, minimizing the total cost across groups.

The solution

def minDistance(houses, k):
    houses.sort()
    n = len(houses)

    cost = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            median = houses[(i + j) // 2]
            for m in range(i, j + 1):
                cost[i][j] += abs(houses[m] - median)

    dp = [[float('inf')] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0

    for i in range(1, n + 1):
        for j in range(1, min(i, k) + 1):
            for m in range(j - 1, i):
                dp[i][j] = min(dp[i][j], dp[m][j - 1] + cost[m][i - 1])

    return dp[n][k]

The cost table can also be computed with a prefix-sum trick in O(n^2) instead of O(n^3), but the O(n^3) version is clearer and fast enough given the constraint n is at most 100.

Visual walkthrough

Step 1: Sort the houses array

sorted1481020

Sorting ensures that each group of houses assigned to one mailbox is a contiguous subarray. Optimal groups are always contiguous in sorted order.

Step 2: Precompute cost(i, j) - the cost of serving houses[i..j] with one mailbox

example: cost(0, 1) for houses [1, 4]14mediandist = 3cost(0, 1) = |1-1| + |4-1| = 3

For any contiguous subarray, the median minimizes total absolute distance. cost(i, j) = sum of |houses[m] - median| for all m in [i, j].

Step 3: DP to partition n houses into k groups

dp[i][j] table (houses=5, k=3)j=1j=2j=3i=53485dp[5][3] = min over m of (dp[m][2] + cost(m, 4))answer = dp[5][3] = 5

dp[i][j] = minimum total distance using j mailboxes for the first i houses. Try every split point: dp[i][j] = min(dp[m][j-1] + cost(m, i-1)) for all valid m.

Step 4: Read the answer from dp[n][k]

[1, 4]cost = 3+[8, 10]cost = 2+[20]cost = 0=5

The optimal partition is [1, 4], [8, 10], [20] with mailboxes at positions 4, 8, 20. Total distance = 3 + 2 + 0 = 5.

Complexity analysis

ApproachTimeSpace
Brute force (try all partitions)O(k^n)O(n)
DP with precomputed costO(n^2 * k)O(n^2)

Time: O(n^2 * k). Precomputing the cost table takes O(n^3) in the simple version or O(n^2) with prefix sums. The DP itself has O(n * k) states, each considering O(n) transitions, giving O(n^2 * k). With n up to 100 and k up to 100, this is well within limits.

Space: O(n^2) for the cost table. The DP table is O(n * k), which is dominated by the cost table.

The building blocks

1. Median minimizes absolute distance

For a set of numbers, the median minimizes the sum of absolute deviations. This is the mathematical foundation of the cost function. You do not need to search over all possible mailbox positions; the median gives you the answer directly. This fact also appears in problems like Minimum Moves to Equal Array Elements II, where you need to find the target value that minimizes total absolute moves.

2. Interval partitioning DP

The structure of dp[i][j] = minimum cost using j groups for the first i items, with the recurrence dp[i][j] = min(dp[m][j-1] + cost(m, i-1)), is a reusable pattern. You see it in Split Array Largest Sum (partition into k subarrays minimizing the max sum) and Burst Balloons (partition into intervals with a different cost function). The skeleton is always the same: precompute the cost of each interval, then try all split points.

3. Sorting enables contiguous grouping

Sorting the input guarantees that optimal groups are contiguous subarrays. This is a common preprocessing step in DP problems involving distances or costs. Without sorting, you would need to consider all possible subsets, which explodes the search space. With sorting, you only need to consider contiguous partitions, which the DP handles efficiently.

Edge cases

  • k equals n: every house gets its own mailbox. The total distance is 0 since each mailbox sits at its house.
  • k = 1: one mailbox serves all houses. Place it at the median. The cost is the sum of absolute deviations from the median.
  • Two houses, one mailbox: the mailbox goes at either house (or anywhere between them). The cost equals the distance between the two houses.
  • All houses at the same position: total distance is 0 regardless of k. Every mailbox placement is equally good.
  • Houses already sorted: no impact on correctness, but you can skip the sort step.

From understanding to recall

You now know the two pillars of this solution: the median trick and interval partitioning DP. But in an interview, you need to produce both from scratch. You need to recognize that sorting makes groups contiguous, set up the cost table using the median, define the DP state as dp[i][j] for the first i houses with j mailboxes, and write the triple-nested loop without hesitation.

The median trick is easy to forget if you have not drilled it. Under pressure, you might waste time trying to optimize over mailbox positions instead of jumping straight to the median. Spaced repetition locks in the connection: "minimize sum of absolute distances" triggers "use the median" automatically.

Interval partitioning DP is one of roughly 60 reusable building blocks that cover hundreds of LeetCode problems. Learning them individually and drilling them with spaced repetition is far more effective than grinding problems randomly and hoping they stick.

Related posts

  • Split Array Largest Sum - Same interval partitioning DP skeleton, but minimizing the maximum subarray sum instead of total cost
  • Burst Balloons - Another interval DP problem where you precompute subproblem costs and combine them with a partition recurrence
  • Minimum Cost For Tickets - A different flavor of partitioning DP where you choose among ticket options to cover travel days