Skip to content
← All posts

Minimum Space Wasted From Packaging: Binary Search with Prefix Sums

8 min read
leetcodeproblemhardarraysbinary-searchsorting

You have a pile of packages, each with a different size, and several suppliers who sell boxes. Each supplier offers a set of box sizes. Every package needs to go into a box that is at least as large as the package, and you must use exactly one supplier for all your packages. The waste for a single package is the difference between the box size and the package size. Your goal is to pick the supplier that minimizes the total waste across all packages.

This is LeetCode 1889: Minimum Space Wasted From Packaging. It is rated hard, and the key to solving it efficiently is combining sorting, binary search, and prefix sums.

0246822box=4pkg 013box=4pkg 135box=8pkg 2packagewastebox
packages = [2, 3, 5], supplier boxes = [4, 8]. Packages 2 and 3 fit in box 4. Package 5 fits in box 8. Red regions show waste: (4-2) + (4-3) + (8-5) = 2 + 1 + 3 = 6.

Why this problem matters

Binary search and prefix sums are two of the most reusable techniques in algorithm problems. This problem forces you to combine them in a way that feels natural once you see it, but is not obvious at first. The pattern of sorting data, partitioning it with binary search, and then computing range aggregates with prefix sums comes up in many competitive programming and interview problems. Mastering this combination gives you a powerful tool for any problem that involves assigning items to buckets and computing costs over ranges.

The brute force approach

The naive approach is to try every supplier. For each supplier, try every box size. For each box size, loop through all packages and check if they fit. Sum up the waste.

def minWastedSpace_brute(packages, boxes):
    MOD = 10**9 + 7
    packages.sort()
    n = len(packages)
    best = float('inf')

    for supplier in boxes:
        supplier.sort()
        if supplier[-1] < packages[-1]:
            continue
        waste = 0
        for pkg in packages:
            for box in supplier:
                if box >= pkg:
                    waste += box - pkg
                    break
        best = min(best, waste)

    return best % MOD if best < float('inf') else -1

This works for small inputs, but it is O(n * m) per supplier, where n is the number of packages and m is the number of box sizes. With constraints up to 10^5 packages and 10^5 total boxes, this is too slow.

The key insight: sort, binary search, prefix sums

Here is the idea. Sort the packages. For each supplier, sort their box sizes too. Now, for a given box size, all the packages that fit in it form a contiguous range of the sorted packages array, specifically the packages that are larger than the previous box size and at most the current box size.

You can find the boundary of this range with bisect_right. And once you know the range, you need the sum of all package sizes in that range to compute waste. That is exactly what a prefix sum array gives you in O(1).

For a box that covers packages from index prev to index j - 1:

  • Count of packages: j - prev
  • Total box space used: box * (j - prev)
  • Total package sizes: prefix[j] - prefix[prev]
  • Waste: box * (j - prev) - (prefix[j] - prefix[prev])

The key formula is waste = count * box_size - sum_of_packages. This turns an O(n) loop into O(1) per box size by precomputing the prefix sum once.

Walking through it step by step

Let's trace through the example: packages = [2, 3, 5], boxes = [[4, 8]].

Step 1: Sort the packages array.

2i=03i=15i=2

Sorted packages: [2, 3, 5]. Build prefix sums: [0, 2, 5, 10].

Step 2: Supplier boxes sorted: [4, 8]. Largest box (8) >= largest package (5), so this supplier can handle all packages.

2pkg3pkg|5pkg|box=4box=8

Binary search: bisect_right(packages, 4) = 2 (packages at indices 0,1 fit in box 4). bisect_right(packages, 8) = 3 (package at index 2 fits in box 8).

Step 3: Calculate waste per box. Box 4 holds packages [2,3]: waste = (2 * 4) - (2+3) = 8 - 5 = 3. Box 8 holds package [5]: waste = (1 * 8) - 5 = 3. Total = 6.

2*4=8box 4-5sum1*8=8box 8-5sum

prefix[2] - prefix[0] = 5 - 0 = 5 (sum of packages in box 4). prefix[3] - prefix[2] = 10 - 5 = 5 (sum in box 8). Waste = 4*2 - 5 + 8*1 - 5 = 6.

Step 4: Waste from box 4: 3. Waste from box 8: 3. Supplier total waste = 6.

3box 4 waste3box 8 waste6total

If there were more suppliers, we would repeat Steps 2-3 for each and keep the minimum. Here, the answer is 6.

Step 5: Compare all suppliers and return the minimum total waste modulo 10^9 + 7.

minacross6answer

With only one valid supplier, the answer is 6 % (10^9 + 7) = 6. If no supplier can fit the largest package, return -1.

The sorted packages are [2, 3, 5] and the prefix sum is [0, 2, 5, 10]. The supplier has boxes [4, 8] (already sorted). The largest box (8) is at least as large as the largest package (5), so this supplier is valid.

For box 4: bisect_right([2, 3, 5], 4) = 2. Packages at indices 0 and 1 (sizes 2 and 3) fit. Waste = 4 * 2 - (prefix[2] - prefix[0]) = 8 - 5 = 3.

For box 8: bisect_right([2, 3, 5], 8) = 3. The remaining package at index 2 (size 5) fits. Waste = 8 * 1 - (prefix[3] - prefix[2]) = 8 - 5 = 3.

Total waste for this supplier: 3 + 3 = 6. Since this is the only supplier, the answer is 6.

The complete solution

def minWastedSpace(packages, boxes):
    MOD = 10**9 + 7
    packages.sort()
    n = len(packages)
    prefix = [0] * (n + 1)
    for i in range(n):
        prefix[i + 1] = prefix[i] + packages[i]

    from bisect import bisect_right
    best = float('inf')

    for supplier in boxes:
        supplier.sort()
        if supplier[-1] < packages[-1]:
            continue
        waste = 0
        prev = 0
        for box in supplier:
            j = bisect_right(packages, box)
            waste += box * (j - prev) - (prefix[j] - prefix[prev])
            prev = j
        best = min(best, waste)

    return best % MOD if best < float('inf') else -1

packages.sort() makes it possible to use binary search. Once sorted, all packages that fit in a given box size form a contiguous block.

prefix[i + 1] = prefix[i] + packages[i] builds the prefix sum array. prefix[j] - prefix[prev] gives the total size of packages from index prev to j - 1 in O(1).

supplier.sort() ensures we process box sizes from smallest to largest. Each box picks up the next batch of packages that did not fit in any smaller box.

if supplier[-1] < packages[-1]: continue is the early exit. If the largest box from this supplier cannot hold the largest package, skip the supplier entirely.

j = bisect_right(packages, box) finds the first package index that does not fit in this box. Everything from prev to j - 1 goes into this box.

waste += box * (j - prev) - (prefix[j] - prefix[prev]) is the core formula. The total space allocated is box * count. The total package size is the prefix sum difference. The gap is the waste.

prev = j moves the pointer forward. The next box picks up where this one left off.

Complexity analysis

MetricValue
TimeO(n log n + m log n) where n is package count and m is total boxes across all suppliers
SpaceO(n) for the prefix sum array

Sorting packages takes O(n log n). For each supplier, sorting their boxes takes O(b log b) where b is the number of boxes for that supplier. The total sorting across all suppliers is O(m log m) where m is the total number of boxes. For each box, the binary search is O(log n). So the total work across all suppliers is O(m log n). The dominant terms are O(n log n + m log n), or equivalently O((n + m) log n).

Building blocks

This problem rests on two fundamental building blocks.

Binary search for range partitioning. When data is sorted, bisect_right tells you exactly how many elements fall at or below a threshold. This turns the question "which packages fit in this box?" into a single O(log n) call instead of scanning the entire array.

Prefix sums for range queries. Once you know the range of packages assigned to a box (from binary search), you need their total size. Building a prefix sum array up front lets you answer any "sum from index a to index b" query in O(1). Without prefix sums, you would need to sum the range each time, turning O(1) into O(n).

Whenever you need to partition sorted data into groups and compute an aggregate (sum, count, average) for each group, think binary search + prefix sums. This pair shows up in problems about assigning items to buckets, splitting arrays into segments, and computing range-based costs.

Edge cases

  • No valid supplier: If every supplier's largest box is smaller than the largest package, return -1. The if supplier[-1] < packages[-1]: continue check handles this.
  • Single package: Works fine. Binary search finds the one package, and the formula computes waste for a single element.
  • Single box per supplier: Each supplier has exactly one box size. All packages must fit in that one box. The binary search finds all of them in one step.
  • Box size equals package size: Zero waste for that package. The formula naturally produces box * 1 - pkg = 0 when they match.
  • Very large values: Package sizes and box sizes can be up to 10^5 and there can be 10^5 of them, so the total waste can exceed 32-bit integer range. Use the modulo 10^9 + 7 only at the very end, on the final answer.
  • Duplicate box sizes in a supplier: This does not cause problems. bisect_right will return the same index for duplicate boxes, giving a count of zero and waste of zero for the duplicate.

Common mistakes

  • Forgetting to sort packages. Binary search requires sorted input. If you skip sorting, bisect_right returns garbage.
  • Applying modulo too early. If you take waste % MOD inside the loop, you corrupt the running total and the min() comparison breaks. Only apply modulo to the final answer.
  • Using bisect_left instead of bisect_right. You want the first index where a package is strictly larger than the box. bisect_right gives you the count of packages that are <= the box, which is exactly what you need. bisect_left would exclude packages equal to the box size.
  • Not checking if a supplier can handle all packages. Skipping the supplier[-1] < packages[-1] check leads to wrong answers because some packages would be left unassigned.

From understanding to recall

The algorithm is clean: sort, prefix sum, then for each supplier do a binary search per box and accumulate waste. But "clean" does not mean "easy to reproduce under pressure." In an interview you need to remember the prefix sum formula, the correct bisect variant, and the early-exit condition. These details slip if you have only read through them once.

Spaced repetition locks them in. Write the solution today, again tomorrow, then in three days. After a few rounds, the bisect_right call, the box * (j - prev) - (prefix[j] - prefix[prev]) formula, and the modulo placement become automatic. You stop thinking about which bisect function to use and start thinking about how to explain your approach.

Related posts

The combination of binary search and prefix sums is one of those patterns that, once internalized, makes a whole category of problems feel approachable. LeetCode 1889 is a great vehicle for building that intuition because it forces you to use both tools together in a way that is hard to forget.