Minimum Cost to Hire K Workers: Greedy Ratios with a Max-Heap
LeetCode 857, Minimum Cost to Hire K Workers, is a hard problem that trips people up because the optimization target is not obvious. You need to pick exactly k workers to minimize total cost, but each worker has two constraints: a quality rating and a minimum wage. The trick is figuring out what determines how much you actually pay each person.
Once you see it, the solution is clean: sort by wage-to-quality ratio, iterate through workers, and use a max-heap to keep track of the k smallest quality values. Let's break it down.
The problem
You are given n workers. Worker i has a quality[i] and a minimum wage expectation wage[i]. You want to hire exactly k workers to form a paid group. The payment rules are:
- Every worker in the group must be paid at least their minimum wage.
- Every worker in the group must be paid in proportion to their quality, relative to the other workers in the group. In other words, the ratio of payment to quality must be the same for all workers in the group.
Return the minimum total cost to hire exactly k workers.
Constraints:
n == quality.length == wage.length1 <= k <= n <= 100001 <= quality[i], wage[i] <= 10000
Example: quality = [10, 20, 5], wage = [70, 120, 25], k = 2
The answer is 105.0. You can hire Worker 0 and Worker 2, paying them 70 and 35 respectively. Both are paid at a rate of 7.0 per unit of quality, and both meet their minimum wage.
Workers with quality, wage, and wage/quality ratio (k = 2)
Workers 1 and 2 (green) form the cheapest group. The max ratio in the group (6.0) sets the pay rate, and total cost is max_ratio * total_quality.
Why brute force fails
A brute-force approach would try every combination of k workers out of n. That is C(n, k) combinations. For n = 10000 and k = 5000, that number is astronomically large. Even for moderate inputs, this is far too slow.
You also cannot just pick the k cheapest workers by wage, because the proportional payment rule means that picking a low-wage worker with high quality can force you to overpay everyone else. The interaction between quality and wage makes greedy-by-wage alone insufficient.
The key insight
Define the ratio for each worker as wage[i] / quality[i]. This ratio represents the minimum "price per unit of quality" that the worker will accept.
Here is the critical observation: if you pick a group of k workers, everyone in that group must be paid at the same rate per unit of quality (rule 2). That rate must be at least as high as the maximum ratio in the group (rule 1), otherwise someone would be paid below their minimum wage.
So the total cost of a group is:
cost = max_ratio_in_group * sum_of_qualities_in_group
This transforms the problem. To minimize cost, you want to:
- Keep
max_ratio_in_groupas small as possible (use workers with low ratios). - Keep
sum_of_qualities_in_groupas small as possible (use workers with low quality values).
These two goals conflict, which is what makes the problem interesting. A worker with a low ratio might have high quality, and vice versa. The algorithm balances both factors.
Step-by-step walkthrough
The algorithm works like this:
- Compute the ratio
wage[i] / quality[i]for each worker. - Sort workers by ratio in ascending order.
- Iterate through the sorted workers. For each worker, add their quality to a max-heap and to a running
sum_quality. - If the heap size exceeds
k, pop the largest quality from the heap (reducingsum_quality). This keeps only theksmallest quality values among workers considered so far. - When the heap has exactly
kelements, compute the cost ascurrent_ratio * sum_quality. Track the minimum cost across all iterations.
Step 1: Compute wage/quality ratios and sort workers by ratio (ascending).
Sorting by ratio ensures that as we iterate, the current worker always has the highest ratio so far. This ratio becomes the pay rate for the entire group.
Step 2: Process Worker 2 (ratio=5.0, quality=5). Push quality onto max-heap.
Heap = [5], sum_quality = 5. Only 1 worker so far, need k=2. No cost to compute yet.
Step 3: Process Worker 1 (ratio=6.0, quality=20). Push quality, now heap size = k.
Heap = [20, 5], sum_quality = 25. We have k=2 workers. Cost = max_ratio * sum_quality = 6.0 * 25 = 150. Best so far: 150.
Step 4: Process Worker 0 (ratio=7.0, quality=10). Push quality, heap grows beyond k.
After push: heap = [20, 5, 10], sum = 35. Size > k, so pop the max (20). Heap = [10, 5], sum = 15. Cost = 7.0 * 15 = 105. New best!
Step 5: All workers processed. The minimum cost is 105.
Final answer: 105. Workers 0 (quality=10) and 2 (quality=5) are hired. Worker 0 gets paid 70 (its minimum wage), Worker 2 gets paid 35 (ratio 7.0 * quality 5). Total = 70 + 35 = 105.
The code
import heapq
def min_cost_to_hire_workers(
quality: list[int],
wage: list[int],
k: int
) -> float:
# Pair each worker with their ratio, sort by ratio
workers = sorted(
zip(quality, wage),
key=lambda w: w[1] / w[0]
)
heap = [] # max-heap of quality values (negate for max-heap)
sum_quality = 0
best = float("inf")
for q, w in workers:
ratio = w / q
# Push current worker's quality (negated for max-heap)
heapq.heappush(heap, -q)
sum_quality += q
# If we have more than k workers, remove the one
# with the largest quality to minimize sum_quality
if len(heap) > k:
sum_quality += heapq.heappop(heap) # adds negative value
# When we have exactly k workers, compute cost
if len(heap) == k:
best = min(best, ratio * sum_quality)
return best
A few things to notice:
- Python's
heapqis a min-heap. To simulate a max-heap, we push negated values. When we pop, we add the negative back (which subtracts the quality fromsum_quality). - Because we sorted by ratio, the current worker always has the highest ratio in the group. So
ratiois alwaysmax_ratio_in_group. - We only compute cost when the heap has exactly
kelements. Before that, we do not have enough workers to form a valid group.
Why this works
The correctness rests on two observations:
1. The optimal group always uses the ratio of one of its members as the pay rate. The pay rate must be at least the maximum ratio among all chosen workers. Setting it to exactly the maximum ratio minimizes cost, since increasing the rate only increases total payment. So the optimal rate equals the highest ratio in the group.
2. Sorting by ratio and iterating lets us fix the pay rate. When we process worker i in sorted order, their ratio is the highest so far. Any subset of workers from index 0 to i that includes worker i will use ratio[i] as the pay rate. To minimize cost with that rate, we want the k workers (including worker i) with the smallest total quality. The max-heap tracks exactly that.
3. We never miss the optimal group. The optimal group has some worker with the maximum ratio. When we reach that worker in the sorted order, the heap will contain the k - 1 smallest quality values from all previously seen workers, plus the current worker. That is exactly the set that minimizes sum_quality for this pay rate.
Complexity analysis
| Aspect | Value | Reason |
|---|---|---|
| Time | O(n log n) | Sorting takes O(n log n). The heap loop processes n workers, each push/pop is O(log k). Total: O(n log n + n log k) = O(n log n). |
| Space | O(n) | Storing the sorted workers list takes O(n). The heap holds at most k elements, so O(k). Total: O(n). |
This is optimal for this problem. You need to examine every worker at least once, and sorting is the bottleneck. The heap operations within the loop are O(log k) each, which is dominated by the sort.
The building blocks
This problem combines two core building blocks:
Greedy sort by ratio
Many optimization problems have a hidden "rate" or "efficiency" metric. When you compute it and sort by it, the problem often decomposes into a simpler greedy scan. This same idea shows up in fractional knapsack, job scheduling by deadline-to-penalty ratio, and task assignment problems.
Max-heap to maintain top-k smallest
The max-heap of size k pattern is the flip side of the min-heap pattern from Kth Largest Element. There you keep the k largest values using a min-heap. Here you keep the k smallest values using a max-heap. The idea is the same: when the heap overflows, pop the element you want to discard (the extreme one in the wrong direction).
# Template: maintain k smallest values using a max-heap
max_heap = []
running_sum = 0
for value in stream:
heapq.heappush(max_heap, -value)
running_sum += value
if len(max_heap) > k:
running_sum += heapq.heappop(max_heap) # subtracts largest
Whenever you see "pick k items to minimize/maximize some aggregate," ask yourself: can I sort by one dimension and use a heap to optimize the other? That two-dimensional greedy pattern is the heart of this problem.
Edge cases
k = 1. You only need one worker. The cheapest option is the worker with the lowest wage, since ratio * quality = wage for a single worker. The algorithm handles this: the heap hits size 1 on the first iteration and computes cost = wage for each worker.- All workers have the same ratio. Every group of
kworkers uses the same pay rate. The optimal group is the one with the smallest total quality. The max-heap correctly selects theksmallest. - All workers have the same quality. Every group of
kworkers has the samesum_quality = k * q. The optimal group is the one with the smallest max ratio, which is the firstkworkers in sorted order. The algorithm finds this on its first valid iteration. - Large
k = n. You must hire everyone. There is only one group, and the cost ismax_ratio * total_quality. The algorithm still works: no elements are ever popped from the heap.
From understanding to recall
The hardest part of this problem is remembering the ratio insight under pressure. Once you know that cost = max_ratio * sum_quality, the rest follows naturally: sort by ratio, maintain the k smallest qualities with a max-heap.
The recipe is three lines of reasoning:
- "Equal pay rate means cost = rate * total quality."
- "Rate must be the max ratio in the group."
- "Sort by ratio, heap by quality."
Practicing that chain until it is automatic is what turns a tricky hard problem into a pattern you can execute in ten minutes. Spaced repetition locks it in. Write the solution from scratch, wait a few days, and do it again. By the third rep, you will not need to think about it.
Related posts
- Kth Largest Element in an Array - The classic heap problem that introduces the "maintain top-k with a heap" pattern
- Top K Frequent Elements - Another problem that uses heaps (or bucket sort) to find the top-k items
- Meeting Rooms II - A greedy problem that combines sorting with a heap to track overlapping intervals