Task Scheduler: Greedy Frequency Counting
Task Scheduler (LeetCode 621) is one of the best examples of a greedy problem where the math just clicks once you see it. You have a list of CPU tasks and a cooldown period n. The same task cannot run again until at least n intervals have passed. Your job is to find the minimum number of intervals the CPU needs to finish all tasks.
The naive approach is to simulate the scheduling process. But there is a direct formula that gives you the answer in O(n) time with no simulation at all. Let's build up to it.
The problem
You are given an array of characters tasks where each character represents a task type, and a non-negative integer n representing the cooldown between two identical tasks. Each interval, the CPU either executes a task or sits idle. Return the minimum number of intervals required to complete all tasks.
Example: tasks = ["A","A","A","B","B","B"], n = 2
The answer is 8. You can schedule it as A B _ A B _ A B. Each A is at least 3 intervals apart (cooldown n = 2 means n + 1 = 3 between same tasks). Same for each B. The two idle slots are unavoidable because we do not have enough distinct tasks to fill every frame.
Why the most frequent task matters
Think about it this way: the task that appears the most often is the bottleneck. It dictates the minimum timeline because it needs the most cooldown gaps. Every other task just fills in around it.
If task A appears 3 times and n = 2, then A alone forces the timeline to be at least A _ _ A _ _ A, which is 7 intervals. Any other tasks just replace some of those idle slots. If we have enough other tasks, the idle slots disappear entirely. If we do not, some idles remain.
That is the core insight behind greedy scheduling: schedule the most frequent task first, then fill the gaps with everything else.
The frame formula
Here is the math. Let f be the frequency of the most frequent task.
- The most frequent task creates
f - 1"frames" of sizen + 1, plus one final slot for the last occurrence. - The base length from these frames is
(f - 1) * (n + 1). - Now count how many tasks share that maximum frequency. Call this
count_max. Each of these tasks adds one extra slot at the end (they all need that final occurrence after the last frame). - The formula gives:
(f - 1) * (n + 1) + count_max. - But if there are so many distinct tasks that idle slots are never needed, the answer is simply the total number of tasks. So the final answer is
max((f - 1) * (n + 1) + count_max, len(tasks)).
Step 1: Count frequencies. A = 3, B = 3. Most frequent count (f) = 3.
Start by counting each task. The most frequent task drives the schedule.
Step 2: Build frames around the most frequent task. (f - 1) frames of size (n + 1).
Place A at every (n+1) = 3 positions. This creates 2 frames of 3 slots each, plus the last A.
Step 3: Fill in B. It fits in the empty slots within each frame.
B has the same frequency as A, so it fills one slot per frame and gets an extra slot at the end.
Step 4: Count total intervals. (f - 1) * (n + 1) + count_of_max = 2 * 3 + 2 = 8.
Answer = max(formula result, total tasks) = max(8, 6) = 8. The idle slots push the total to 8.
The code
Here is the formula approach in Python:
from collections import Counter
def least_interval(tasks: list[str], n: int) -> int:
counts = Counter(tasks)
f = max(counts.values())
count_max = sum(1 for v in counts.values() if v == f)
return max((f - 1) * (n + 1) + count_max, len(tasks))
Let's trace through it with tasks = ["A","A","A","B","B","B"], n = 2:
counts = {"A": 3, "B": 3}f = 3(the max frequency)count_max = 2(both A and B appear 3 times)- Formula:
(3 - 1) * (2 + 1) + 2 = 2 * 3 + 2 = 8 max(8, 6) = 8
That is it. Four lines of actual logic.
Walking through more examples
Let's make sure the formula works for different scenarios.
Example 2: tasks = ["A","A","A","B","B","B"], n = 0
No cooldown. You can run tasks back to back: A B A B A B. That is 6 intervals, which is just the total number of tasks.
f = 3,count_max = 2, formula =(3 - 1) * (0 + 1) + 2 = 4max(4, 6) = 6
The max with len(tasks) handles this. When n is small, the formula underestimates because you never actually need idle slots.
Example 3: tasks = ["A","A","A","A","A","A","B","C","D","E","F","G"], n = 2
counts = {"A": 6, "B": 1, "C": 1, "D": 1, "E": 1, "F": 1, "G": 1}f = 6,count_max = 1(only A has the max)- Formula:
(6 - 1) * (2 + 1) + 1 = 5 * 3 + 1 = 16 max(16, 12) = 16
A dominates the schedule. Even though we have 6 other tasks to fill gaps, 5 frames of 3 slots each need 10 filler slots, and we only have 6 other tasks. So 4 idle slots remain.
When does the formula equal len(tasks) exactly? When there are enough distinct task types to fill every idle slot in every frame. With many different tasks and a small cooldown, you never idle. The formula captures both cases with a single max call.
Why this works (the greedy argument)
The greedy correctness is intuitive. The most frequent task is the constraint. It forces a minimum spacing. Every other task either:
- Fills an idle slot in one of the frames, reducing total idles by one.
- Extends the schedule if all frames are already full (which means we are past the formula's estimate and
len(tasks)takes over).
No matter how you arrange the tasks, you cannot do better than the formula. The most frequent task forces (f - 1) gaps of size n, and you need at least count_max slots after the last gap for every task that ties for the highest frequency. And you obviously need at least len(tasks) intervals since every task must execute once.
The max of these two lower bounds is tight because the greedy arrangement (most frequent first, fill in order) actually achieves it.
Complexity analysis
Time: O(n). Building the frequency counter takes one pass over the tasks. Finding the max and counting ties is one pass over the counter (at most 26 entries for uppercase letters). Everything is linear.
Space: O(1). The counter has at most 26 entries. No additional data structures.
| Approach | Time | Space | Notes |
|---|---|---|---|
| Simulation with heap | O(n log 26) | O(1) | Works but overkill |
| Greedy formula | O(n) | O(1) | Direct computation |
Both approaches are effectively linear since the task alphabet is bounded at 26. But the formula approach is cleaner and avoids the simulation overhead entirely.
Edge cases to watch for
- n = 0. No cooldown. Answer is always
len(tasks). The formula handles this because(f - 1) * 1 + count_maxwill always be at mostlen(tasks)when there is no forced spacing. - All tasks identical.
tasks = ["A","A","A"],n = 2. The schedule isA _ _ A _ _ Awhich is 7. Formula:(3 - 1) * 3 + 1 = 7. Correct. - All tasks distinct.
tasks = ["A","B","C","D"],n = 3. Every task appears once, sof = 1,count_max = 4. Formula:(1 - 1) * 4 + 4 = 4. Same aslen(tasks). No idle needed. - Single task.
tasks = ["A"],n = 100. Formula:(1 - 1) * 101 + 1 = 1. Just one interval.
The building blocks
This problem is built from one core building block:
Frequency counter
Count how many times each task appears. This is the same building block you see in Top K Frequent Elements, Valid Anagram, and Group Anagrams. The template:
from collections import Counter
counts = Counter(tasks)
# or manually:
counts = {}
for task in tasks:
counts[task] = counts.get(task, 0) + 1
What makes Task Scheduler interesting is what comes after counting. Instead of sorting by frequency or grouping by key, you use the max frequency to compute a closed-form answer. The frequency counter gives you the raw data; the greedy formula turns it into the solution in constant time.
The "use max frequency to build a frame" technique also appears in problems like Reorganize String (LeetCode 767) and Rearrange String k Distance Apart (LeetCode 358). Same core idea: the most frequent element dictates the structure, and everything else fills in around it.
The simulation approach (for completeness)
Some interviewers want to see the heap-based simulation. The idea: use a max heap of remaining task counts. Each round, pop up to n + 1 tasks from the heap, execute them, push back any that still have remaining count, and add idle time if needed.
import heapq
from collections import Counter
def least_interval_simulation(tasks: list[str], n: int) -> int:
counts = Counter(tasks)
heap = [-c for c in counts.values()]
heapq.heapify(heap)
time = 0
while heap:
cycle = n + 1
temp = []
while cycle > 0 and heap:
count = heapq.heappop(heap)
if count + 1 < 0:
temp.append(count + 1)
cycle -= 1
time += n + 1 - cycle
if not heap and not temp:
time -= cycle
for c in temp:
heapq.heappush(heap, c)
return time
This works but is more code and harder to get right under pressure. The formula approach is the one to memorize.
Why you will forget this (and how to fix it)
The frequency counting part is automatic at this point. You have written Counter(tasks) plenty of times. The part that slips is the formula: (f - 1) * (n + 1) + count_max. Under time pressure, you might second-guess whether it is f - 1 or f, or n or n + 1, or where count_max goes.
The trick is to visualize the frames. Draw out A _ _ A _ _ A and count the frames. There are f - 1 = 2 full frames, each of size n + 1 = 3. Then one more slot for the final A. That picture makes the formula obvious. Lock in that mental image and the math writes itself.
Spaced repetition is how you make the formula permanent. Type the solution from scratch. After a few reps at increasing intervals, "count frequencies, find max, apply frame formula" becomes muscle memory. No more second-guessing the arithmetic.
Related posts
- Top K Frequent Elements - Uses the same frequency counter building block, then selects the k highest counts with bucket sort
- Merge Intervals - Another scheduling and interval problem that tests a different pattern (sort and sweep instead of greedy counting)
Visual Learner? Understand why the greedy formula is O(n) with our Big O Notation Guide.