Count Primes: The Sieve of Eratosthenes
Given an integer n, return the number of prime numbers that are strictly less than n. A prime is a number greater than 1 that has no divisors other than 1 and itself. Sounds simple enough, but the naive approach is painfully slow for large inputs, and the elegant solution dates back over two thousand years.
This is LeetCode 204: Count Primes, and it is one of the best problems for learning the Sieve of Eratosthenes, an ancient algorithm that remains one of the most efficient ways to find all primes up to a given limit.
Why this problem matters
Count Primes is a gateway to number theory in coding interviews. The sieve pattern teaches you something that brute force checking does not: instead of testing each number individually, you can eliminate entire families of composites in bulk. That shift from "check one at a time" to "mark many at once" is a thinking pattern that shows up in other problems too.
Prime numbers appear constantly in computer science. Hash table sizing, cryptographic key generation, randomized algorithms, and number-theoretic transforms all rely on properties of primes. Understanding how to efficiently enumerate primes gives you a foundation that extends well beyond this single LeetCode problem.
The sieve also introduces you to the idea of precomputation. Rather than answering "is this number prime?" over and over, you build a lookup table once and then read off results in O(1). That trade-off between upfront work and fast queries is a recurring theme in algorithm design.
Thinking it through
The core question is: how do you count all primes below n?
The most direct approach is to check each number from 2 to n - 1 individually. For each candidate, test whether any smaller number divides it evenly. If nothing divides it, it is prime. This works, but it is slow.
The sieve flips the logic. Instead of testing each number, you start with the assumption that every number is prime, and then systematically cross out the ones that are not. You begin with 2, the smallest prime, and cross out all its multiples (4, 6, 8, ...). Then move to 3 and cross out its multiples (9, 15, 21, ...). Then 5, and so on. Every number that survives the crossing-out process is prime.
The key optimization: you only need to sieve up to the square root of n. If a number has a factor larger than its square root, it must also have a factor smaller than its square root, and that smaller factor would have already crossed it out.
The brute force approach
For each number from 2 to n - 1, check if it is prime by testing all potential divisors up to its square root.
def countPrimes(n):
count = 0
for num in range(2, n):
is_prime = True
for d in range(2, int(num ** 0.5) + 1):
if num % d == 0:
is_prime = False
break
if is_prime:
count += 1
return count
This runs in roughly O(n * sqrt(n)) time. For n = 5,000,000, that is billions of operations. It will time out on LeetCode.
The Sieve of Eratosthenes
The sieve eliminates composites in bulk instead of checking each number one at a time. You create a boolean array where index i represents whether i is prime. Start by assuming everything is prime, then iterate through the array crossing out multiples.
Step 1: Start at 2 and cross out all multiples of 2
2 is prime. Cross out 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30. That eliminates 14 composites in one pass.
Step 2: Move to 3 and cross out all multiples of 3
3 is prime (not crossed out). Cross out 9, 15, 21, 27. Multiples like 6, 12, 18 were already crossed out by the previous step.
Step 3: Move to 5 and cross out all multiples of 5
5 is prime (not crossed out). Cross out 25. All other multiples of 5 (10, 15, 20, 30) were already eliminated. Since 5 * 5 = 25 and the next prime 7 satisfies 7 * 7 = 49 which is past 30, we can stop.
The remaining unmarked numbers are all prime: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29. For countPrimes(31), the answer is 10.
There are two important optimizations baked into the algorithm:
-
Start crossing out at
p * p, not at2 * p. All smaller multiples ofphave already been crossed out by smaller primes. For example, when you reachp = 5, the multiples 10, 15, and 20 were already eliminated by 2 and 3. The first uncrossed multiple of 5 is5 * 5 = 25. -
Stop when
p * p >= n. Any composite number less thannmust have a factor less than or equal tosqrt(n). Once you have sieved all primes up tosqrt(n), every remaining unmarked number is guaranteed to be prime.
def countPrimes(n):
if n < 3:
return 0
is_prime = [True] * n
is_prime[0] = False
is_prime[1] = False
p = 2
while p * p < n:
if is_prime[p]:
for multiple in range(p * p, n, p):
is_prime[multiple] = False
p += 1
return sum(is_prime)
The inner loop for each prime p marks roughly n / p composites. Summing over all primes gives a total of O(n log log n) operations, which is nearly linear and fast enough for any constraint LeetCode throws at you.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Brute force (trial division) | O(n * sqrt(n)) | O(1) |
| Sieve of Eratosthenes | O(n log log n) | O(n) |
The sieve trades O(n) space for a massive speedup in time. For n = 5,000,000, the sieve finishes in milliseconds while brute force takes seconds or more. The space cost is a boolean array of size n, which is very manageable.
Edge cases to watch for
- n
<2: There are no primes less than 2. Return 0. This is the most common off-by-one trap. - n = 2: The only prime candidate would be 2 itself, but the problem asks for primes strictly less than
n. Return 0. - n = 3: Only 2 is prime and less than 3. Return 1.
- Large n: The sieve handles
nup to 5,000,000 comfortably. Just make sure your boolean array fits in memory (it will). - Off-by-one on the boundary: The problem says "strictly less than
n", so ifn = 10, you count primes in [2, 9], not [2, 10]. The sieve naturally handles this because the array has sizenand indices go from 0 ton - 1.
The building blocks
This problem is built on the Sieve of Eratosthenes pattern, which is a specific instance of a broader idea: precomputation with bulk elimination. Instead of answering a question one element at a time, you sweep through the data and mark large chunks at once, then read off the answer.
The structure:
- State: a boolean array
is_prime[0..n-1]initialized toTrue - Elimination: for each prime
p, mark all multiplesp*p, p*p+p, p*p+2p, ...as composite - Termination: stop when
p * p >= n - Result: count the remaining
Trueentries
The same sieve idea appears in related problems:
- Factorial Trailing Zeroes (LeetCode 172): counting factors of 5 uses a similar "sweep and accumulate" approach over a range, though without the boolean array
- Happy Number (LeetCode 202): detecting cycles in a number-theoretic sequence requires understanding digit manipulation, the same kind of math reasoning that underpins primality
- Divide Two Integers (LeetCode 29): bit manipulation and repeated doubling share the same spirit of "do work in bulk rather than one step at a time"
Once you internalize the sieve as a reusable building block, you will recognize the bulk-elimination pattern whenever it appears.
From understanding to recall
You have just walked through the Sieve of Eratosthenes from brute force to optimized, and it probably makes sense right now. But could you write it from scratch in a week? Could you remember to start at p * p instead of 2 * p, or recall exactly how to handle the n < 2 edge case under interview pressure?
Understanding and recall are different skills. Spaced repetition bridges that gap. You revisit the sieve at increasing intervals, write it from memory each time, and after a few reps the algorithm becomes automatic. No more second-guessing whether your loop should be while p * p < n or while p * p <= n.
Related posts
- Factorial Trailing Zeroes - Another math problem where you count factors across a range instead of computing the full result
- Happy Number - Cycle detection in a number-theoretic sequence, combining digit manipulation with set-based tracking
- Palindrome Number - Math-based number analysis without string conversion, using modular arithmetic