Skip to content
← All posts

Count Primes: The Sieve of Eratosthenes

6 min read
leetcodeproblemmediummath

Given an integer n, return the number of prime numbers that are strictly less than n. A prime is a number greater than 1 that has no divisors other than 1 and itself. Sounds simple enough, but the naive approach is painfully slow for large inputs, and the elegant solution dates back over two thousand years.

This is LeetCode 204: Count Primes, and it is one of the best problems for learning the Sieve of Eratosthenes, an ancient algorithm that remains one of the most efficient ways to find all primes up to a given limit.

Sieve of Eratosthenes: 2 through 3023456789101112131415161718192021222324252627282930primecomposite (crossed out)
Numbers 2 through 30 after running the sieve. Green cells are primes (10 total). Crossed-out cells are composites eliminated by marking multiples of 2, 3, and 5.

Why this problem matters

Count Primes is a gateway to number theory in coding interviews. The sieve pattern teaches you something that brute force checking does not: instead of testing each number individually, you can eliminate entire families of composites in bulk. That shift from "check one at a time" to "mark many at once" is a thinking pattern that shows up in other problems too.

Prime numbers appear constantly in computer science. Hash table sizing, cryptographic key generation, randomized algorithms, and number-theoretic transforms all rely on properties of primes. Understanding how to efficiently enumerate primes gives you a foundation that extends well beyond this single LeetCode problem.

The sieve also introduces you to the idea of precomputation. Rather than answering "is this number prime?" over and over, you build a lookup table once and then read off results in O(1). That trade-off between upfront work and fast queries is a recurring theme in algorithm design.

Thinking it through

The core question is: how do you count all primes below n?

The most direct approach is to check each number from 2 to n - 1 individually. For each candidate, test whether any smaller number divides it evenly. If nothing divides it, it is prime. This works, but it is slow.

The sieve flips the logic. Instead of testing each number, you start with the assumption that every number is prime, and then systematically cross out the ones that are not. You begin with 2, the smallest prime, and cross out all its multiples (4, 6, 8, ...). Then move to 3 and cross out its multiples (9, 15, 21, ...). Then 5, and so on. Every number that survives the crossing-out process is prime.

The key optimization: you only need to sieve up to the square root of n. If a number has a factor larger than its square root, it must also have a factor smaller than its square root, and that smaller factor would have already crossed it out.

The brute force approach

For each number from 2 to n - 1, check if it is prime by testing all potential divisors up to its square root.

def countPrimes(n):
    count = 0
    for num in range(2, n):
        is_prime = True
        for d in range(2, int(num ** 0.5) + 1):
            if num % d == 0:
                is_prime = False
                break
        if is_prime:
            count += 1
    return count

This runs in roughly O(n * sqrt(n)) time. For n = 5,000,000, that is billions of operations. It will time out on LeetCode.

The Sieve of Eratosthenes

The sieve eliminates composites in bulk instead of checking each number one at a time. You create a boolean array where index i represents whether i is prime. Start by assuming everything is prime, then iterate through the array crossing out multiples.

Step 1: Start at 2 and cross out all multiples of 2

23456789101112131415161718192021222324252627282930

2 is prime. Cross out 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30. That eliminates 14 composites in one pass.

Step 2: Move to 3 and cross out all multiples of 3

23456789101112131415161718192021222324252627282930

3 is prime (not crossed out). Cross out 9, 15, 21, 27. Multiples like 6, 12, 18 were already crossed out by the previous step.

Step 3: Move to 5 and cross out all multiples of 5

23456789101112131415161718192021222324252627282930

5 is prime (not crossed out). Cross out 25. All other multiples of 5 (10, 15, 20, 30) were already eliminated. Since 5 * 5 = 25 and the next prime 7 satisfies 7 * 7 = 49 which is past 30, we can stop.

Result: 10 primes less than 31

The remaining unmarked numbers are all prime: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29. For countPrimes(31), the answer is 10.

There are two important optimizations baked into the algorithm:

  1. Start crossing out at p * p, not at 2 * p. All smaller multiples of p have already been crossed out by smaller primes. For example, when you reach p = 5, the multiples 10, 15, and 20 were already eliminated by 2 and 3. The first uncrossed multiple of 5 is 5 * 5 = 25.

  2. Stop when p * p >= n. Any composite number less than n must have a factor less than or equal to sqrt(n). Once you have sieved all primes up to sqrt(n), every remaining unmarked number is guaranteed to be prime.

def countPrimes(n):
    if n < 3:
        return 0

    is_prime = [True] * n
    is_prime[0] = False
    is_prime[1] = False

    p = 2
    while p * p < n:
        if is_prime[p]:
            for multiple in range(p * p, n, p):
                is_prime[multiple] = False
        p += 1

    return sum(is_prime)

The inner loop for each prime p marks roughly n / p composites. Summing over all primes gives a total of O(n log log n) operations, which is nearly linear and fast enough for any constraint LeetCode throws at you.

Complexity analysis

ApproachTimeSpace
Brute force (trial division)O(n * sqrt(n))O(1)
Sieve of EratosthenesO(n log log n)O(n)

The sieve trades O(n) space for a massive speedup in time. For n = 5,000,000, the sieve finishes in milliseconds while brute force takes seconds or more. The space cost is a boolean array of size n, which is very manageable.

Edge cases to watch for

  • n < 2: There are no primes less than 2. Return 0. This is the most common off-by-one trap.
  • n = 2: The only prime candidate would be 2 itself, but the problem asks for primes strictly less than n. Return 0.
  • n = 3: Only 2 is prime and less than 3. Return 1.
  • Large n: The sieve handles n up to 5,000,000 comfortably. Just make sure your boolean array fits in memory (it will).
  • Off-by-one on the boundary: The problem says "strictly less than n", so if n = 10, you count primes in [2, 9], not [2, 10]. The sieve naturally handles this because the array has size n and indices go from 0 to n - 1.

The building blocks

This problem is built on the Sieve of Eratosthenes pattern, which is a specific instance of a broader idea: precomputation with bulk elimination. Instead of answering a question one element at a time, you sweep through the data and mark large chunks at once, then read off the answer.

The structure:

  • State: a boolean array is_prime[0..n-1] initialized to True
  • Elimination: for each prime p, mark all multiples p*p, p*p+p, p*p+2p, ... as composite
  • Termination: stop when p * p >= n
  • Result: count the remaining True entries

The same sieve idea appears in related problems:

  • Factorial Trailing Zeroes (LeetCode 172): counting factors of 5 uses a similar "sweep and accumulate" approach over a range, though without the boolean array
  • Happy Number (LeetCode 202): detecting cycles in a number-theoretic sequence requires understanding digit manipulation, the same kind of math reasoning that underpins primality
  • Divide Two Integers (LeetCode 29): bit manipulation and repeated doubling share the same spirit of "do work in bulk rather than one step at a time"

Once you internalize the sieve as a reusable building block, you will recognize the bulk-elimination pattern whenever it appears.

From understanding to recall

You have just walked through the Sieve of Eratosthenes from brute force to optimized, and it probably makes sense right now. But could you write it from scratch in a week? Could you remember to start at p * p instead of 2 * p, or recall exactly how to handle the n < 2 edge case under interview pressure?

Understanding and recall are different skills. Spaced repetition bridges that gap. You revisit the sieve at increasing intervals, write it from memory each time, and after a few reps the algorithm becomes automatic. No more second-guessing whether your loop should be while p * p < n or while p * p <= n.

Related posts

  • Factorial Trailing Zeroes - Another math problem where you count factors across a range instead of computing the full result
  • Happy Number - Cycle detection in a number-theoretic sequence, combining digit manipulation with set-based tracking
  • Palindrome Number - Math-based number analysis without string conversion, using modular arithmetic