Skip to content
← All posts

Majority Element II: Boyer-Moore Voting for n/3 Threshold

5 min read
leetcodeproblemmediumarrays

LeetCode Majority Element II (problem 229) is a clever follow-up to the classic Majority Element problem. Instead of finding the element that appears more than n / 2 times, you need to find all elements that appear more than n / 3 times. The trick is that there can be at most two such elements, and you can find them in O(n) time with O(1) space using an extended version of the Boyer-Moore voting algorithm.

302132At most 2 elements can appear more than n/3 timescand13cand2-3 appears 2 times out of 3 (more than n/3 = 1)
Given nums = [3, 2, 3], return [3]. The value 3 appears more than n/3 times. The extended Boyer-Moore voting algorithm tracks two candidate slots.

The approach

The key observation is mathematical: at most two elements can appear more than n / 3 times in an array of length n. Think about it. If three elements each appeared more than n / 3 times, the total count would exceed n, which is impossible.

This means you only need two candidate slots. The extended Boyer-Moore voting algorithm works like this:

  1. Maintain two candidates (cand1, cand2) and two counters (count1, count2).
  2. For each element in the array:
    • If it matches cand1, increment count1.
    • Else if it matches cand2, increment count2.
    • Else if count1 is 0, set cand1 to this element and count1 to 1.
    • Else if count2 is 0, set cand2 to this element and count2 to 1.
    • Otherwise, decrement both counts. (This "cancels" one occurrence of each candidate with one non-candidate element.)
  3. After the first pass, you have two candidates. But unlike the n / 2 version, having a majority is not guaranteed. You need a second pass to verify that each candidate actually appears more than n / 3 times.

The verification step is critical. Without it, you could return elements that are not actually above the threshold.

The solution

def majority_element(nums: list[int]) -> list[int]:
    cand1, cand2, count1, count2 = None, None, 0, 0

    for num in nums:
        if num == cand1:
            count1 += 1
        elif num == cand2:
            count2 += 1
        elif count1 == 0:
            cand1, count1 = num, 1
        elif count2 == 0:
            cand2, count2 = num, 1
        else:
            count1 -= 1
            count2 -= 1

    threshold = len(nums) // 3
    result = []
    if cand1 is not None and nums.count(cand1) > threshold:
        result.append(cand1)
    if cand2 is not None and nums.count(cand2) > threshold:
        result.append(cand2)
    return result
  1. Two candidate slots. You track two potential majority elements simultaneously. This mirrors the fact that at most two elements can exceed n / 3.

  2. Order of checks matters. You must check whether the current element matches an existing candidate before checking if a count is 0. Otherwise you could assign the same value to both candidate slots.

  3. Triple cancellation. When the element matches neither candidate and both counts are positive, decrementing both counts is like "canceling" three distinct elements (one from each candidate slot and the current element). Any element with more than n / 3 occurrences survives this process.

  4. Verification pass. The first pass only identifies candidates. The second pass counts their actual occurrences to confirm they exceed the n / 3 threshold. This is the main difference from the original Majority Element problem, where a majority is guaranteed.

  5. Return value is a list. The result can contain 0, 1, or 2 elements depending on how many values exceed the threshold.

Step-by-step walkthrough

Let's trace through nums = [1, 2, 1, 1, 3, 2, 2] to see how the two candidate slots and their counts evolve, followed by the verification pass.

Step 1: nums[0] = 1. count1 is 0, so set cand1 = 1, count1 = 1.

10211213342526currentcand1:1c1:1cand2:-c2:0

First element becomes candidate 1.

Step 2: nums[1] = 2. count2 is 0, so set cand2 = 2, count2 = 1.

10211213342526currentcand1:1c1:1cand2:2c2:1

2 is not cand1, and count2 is 0. Second slot gets filled.

Step 3: nums[2] = 1. Matches cand1. count1 = 2.

10211213342526currentcand1:1c1:2cand2:2c2:1

Matches candidate 1, increment count1.

Step 4: nums[3] = 1. Matches cand1. count1 = 3.

10211213342526currentcand1:1c1:3cand2:2c2:1

Another match for candidate 1.

Step 5: nums[4] = 3. No match. Decrement both counts.

10211213342526currentcand1:1c1:2cand2:2c2:0

3 matches neither candidate. Both counts decrease by 1. count2 drops to 0.

Step 6: nums[5] = 2. count2 is 0, so set cand2 = 2, count2 = 1.

10211213342526currentcand1:1c1:2cand2:2c2:1

count2 was 0, so 2 becomes the new candidate 2.

Step 7: nums[6] = 2. Matches cand2. count2 = 2.

10211213342526currentcand1:1c1:2cand2:2c2:2

Voting phase done. Candidates are 1 and 2. Now verify.

Verification: count actual occurrences of candidates 1 and 2.

10211213342526cand1:1c1:3cand2:2c2:3actual counts: 1 appears 3x, 2 appears 3x

Both 1 and 2 appear 3 times, which exceeds n/3 = 7/3 = 2.33. Result: [1, 2].

Complexity analysis

ApproachTimeSpaceNotes
Hash map (frequency counter)O(n)O(n)Count all, filter by threshold
Extended Boyer-Moore votingO(n)O(1)Two passes, constant extra space

The hash map approach is perfectly fine and easier to implement. But the Boyer-Moore extension achieves the same linear time with constant space, which is the optimal you can do for this problem.

Building blocks

Extended Boyer-Moore voting

The original Boyer-Moore voting algorithm tracks one candidate for the n / 2 threshold. This problem generalizes it to two candidates for n / 3. The same idea extends further: for a n / k threshold, you would track k - 1 candidates. The core principle is always the same: elements appearing more than n / k times cannot be fully "canceled out" by the remaining elements.

Frequency counter

The verification pass is a basic frequency count. You could also solve the entire problem with a hash map counting every element's frequency, then filtering for counts above n / 3. That approach trades O(1) space for simpler logic.

Two-pass verification

Many voting and streaming algorithms produce candidates that need verification. The pattern of "find candidates in pass 1, confirm in pass 2" shows up in problems where the input does not guarantee a winner exists. Recognizing when verification is needed (and when it is not) is an important skill.

Edge cases

  • Empty array. Return an empty list. There are no elements to check.
  • Single element. [5] returns [5] because 1 occurrence is more than 1 / 3.
  • All elements the same. [2, 2, 2, 2] returns [2]. Only one candidate survives, and it clearly exceeds the threshold.
  • No element exceeds the threshold. [1, 2, 3] returns an empty list. Each appears once, and 1 is not more than 3 / 3 = 1. The verification pass catches this.
  • Exactly two majority elements. [1, 1, 2, 2, 3] returns [1, 2]. Both appear twice, which exceeds 5 / 3 = 1.67.
  • Threshold boundary. An element appearing exactly n / 3 times does not qualify. The requirement is strictly greater than n / 3.

The order of the if/elif checks in the voting phase is crucial. Always check for candidate matches first, then check for zero counts. Swapping this order can cause both slots to hold the same value, which breaks the algorithm.

From understanding to recall

The extended Boyer-Moore voting algorithm is one of those techniques that feels clear when you read through it but slips away when you try to write it from scratch under pressure. The tricky parts are the order of the conditional checks and remembering that you need a verification pass. Spaced repetition locks in both the structure and the edge cases so that when you see a "more than n/k" threshold problem, you reach for the right tool automatically.

Related problems

  • Majority Element - The original Boyer-Moore voting problem with a single candidate and n / 2 threshold
  • Top K Frequent Elements - Another frequency-based problem, solved optimally with bucket sort
  • Contains Duplicate - Uses a hash set to track seen elements, a simpler version of frequency counting