Skip to content
← All posts

Find the Longest Substring Containing Vowels in Even Counts: Bitmask + HashMap Pattern

6 min read
leetcodeproblemmediumstringshash-mapbit-manipulation

LeetCode Find the Longest Substring Containing Vowels in Even Counts (Problem 1371) asks: given a string s, return the length of the longest substring where every vowel (a, e, i, o, u) appears an even number of times. Consonants can appear any number of times. Vowels that do not appear at all count as appearing zero times, which is even.

For example, given s = "leetcode", the answer is 5. The substring "leetc" contains two e's (even) and no other vowels (zero is even). That is the longest such substring.

01234567leetcodeParity bits after full scan:a0e1i0o1u0mask = 01010
Vowels highlighted in blue. Each vowel toggles one bit in the bitmask. Green = even count (bit 0), yellow = odd count (bit 1).

Why this problem matters

This problem combines two powerful techniques: encoding parity state as a bitmask and using a hash map to find the earliest occurrence of that state. You will see this same combination in problems involving prefix XOR, subarray parity constraints, and any scenario where you need to find the longest or shortest subarray satisfying some modular condition. Mastering this pattern gives you a reusable template for an entire family of problems.

The key insight: parity bitmask

You do not care about the exact counts of each vowel. You only care whether each count is even or odd. That is a single bit of information per vowel. With five vowels, you can encode the entire parity state in a 5-bit integer (values 0 to 31).

As you scan the string left to right, toggle the corresponding bit each time you encounter a vowel. If the bitmask at position i equals the bitmask at an earlier position j, then every vowel appeared an even number of times in the substring s[j+1..i]. Why? Because toggling a bit an even number of times returns it to its original value. Equal bitmasks mean equal parities, which means even counts in between.

To find the longest such substring, you want the earliest index where each bitmask value first appeared. Store the first occurrence of each mask in a hash map. Initialize with {0: -1} because before the string starts, all vowel counts are zero (mask 0), effectively at index -1.

For each position i, if the current mask was seen before at index j, the substring from j+1 to i has length i - j. Track the maximum.

Python solution

def findTheLongestSubstring(s: str) -> int:
    vowel_map = {'a': 0, 'e': 1, 'i': 2, 'o': 3, 'u': 4}
    mask = 0
    seen = {0: -1}
    best = 0
    for i, ch in enumerate(s):
        if ch in vowel_map:
            mask ^= (1 << vowel_map[ch])
        if mask in seen:
            best = max(best, i - seen[mask])
        else:
            seen[mask] = i
    return best

How the solution works

The vowel_map assigns each vowel a bit position (0 through 4). The mask variable holds the running parity state. Each time you see a vowel, you XOR the mask with 1 << bit_position, flipping that vowel's parity bit.

The seen dictionary maps each mask value to the first index where it appeared. You seed it with {0: -1} so that if the mask returns to 0 at index i, the length is i - (-1) = i + 1, covering the entire prefix.

At each index, if the current mask already exists in seen, the difference i - seen[mask] gives the length of a valid substring. You never update seen[mask] once it is set, because you want the earliest occurrence to maximize the substring length.

Step-by-step walkthrough

Let's trace through the algorithm on s = "leetcode". The answer should be 5.

Initialize: mask = 0, seen = {0: -1}

0l1e2e3t4c5o6d7emask = 00000 (0)best = 0seen map:mask 0 at index -1

Before scanning, all vowel counts are zero (even). Record mask 0 at position -1.

Step 1: i = 0, char = 'l' (consonant)

0l1e2e3t4c5o6d7emask = 00000 (0)best = 1seen map:mask 0 at index -1

'l' is not a vowel, so the mask stays 00000. We saw mask 0 at index -1, so length = 0 - (-1) = 1. Best = 1.

Step 2: i = 1, char = 'e' (vowel, bit 1)

0l1e2e3t4c5o6d7emask = 00010 (2)best = 1seen map:mask 0 at index -1mask 2 at index 1

'e' toggles bit 1. Mask becomes 00010 (decimal 2). First time seeing mask 2, so store seen[2] = 1.

Step 3: i = 2, char = 'e' (vowel, toggles bit 1 back)

0l1e2e3t4c5o6d7emask = 00000 (0)best = 3seen map:mask 0 at index -1mask 2 at index 1

Second 'e' toggles bit 1 back to 0. Mask returns to 00000. We saw 0 at index -1, so length = 2 - (-1) = 3. Best = 3.

Step 4: i = 4, char = 'c' (consonant, mask unchanged)

0l1e2e3t4c5o6d7emask = 00000 (0)best = 5seen map:mask 0 at index -1mask 2 at index 1

After 't' and 'c', mask is still 00000. Seen at -1, so length = 4 - (-1) = 5. Best = 5. Substring s[0..4] = "leetc" has two e's (even).

Step 5: i = 5, char = 'o' (vowel, bit 3)

0l1e2e3t4c5o6d7emask = 01000 (8)best = 5seen map:mask 0 at index -1mask 2 at index 1mask 8 at index 5

'o' toggles bit 3. Mask becomes 01000 (decimal 8). First time seeing 8, store seen[8] = 5. Best stays 5.

Final: i = 7, char = 'e' (end of string)

0l1e2e3t4c5o6d7emask = 01010 (10)best = 5seen map:mask 0 at index -1mask 2 at index 1mask 8 at index 5mask 10 at index 7

'e' toggles bit 1 again. Mask = 01010 (10). New mask, stored. At i=6 (d), mask 8 was seen at 5, only length 1. Answer = 5.

The longest valid substring is "leetc" (indices 0 through 4), where the two e's cancel out to even and no other vowels appear. The mask stayed at 00000 from index -1 through index 4, giving length 5.

The XOR operation is the key to this pattern. XOR with 1 toggles a bit: even becomes odd, odd becomes even. Two XORs cancel out, returning the bit to its original state. This is exactly the parity tracking you need.

Complexity analysis

MetricValue
TimeO(n), single pass through the string
SpaceO(1), the seen map has at most 32 entries (2^5 possible masks)

Each character is processed once with O(1) work: a hash map lookup and possibly a bit flip. The hash map stores at most 32 entries since the mask can only take values 0 through 31. This makes the space effectively constant regardless of input size.

Building blocks

This problem decomposes into reusable building blocks that appear across many other problems:

Bitmask parity encoding

Encoding even/odd state as bits in an integer. Each bit tracks a binary property (even or odd count), and XOR toggles it. This same technique works any time you need to track parity of multiple independent counters simultaneously. You will see it in problems about subarray XOR, character frequency parity, and game state tracking.

First-occurrence hash map

Storing the first index where each state was seen, then computing the distance when the state repeats. This is the same pattern used in "Subarray Sum Equals K" with prefix sums. The general principle: if you can encode your constraint as a running state, and equal states at two positions mean the subarray between them satisfies the constraint, then a first-occurrence map gives you the longest (or shortest) valid subarray in O(n).

XOR for parity toggling

mask ^= (1 << bit_position)

XOR with a single set bit flips exactly that bit. Two flips cancel. This is the fundamental operation for tracking even/odd state without maintaining full counts.

Edge cases

No vowels at all. If s contains only consonants, the mask never changes from 0. Every position matches the initial state at -1, so the answer is the full string length. The algorithm handles this correctly.

All vowels, all the same. For s = "eeee", the mask alternates between 0 and 2. At index 3, the mask returns to 0 (four e's, even count), giving length 4. The full string is valid.

Single character. If s has length 1 and it is a consonant, the answer is 1 (mask stays 0). If it is a vowel, the answer is 0 (mask becomes nonzero and was never seen before at a different index, so no valid substring exists).

All five vowels each appearing once. For s = "aeiou", each vowel toggles a different bit. The mask at the end is 11111 (31). No intermediate mask repeats the initial 0, so the answer is 0. No substring has all vowels appearing an even number of times (except the empty substring, which does not count).

From understanding to recall

The bitmask plus hash map pattern is elegant, but it requires connecting several ideas: parity as bits, XOR for toggling, equal masks meaning even counts in between, and first-occurrence for maximizing length. In an interview, you need all of these pieces to click together quickly.

Spaced repetition helps you internalize the full chain of reasoning. You write the solution from scratch today. In a few days, you do it again. Each repetition strengthens the connection between "even counts constraint" and "bitmask + first-occurrence map." Eventually, seeing "even counts" in a problem statement immediately triggers the bitmask approach without conscious effort.

Related posts

  • Subarray Sum Equals K - Uses the same first-occurrence hash map pattern, but with prefix sums instead of bitmasks. Both problems find subarrays where a running state matches an earlier state.
  • Longest Substring Without Repeating Characters - Another "longest substring" problem, solved with sliding window instead of bitmask. Comparing the two approaches highlights when to use each technique.
  • Single Number - Uses XOR cancellation to find an unpaired element. The same XOR property (two toggles cancel out) is the foundation of the parity tracking here.

The bitmask parity pattern is one of those techniques that feels like magic the first time you see it, then becomes second nature with practice. Once you own it, an entire class of parity-based substring problems opens up.