Find Longest Awesome Substring: Bitmask Parity for Palindrome Substrings
LeetCode Find Longest Awesome Substring (Problem 1542) asks you to find the longest substring of a digit string that can be rearranged into a palindrome. The trick is recognizing that palindrome rearrangement is entirely about frequency parities, and you can track those parities with a bitmask.
The problem
Given a string s consisting of digits '0' through '9', return the length of the longest substring that can be rearranged to form a palindrome.
A string can be rearranged into a palindrome if at most one character has an odd frequency. For example, "24241" can become "42142" or "24142", both palindromes, because digits 2 and 4 each appear twice (even) and digit 1 appears once (odd, but only one such digit).
The brute force approach
You could check every substring: for each pair of indices (i, j), count the character frequencies and check whether at most one has an odd count. There are O(n^2) substrings, and counting frequencies takes O(n) for each one (or O(1) if you maintain a running count). Even the optimized O(n^2) version is too slow when n can reach 100,000.
The key insight
You do not need the exact frequencies. You only need the parity (odd or even) of each digit's frequency. There are 10 possible digits (0 through 9), so you can represent the parity state as a 10-bit bitmask. Bit d is 1 if digit d has appeared an odd number of times, and 0 if even.
When you process a character, you flip its corresponding bit. The bitmask after processing index j encodes the parity of every digit's frequency in s[0..j].
Here is the critical connection: the parity state of a substring s[i+1..j] equals mask[j] XOR mask[i]. If mask[j] XOR mask[i] has at most one bit set, then substring s[i+1..j] can be rearranged into a palindrome.
So the problem reduces to: for each position j, find the earliest position i where mask[i] equals mask[j] (zero bits differ, meaning all parities are even) or mask[i] differs from mask[j] by exactly one bit (one odd-frequency character allowed).
This is the same "prefix state + hash map" pattern you see in subarray sum problems, but instead of prefix sums you are tracking prefix parity bitmasks. Store the first occurrence of each mask value, and for each new position, check whether the current mask or any mask one bit away has been seen before.
Step-by-step walkthrough
Let's trace through s = "3242415". We maintain a bitmask mask that tracks digit frequency parities, and a hash map first_seen that records the first index where each mask value appeared. We initialize first_seen[0] = -1 because the empty prefix has mask 0.
Initialize: mask = 0, first_seen = {0: -1}
Before processing any character, the mask is 0 (all parities even). Store first occurrence of mask 0 at index -1.
Step 1: Process s[0] = '3'
Flip bit 3. mask = 8. Check mask XOR (1<<3) = 0, which was seen at -1. Length = 0 - (-1) = 1. best = 1.
Step 2: Process s[1] = '2'
Flip bit 2. mask = 12. No matching mask or one-bit-off mask improves best. best stays 1.
Step 3: Process s[2] = '4'
Flip bit 4. mask = 28. No improvements found. best stays 1.
Step 4: Process s[3] = '2'
Flip bit 2. mask = 24. Check mask XOR (1<<4) = 8, seen at 0. Length = 3 - 0 = 3. best = 3.
Step 5: Process s[4] = '4'
Flip bit 4. mask = 8, already seen at 0. Length = 4 - 0 = 4. Also mask XOR (1<<3) = 0, seen at -1. Length = 4 - (-1) = 5. best = 5.
Step 6: Process s[5] = '1'
Flip bit 1. mask = 10. Check mask XOR (1<<1) = 8, seen at 0. Length = 5 - 0 = 5. No improvement. best stays 5.
Step 7: Process s[6] = '5'
Flip bit 5. mask = 42. No match improves best. Final answer = 5.
The longest awesome substring has length 5: "32424" (indices 0 through 4), where mask returned to 8 and mask XOR (1<<3) = 0 was first seen at -1, giving length 4 - (-1) = 5.
The code
def longestAwesome(s: str) -> int:
first_seen = {0: -1}
mask = 0
best = 0
for i, ch in enumerate(s):
mask ^= 1 << int(ch)
# Check if current mask was seen before (all parities even)
if mask in first_seen:
best = max(best, i - first_seen[mask])
else:
first_seen[mask] = i
# Check masks that differ by exactly one bit
for d in range(10):
flipped = mask ^ (1 << d)
if flipped in first_seen:
best = max(best, i - first_seen[flipped])
return best
At each position, you flip the bit for the current digit and then perform two checks. First, you see if the exact same mask appeared earlier, which means every digit in the substring has even frequency. Second, you try flipping each of the 10 bits to check if there is an earlier position where the mask differed by exactly one bit, which means exactly one digit has odd frequency. In both cases, you want the earliest occurrence to maximize substring length.
Why store only the first occurrence? Because you want the longest substring. If the same mask appears at indices 2 and 7, using the earlier index gives a longer substring. Later occurrences of the same mask never produce a better result.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Brute force (check all substrings) | O(n^2) | O(1) |
| Bitmask + hash map | O(10n) = O(n) | O(2^10) = O(1) |
Time: O(n). You scan the string once. At each position, you perform at most 11 hash map lookups (one for the current mask plus 10 for single-bit flips). Each lookup is O(1), so the total is O(11n), which simplifies to O(n).
Space: O(1). The mask is a 10-bit integer, so there are at most 1024 possible mask values. The hash map stores at most 1024 entries. This is constant with respect to the input size.
The building blocks
Bitmask parity tracking
The core technique is using a bitmask to track the parity of character frequencies. Each bit represents whether a particular character has been seen an odd or even number of times. XOR is the natural operation here because x XOR x = 0: toggling a bit twice returns it to its original state, which is exactly how parity works.
This same approach appears in problems like Pseudo-Palindromic Paths in a Binary Tree, where you track digit parities along root-to-leaf paths using a bitmask. Any time a problem asks about frequency parities, a bitmask should come to mind.
Prefix state + first occurrence map
The pattern of storing prefix states in a hash map and looking up complementary states is a generalization of the prefix sum technique. In Subarray Sum Equals K, you store prefix sums and look for prefix_sum - k. Here, you store prefix parity masks and look for masks that are equal or one bit away. The underlying structure is the same: you are looking for a pair of prefix states whose difference (or XOR) satisfies some condition.
Edge cases
Single character. Any single character is trivially a palindrome (it reads the same forwards and backwards). The algorithm handles this because at index 0, mask ^ (1 << d) for the current digit gives 0, which is in first_seen at -1, producing length 1.
All characters the same. If s = "1111", then the mask toggles between 0 and 2. At index 1, mask returns to 0, which was first seen at -1, giving length 2. At index 3, mask returns to 0 again, giving length 4. The entire string is awesome because every character has even frequency (or, for odd lengths, one has odd frequency). The algorithm correctly returns n.
All characters distinct. If s = "0123456789", the mask accumulates set bits. The best you can do is a single character (length 1) or any substring where you allow one odd-frequency character. The algorithm will check single-bit-flip masks and find the best possible result.
String of length 1. The answer is always 1. The initialization first_seen[0] = -1 ensures this works.
From understanding to recall
The main challenge in this problem is connecting three ideas: (1) palindrome rearrangement depends only on frequency parities, (2) you can encode parities as a bitmask, and (3) the prefix state + hash map pattern lets you find optimal substrings in linear time. Each idea is simple on its own, but combining them requires practice.
When you drill this problem, focus on the moment you translate "at most one odd frequency" into "bitmask with at most one bit set." That is the conceptual leap. If you can articulate why XOR of two prefix masks gives the parity state of the substring between them, you own the technique.
Also notice the similarity to other prefix-state problems. The hash map stores first occurrences, not counts, because you want the longest (not the count of) matching substrings. This is a small but critical detail that differs from Subarray Sum Equals K.
Related posts
- Subarray Sum Equals K - The classic prefix state + hash map pattern. Instead of bitmasks, you store prefix sums and look for complementary values. The structural similarity is strong.
- Bitwise ORs of Subarrays - Another problem where bitwise operations on subarrays reveal structure. OR is monotonic while XOR toggles, but both exploit bounded bit widths for efficiency.
- Can Make Palindrome from Substring - Uses the same bitmask parity technique to answer palindrome queries on substrings. A direct relative of this problem with a query-based twist.