Contiguous Array: Finding Equal 0s and 1s with Prefix Sums
LeetCode #525, Contiguous Array, asks a deceptively simple question: given a binary array, find the longest contiguous subarray with an equal number of 0s and 1s. The brute force approach is tempting but slow. The optimal solution uses a clever transformation that converts this into a prefix sum problem you can solve in a single pass.
The problem
Given a binary array nums, return the maximum length of a contiguous subarray with an equal number of 0 and 1.
nums = [0, 1] -> 2 (the whole array has one 0 and one 1)
nums = [0, 1, 0] -> 2 (either [0, 1] or [1, 0])
nums = [0, 1, 0, 0, 1, 1, 0] -> 6 (indices 0 through 5: three 0s and three 1s)
The brute force approach checks every subarray, counts 0s and 1s, and tracks the longest valid one. That is O(n^2) at best. For arrays with 100,000 elements, we need something better.
The key insight
Here is the trick that makes everything click. Replace every 0 in the array with -1. Now the problem transforms: "find the longest subarray with equal 0s and 1s" becomes "find the longest subarray with sum 0."
Why does this work? In the original array, each 0 contributes 0 to the sum and each 1 contributes 1. After the transformation, each original-0 contributes -1 and each original-1 contributes +1. A subarray with equal counts of 0s and 1s has a sum of count_1 * 1 + count_0 * (-1) = 0, since count_1 == count_0.
Now we can use prefix sums. If the prefix sum at index j equals the prefix sum at index i, then the subarray from i+1 to j has a sum of 0, meaning it has equal 0s and 1s. The length of that subarray is j - i.
To find the longest such subarray, we store the first index where each prefix sum appears in a hash map. When we encounter a prefix sum we have seen before, we compute the gap. We keep only the first occurrence of each prefix sum because that maximizes the subarray length.
We seed the hash map with {0: -1} before scanning. This handles the case where the prefix sum itself reaches 0, meaning the subarray from the very start of the array has equal 0s and 1s. Without this seed, we would miss those subarrays entirely.
The solution
def findMaxLength(nums):
prefix_map = {0: -1}
prefix_sum = 0
max_len = 0
for i, num in enumerate(nums):
prefix_sum += 1 if num == 1 else -1
if prefix_sum in prefix_map:
max_len = max(max_len, i - prefix_map[prefix_sum])
else:
prefix_map[prefix_sum] = i
return max_len
Let us walk through how it works:
- Transform on the fly. Instead of modifying the array, we add
+1for a1and-1for a0directly to the running prefix sum. - Check the map. If this prefix sum was seen before at some earlier index, the subarray between that index and the current one has a sum of 0 (equal 0s and 1s). Compute the length and update the max.
- Store first occurrence only. If the prefix sum is new, record the current index. We never overwrite an existing entry because the first occurrence gives us the longest possible subarray.
- Seed with
{0: -1}. This handles subarrays that start from index 0.
Step-by-step walkthrough
Let us trace through nums = [0, 1, 0, 0, 1, 1, 0] step by step. Watch how the prefix sum rises and falls, and how repeated values in the hash map reveal subarrays with equal 0s and 1s.
Step 0: Initialize prefix_map with {0: -1}
max_length = 0. We seed the map with prefix sum 0 at index -1. If the prefix sum ever returns to 0, the subarray from the start has equal 0s and 1s.
Step 1: i=0, nums[0]=0, treat as -1. prefix_sum = -1
max_length = 0. Prefix sum -1 is not in the map. Store {-1: 0}.
Step 2: i=1, nums[1]=1, keep as 1. prefix_sum = 0
max_length = 2. Prefix sum 0 IS in the map at index -1. Length = 1 - (-1) = 2. One 0 and one 1 from index 0 to 1.
Step 3: i=2, nums[2]=0, treat as -1. prefix_sum = -1
max_length = 2. Prefix sum -1 IS in the map at index 0. Length = 2 - 0 = 2. Does not beat current max of 2.
Step 4: i=3, nums[3]=0, treat as -1. prefix_sum = -2
max_length = 2. Prefix sum -2 is not in the map. Store {-2: 3}.
Step 5: i=4, nums[4]=1, keep as 1. prefix_sum = -1
max_length = 4. Prefix sum -1 IS in the map at index 0. Length = 4 - 0 = 4. New max! Subarray from index 1 to 4 has two 0s and two 1s.
Step 6: i=5, nums[5]=1, keep as 1. prefix_sum = 0
max_length = 6. Prefix sum 0 IS in the map at index -1. Length = 5 - (-1) = 6. New max! Subarray from index 0 to 5 has three 0s and three 1s.
Step 7: i=6, nums[6]=0, treat as -1. prefix_sum = -1
max_length = 6. Prefix sum -1 IS in the map at index 0. Length = 6 - 0 = 6. Ties current max. Done! Answer is 6.
The algorithm finds the answer at step 6 when the prefix sum returns to 0. The subarray from index 0 to 5 contains three 0s and three 1s, giving a length of 6.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Brute force (all subarrays) | O(n^2) | O(1) |
| Prefix sum + hash map | O(n) | O(n) |
Time: O(n). We iterate through the array once. Each hash map lookup and insertion is O(1) on average. Total: one pass, linear time.
Space: O(n). In the worst case, every prefix sum is unique and we store n + 1 entries in the hash map. In practice, the prefix sum values range from -n to n, so the map could hold up to 2n + 1 entries at most.
The building blocks
The 0-to-negative-1 transformation
prefix_sum += 1 if num == 1 else -1
This transformation converts a "balance between two categories" problem into a "find subarray with sum 0" problem. Any time you need to find a subarray where two types of elements appear equally, this trick applies. Replace one type with -1 and the other with +1, then look for sum-zero subarrays. The same idea works in problems involving equal counts of different characters or balanced partitions.
First-occurrence prefix sum map
if prefix_sum in prefix_map:
max_len = max(max_len, i - prefix_map[prefix_sum])
else:
prefix_map[prefix_sum] = i
By storing only the first index where each prefix sum appears, we maximize the subarray length whenever we see a repeat. This is the same pattern used in Continuous Subarray Sum (LeetCode #523), where you store the first occurrence of each remainder. The principle is the same: earlier first occurrences give you longer subarrays.
Edge cases
- All zeros or all ones. The prefix sum only moves in one direction and never repeats. The answer is 0 because no subarray has equal counts.
- Array of length 1. You need at least two elements for a valid subarray. Return 0.
[0, 1]or[1, 0]. Both give a prefix sum that returns to 0 at index 1, so the answer is 2.- Entire array is balanced. If
numshas equal 0s and 1s overall, the prefix sum ends at 0. The map seed at -1 catches this, and the answer isn. - Multiple subarrays tied for longest. We only need the length, not the actual subarray. The first-occurrence strategy naturally finds the maximum.
From understanding to recall
The 0-to-negative-1 trick feels elegant once you see it. "Of course, just turn 0 into -1 and find subarrays that sum to 0." But in an interview, nobody is going to hint at the transformation. You need to make the connection between "equal counts of two things" and "prefix sum equals zero" on your own.
Spaced repetition is how you make that connection automatic. You practice writing the solution from memory at increasing intervals. After a few review cycles, you see "binary array, equal 0s and 1s" and your hands reach for the prefix sum map before your brain finishes reading the problem. That is the difference between understanding a pattern and owning it.
Related posts
- Two Sum - The foundational hash map complement lookup pattern
- Continuous Subarray Sum - Same prefix sum plus hash map technique with a modular arithmetic twist
- Maximum Subarray - Another prefix sum approach to subarray problems, using Kadane's algorithm