Intersection of Two Arrays II: Frequency Counting for Duplicate-Aware Intersection
LeetCode 350 asks a deceptively simple question: given two arrays, return their intersection. The catch? Duplicates count.
Given nums1 = [1, 2, 2, 1] and nums2 = [2, 2], the answer is [2, 2]. Not [2]. Both 2s show up, because 2 appears at least twice in each array.
This is what makes problem 350 different from problem 349. Problem 349 uses a set, which collapses duplicates. Here, you need to track how many times each element appears in both arrays and return as many copies as the minimum count allows. A set will not cut it.
The approach: frequency counting
The cleanest solution uses a frequency map (a Counter in Python) built from one of the arrays. Then you iterate the other array and for each element, if the counter still has a remaining count for that element, you take it.
Here is the algorithm step by step:
- Build a
Counterfromnums1. This gives you{1: 2, 2: 2}. - Iterate through
nums2. For each elementn:- If
counter[n] > 0, appendnto the result and decrementcounter[n]. - Otherwise, skip it.
- If
- Return the result.
The decrement is the key move. Each time you claim a copy of an element, you reduce its available count. So if nums1 has two 2s and nums2 has three 2s, you will only claim two of them.
The code
from collections import Counter
def intersect(nums1, nums2):
count = Counter(nums1)
result = []
for n in nums2:
if count[n] > 0:
result.append(n)
count[n] -= 1
return result
Step-by-step walkthrough
Let's trace through nums1 = [1, 2, 2, 1] and nums2 = [2, 2].
Step 1: Build a frequency counter from nums1 = [1, 2, 2, 1]
Count each element in nums1. 1 appears twice, 2 appears twice. Store these counts for O(1) lookup.
Step 2: Process nums2[0] = 2. counter[2] = 2 > 0, so add to result.
2 has a remaining count of 2 in the counter. We add 2 to the result and decrement counter[2] to 1.
Step 3: Process nums2[1] = 2. counter[2] = 1 > 0, so add to result.
2 still has a remaining count of 1. We add 2 again and decrement counter[2] to 0. Both elements of nums2 matched.
Step 4: Done. Result = [2, 2]
nums2 is fully processed. The result contains both 2s because 2 appeared at least twice in both arrays.
After processing both elements of nums2, the result is [2, 2]. The counter for 2 drops to 0, so any additional 2s in nums2 would be skipped.
Complexity analysis
| Complexity | |
|---|---|
| Time | O(m + n) |
| Space | O(min(m, n)) |
Time: Building the counter from nums1 costs O(m). Iterating nums2 costs O(n). Each lookup and decrement is O(1). Total: O(m + n).
Space: The counter holds at most as many distinct entries as the smaller array has elements. If you build the counter from the smaller array, it costs O(min(m, n)).
The sorted two-pointer alternative
If both arrays are already sorted, there is a second approach that avoids extra space entirely.
Use two pointers, one for each array, starting at the beginning. Compare the elements:
- If they are equal, add the element to the result and advance both pointers.
- If the element in
nums1is smaller, advance thenums1pointer. - If the element in
nums2is smaller, advance thenums2pointer.
This runs in O(m + n) time and O(1) extra space (beyond the output). The trade-off is that you pay O(m log m + n log n) to sort if the arrays are not already sorted, which makes it slower than the counter approach in the general case. But if sorted input is guaranteed, it is a clean and memory-efficient option.
Building blocks
This problem is a direct application of the frequency map pattern. You count occurrences of elements in one collection and then consume those counts as you process a second collection. The decrement-on-match mechanic is what makes it work correctly for duplicates.
Contrast this with the set-based intersection from problem 349. A set only tells you whether an element appears at all. It cannot tell you how many times. Frequency maps give you both existence and count, which is exactly what you need when duplicates matter.
The pattern generalizes. Any time a problem cares about how many times something appears (not just whether it appears), reach for a Counter.
Edge cases
A few situations worth keeping in mind:
- No overlap. If the arrays share no elements, the counter check always fails and the result is empty.
- One array is empty. Building a counter from an empty array produces an empty counter. The result will always be empty.
- All elements match. If
nums1 = [2, 2]andnums2 = [2, 2], the result is[2, 2]. Both copies are claimed. - Repeated elements in only one array. If
nums1 = [2, 2, 2]andnums2 = [2], the result is[2]. Only one copy matches, becausenums2only has one 2.
Interview follow-ups
Interviewers often follow up with these questions:
What if the arrays are already sorted? Use two pointers as described above. No hash map needed, and you get O(1) extra space. Time is still O(m + n) after sorting.
What if nums1 fits in memory but nums2 is huge?
Build the counter from nums1 (the smaller array that fits in memory), then stream nums2 one element at a time. You never need to hold all of nums2 in memory at once. This is the key follow-up that separates candidates who understand the algorithm from candidates who just memorized it.
Which array should you build the counter from? Always the smaller one. This minimizes the space used by the counter. The iteration over the larger array is just a read, with no additional memory beyond the output.
From understanding to recall
This problem feels obvious once you see it. "Of course you count the frequencies and decrement." But three weeks from now, if a slightly different version shows up in an interview, the question is whether you reach for this pattern immediately or spend five minutes rediscovering it.
That gap between understanding and automatic recall is what spaced repetition closes. You practice reconstructing the solution from scratch at increasing intervals. After a few cycles, the Counter-plus-decrement mechanic is not something you remember. It is something you just do.
Related posts
- Intersection of Two Arrays - The set-based version (problem 349). Same name, different problem: duplicates do not count there.
- Two Sum - The complement lookup pattern. Another problem where a hash map turns O(n^2) into O(n).
- Group Anagrams - The grouping pattern. A more complex frequency map application where the key is a character count.