Intersection of Two Arrays: Set Lookup for Unique Common Elements
LeetCode 349 - Intersection of Two Arrays asks a deceptively simple question: given two integer arrays, return their intersection. The twist is that each element in the result must be unique, even if it appears multiple times in both inputs.
For example, given nums1 = [4, 9, 5] and nums2 = [9, 4, 9, 8, 4], the answer is [9, 4]. Both 9 and 4 appear in both arrays, but each shows up only once in the result. The 8 from nums2 has no match in nums1, so it is excluded. The repeated 9 and 4 in nums2 are counted only once.
The key constraint is uniqueness. This is what separates 349 from its sibling problem, Intersection of Two Arrays II (LeetCode 350), which allows duplicates in the result based on how many times an element appears in both arrays. For this problem, a set is the natural tool.
The approach
The solution has two steps:
- Convert
nums1into a set. This gives you an O(1) membership check for any value. - Iterate through
nums2. For each element, check whether it is inset1. If it is, add it to a result set. Using a set for results automatically handles deduplication.
That is it. One pass through nums2, one O(1) check per element.
The reason you convert nums1 (not nums2) is that you want to look up values from nums2 against the first array. You could also go the other direction and build a set from nums2, then iterate nums1. Either works. The convention is to build the set from the smaller array to minimize space usage, but both give the same time complexity.
The code
def intersection(nums1, nums2):
set1 = set(nums1)
return list(set(n for n in nums2 if n in set1))
The generator expression n for n in nums2 if n in set1 filters nums2 to only elements present in set1. Wrapping that in set(...) deduplicates the results. list(...) converts back to the required output type.
Step-by-step walkthrough
Let's trace through nums1 = [4, 9, 5] and nums2 = [9, 4, 9, 8, 4] one step at a time.
Convert nums1 to a set for O(1) lookup
[4, 9, 5]{4, 9, 5}Building a set from nums1 takes O(m) time and O(m) space, where m = len(nums1). Every membership check against set1 is now O(1).
Iterate nums2 and collect matches into a result set
9 in set1 → yes, add to result. result = {9}4 in set1 → yes, add to result. result = {4, 9}9 in set1 → yes, but 9 already in result (set deduplicates)8 in set1 → no, skip4 in set1 → yes, but 4 already in result (set deduplicates)Using a set for results automatically handles duplicates. Even though 9 and 4 appear multiple times in nums2, each ends up in the result at most once.
Return the result set as a list
{4, 9}list({4, 9}) → [4, 9]Convert the result set to a list. The problem states any order is valid, so the set-to-list conversion is fine.
Notice that even though 9 and 4 both appear twice in nums2, they each land in the result set only once. The set handles deduplication for free. You do not need any extra bookkeeping.
Complexity
| Complexity | Explanation | |
|---|---|---|
| Time | O(m + n) | O(m) to build set1, O(n) to iterate nums2 |
| Space | O(m) | set1 holds at most m unique elements from nums1 |
Here m = len(nums1) and n = len(nums2). If you always convert the smaller array to a set, you can say space is O(min(m, n)), but O(m) is the standard way to express it when you treat nums1 as the set source.
Compare this to a brute force approach where you check every element of nums2 against every element of nums1: that would be O(m * n) time. For large inputs, the set-based solution is dramatically faster.
Building blocks
This problem combines two reusable patterns:
Hash set membership. Converting a collection to a set so you can check membership in O(1). This is the same primitive used in Two Sum (where you store seen values), Contains Duplicate (where you detect repeats), and many sliding window problems. Any time an inner loop is just searching for a value you have already seen, a set or dict can collapse it to O(1).
Set deduplication. Using a set as the output container to eliminate duplicates automatically. Rather than tracking "have I already added this to the result?" with a separate variable, you let the set's own properties handle it. This is a small but useful trick that simplifies code whenever you need unique results from a filtered pass.
Edge cases
A few scenarios worth thinking through:
- No common elements. If the two arrays share nothing,
set1is populated but nothing from nums2 passes the membership check. The result is an empty list. The code handles this with no special case needed. - All elements match. If every element in nums2 appears in nums1, the result set collects all unique values from nums2. Again, no special handling.
- One array is empty. If
nums1 = [], thenset1is empty, every membership check fails, and the result is an empty list. Ifnums2 = [], the iteration does nothing and the result is empty. Both work without modification. - Duplicates within each array. The problem does not say inputs are unique.
nums1 = [4, 4, 5]is valid. Converting to a set collapses those duplicates, soset1 = {4, 5}. This is correct behavior: you care whether 4 is in nums1 at all, not how many times it appears.
Contrast with Intersection of Two Arrays II (LeetCode 350)
LeetCode 350 changes the rules: each element in the result should appear as many times as it appears in both arrays. So for nums1 = [1, 2, 2, 1] and nums2 = [2, 2], the result is [2, 2] because 2 appears twice in both.
For that problem, a plain set does not work. You need to count frequencies, typically with a hash map or Counter. The core idea is similar (convert one array to a lookup structure, iterate the other) but the data structure changes from a set to a frequency map.
If you nail 349 first, 350 is a natural extension. The same structure, a different lookup container.
From understanding to recall
This problem feels easy once you see the set-based solution. And it is easy. But "easy" and "locked in" are different things. A few weeks from now, under interview pressure, there is a real chance you hesitate on whether to use a set or a list, or forget that you need to deduplicate the result, or reach for a sorting-based approach out of habit.
Spaced repetition fixes this. You practice writing the solution from scratch at increasing intervals: one day, three days, a week, two weeks. Each session, you reconstruct the code from memory rather than re-reading it. After a few rounds, the pattern is automatic. You do not think about it. You just reach for it.
The two patterns here, hash set membership and set-based deduplication, are worth drilling as individual building blocks. They combine in this problem, but they also combine with other bricks in harder problems. Getting them into muscle memory is the point.
Related posts
- Two Sum - The complement lookup pattern, built on the same hash set membership check used here
- Contains Duplicate - The simplest hash set problem, a good warm-up for understanding set-based O(1) lookup
- Group Anagrams - Hash map grouping, another core hash map technique that extends beyond simple membership