Skip to content
← All posts

Find the Difference of Two Arrays: Set Operations

5 min read
leetcodeproblemeasyarrayshash-map

LeetCode 2215. Find the Difference of Two Arrays is an easy problem that directly tests your understanding of set operations. If you know what "set difference" means, you can solve it in three lines. If you do not, you will end up writing nested loops and wondering why your solution feels clunky.

This is one of those problems where the optimal solution is also the simplest one.

The problem

Given two integer arrays nums1 and nums2, return a list answer of size 2 where:

  • answer[0] is a list of all distinct integers in nums1 that are not present in nums2.
  • answer[1] is a list of all distinct integers in nums2 that are not present in nums1.

The integers in each list may be returned in any order.

Example: nums1 = [1, 2, 3], nums2 = [2, 4, 6]. The answer is [[1, 3], [4, 6]]. The value 2 appears in both arrays, so it is excluded from both result lists. The values 1 and 3 are only in nums1, and the values 4 and 6 are only in nums2.

nums1123unique to nums1sharednums2246unique to nums2sharedresultanswer[0]:13answer[1]:46
Elements unique to nums1 (accent) go into answer[0]. Elements unique to nums2 (orange) go into answer[1]. Shared elements (green) are excluded from both.

Why sets

The brute force approach would be: for each element in nums1, loop through nums2 to check if it exists there. That gives you O(n * m) time, which is slow and unnecessary.

The key insight is that checking membership in a list is O(n), but checking membership in a set is O(1). If you convert both arrays to sets first, every membership check becomes constant time. You also get deduplication for free, which the problem requires since it asks for distinct integers.

This is the classic tradeoff: spend O(n) time and space upfront to build a set, then enjoy O(1) lookups for every query after that.

The approach: set difference

Set difference is a fundamental operation. Given two sets A and B, the difference A - B produces a new set containing all elements that are in A but not in B. Python has a built-in operator for this: set1 - set2.

That is the entire algorithm:

  1. Convert nums1 to set1 and nums2 to set2.
  2. Compute set1 - set2 to get elements unique to nums1.
  3. Compute set2 - set1 to get elements unique to nums2.
  4. Return both as lists.

No sorting, no nested loops, no frequency counting. Just two set conversions and two set differences.

Visual walkthrough

Step 1: Convert nums1 to set1

nums1123set()set1{123}

Convert nums1 = [1, 2, 3] into a set. Sets remove duplicates and allow O(1) lookups.

Step 2: Convert nums2 to set2

nums2246set()set2{246}

Convert nums2 = [2, 4, 6] into a set. Same O(n) operation.

Step 3: Compute set1 - set2

set1123-set2246answer[0]:13

For each element in set1, check if it is NOT in set2. 1 is not in set2, 2 IS in set2 (skip), 3 is not in set2. Result: {1, 3}.

Step 4: Compute set2 - set1

set2246-set1123answer[1]:46

For each element in set2, check if it is NOT in set1. 2 IS in set1 (skip), 4 is not in set1, 6 is not in set1. Result: {4, 6}.

Result: [[1, 3], [4, 6]]

answer[0]:[13]answer[1]:[46]

answer[0] contains elements unique to nums1. answer[1] contains elements unique to nums2. Done in O(n + m) time.

Each step is O(n) or O(m). You build each set in linear time, and computing the set difference iterates through one set while doing O(1) lookups in the other. The total work is O(n + m).

The solution

def findDifference(nums1, nums2):
    set1 = set(nums1)
    set2 = set(nums2)
    return [list(set1 - set2), list(set2 - set1)]

This is a 3-line solution. Here is what each line does:

  • Line 1: Convert nums1 to a set. This removes duplicates and enables O(1) lookups. If nums1 = [1, 1, 2, 3], then set1 = {1, 2, 3}.
  • Line 2: Convert nums2 to a set. Same operation, same benefits.
  • Line 3: Compute both set differences and convert them back to lists. set1 - set2 gives elements in set1 not found in set2. set2 - set1 gives the reverse.

Converting to sets also handles duplicates in the input arrays. The problem asks for distinct values, and sets naturally deduplicate. If nums1 = [1, 1, 2] and nums2 = [2, 3], then set1 = {1, 2} and the answer is [[1], [3]]. You do not need a separate deduplication step.

Complexity analysis

MetricValueReasoning
TimeO(n + m)Building each set is O(n) and O(m). Set difference is O(min(n, m)).
SpaceO(n + m)Two sets storing up to n + m elements total.

The building blocks

Set difference for membership queries

When you need to find elements in one collection that are absent from another, convert both to sets and use set difference. This transforms an O(n * m) problem into an O(n + m) one. The pattern is always the same: build the sets, then subtract.

This pattern shows up frequently beyond LeetCode. In database operations, a LEFT JOIN with a NULL check is conceptually the same as set difference. When comparing two versions of a configuration file, you compute which keys were added (new - old) and which were removed (old - new). Any time you hear "find what is in A but not in B," set difference should be your first thought.

Edge cases

  • No overlap. Both sets are completely disjoint. answer[0] = all of nums1 (distinct), answer[1] = all of nums2 (distinct).
  • Complete overlap. Both arrays have the same elements. answer = [[], []].
  • Duplicates in input. nums1 = [1, 1, 2], nums2 = [2, 3]. Sets deduplicate: answer = [[1], [3]].
  • Empty arrays. If nums1 is empty, answer[0] = []. If nums2 is empty, answer[1] = [].
  • One element each. Simple comparison of two values. Either they match (both results empty) or they differ (each goes into its respective answer list).

From understanding to recall

This problem is easy conceptually. You read the description, you think "set difference," and you write three lines of code. But the real value is in recognizing the set difference pattern instantly. In interviews, you should reach for set operations the moment you hear "elements in one but not the other." You should not need to think about it, and you should not waste time considering nested loops first.

Practice until the set conversion is automatic. When you see two collections and a question about what is unique to each one, the words set() and - should appear in your mind before you even finish reading the problem. That kind of instant pattern recognition is what separates someone who solves easy problems slowly from someone who breezes through them and saves their brainpower for the hard ones.

Related posts