Favorite Companies: Subset Checking with Sets
LeetCode People Whose List of Favorite Companies Is Not a Subset of Another List (problem 1452) gives you a list of lists called favoriteCompanies, where favoriteCompanies[i] is the list of favorite companies for person i. You need to return the indices of people whose list is not a subset of any other person's list.
For example, given favoriteCompanies = [["leetcode","google","facebook"], ["google","microsoft"], ["google","facebook"], ["google"], ["amazon"]], the output is [0, 1, 4]. Person 2's favorites {"google", "facebook"} are entirely contained in Person 0's list, and Person 3's {"google"} is also a subset of Person 0's. So those two get filtered out.
The key insight
The brute force would be to compare every element of one list against every element of another list, but that would be extremely slow for large inputs. The trick is to convert each person's list into a set. Once you have sets, checking whether one is a subset of another becomes a single operation: for every element in the smaller set, check if it exists in the larger set. Since set membership checks are O(1), this turns each pairwise comparison from O(m^2) into O(m), where m is the size of the smaller set.
There is also a useful optimization: if len(set_i) is greater than or equal to len(set_j), then set_i cannot possibly be a subset of set_j. You can skip those comparisons entirely.
The solution
def peopleIndexes(favoriteCompanies):
sets = [set(companies) for companies in favoriteCompanies]
n = len(sets)
result = []
for i in range(n):
is_subset = False
for j in range(n):
if i != j and len(sets[i]) <= len(sets[j]):
if sets[i] <= sets[j]:
is_subset = True
break
if not is_subset:
result.append(i)
return result
The first line converts every list into a set. Then for each person i, we check if their set is a subset of any other person j's set. The <= operator on Python sets means "is subset of." If we find a match, we mark it and break early. If no superset is found, person i goes into the result.
The len check is important. If your set has 5 elements and the other set has 3, there is no way yours fits inside theirs. This prunes many unnecessary subset checks.
Python's set supports <= for subset checking and >= for superset checking. These operators make the code much cleaner than manually iterating and checking membership. Under the hood, A <= B iterates through every element of A and checks element in B, which is O(len(A)) on average.
Visual walkthrough
Step 1: Start with the raw input lists
Each person has a list of favorite companies. We need to find which people's lists are NOT a subset of any other person's list.
Step 2: Convert each list to a set for O(1) lookups
Sets let us check membership in constant time. This makes the subset check efficient.
Step 3: Check P0. Is S0 a subset of any other set?
S0 has 3 elements. No other set is larger, so S0 cannot be a subset of anyone. P0 is safe.
Step 4: Check P2. Is S2 a subset of any other set?
S2 = {"google", "facebook"}. Both elements exist in S0. S2 is a subset of S0, so P2 is excluded.
Step 5: Check P3. Is S3 = {"google"} a subset of any other set?
"google" appears in S0, S1, and S2. S3 is a subset of all three. P3 is excluded.
Step 6: Collect all non-subset indices
P0, P1, and P4 passed all checks. Their lists are not subsets of any other person's list.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Set subset checking | O(n^2 * m) | O(n * m) |
The time complexity is O(n^2 * m), where n is the number of people and m is the maximum size of a person's favorite list. For each of the n people, we compare against up to n-1 others, and each subset check costs O(m) in the worst case.
The space complexity is O(n * m) for storing the n sets. Each set holds up to m company names.
The building blocks
1. Converting lists to sets
The act of transforming a list into a set is a foundational operation in problems involving membership testing, deduplication, or subset relationships. The conversion itself is O(m), and it unlocks O(1) lookups for every subsequent query. You will see this pattern in many problems where you need to repeatedly check "does this element exist in that collection?"
my_set = set(my_list)
2. Pairwise subset checking
When you need to compare every pair of items in a collection, the double loop pattern is the standard approach. The outer loop picks one item, the inner loop checks it against all others. The key is knowing when to prune: skip pairs where the candidate cannot possibly be a subset due to size, and break early once you find a match.
for i in range(n):
for j in range(n):
if i != j and condition(i, j):
# found a match
break
Edge cases
- All lists are identical. Every person's set equals every other person's set, so every set is a subset of another. The result is empty.
- All lists are unique with no overlap. No set can be a subset of another. Every index appears in the result.
- Single person. With only one list, there is nobody else to be a superset. Return
[0]. - Empty lists. An empty set is a subset of every other set. Any person with an empty list will be excluded unless every other person also has an empty list.
- Large number of companies. The set-based approach handles this well since each membership check remains O(1) regardless of how many companies exist.
From understanding to recall
The core pattern here is "convert to sets, then do pairwise subset checks." It sounds simple once you see it, but under pressure you might forget the len optimization or reach for a less efficient approach. Practice writing the double loop with the size guard until it feels automatic.
The subset operator <= on sets is worth committing to muscle memory. It saves you from writing an explicit loop to check every element, and it reads naturally: "is set A contained within set B?"
Related posts
- Group Anagrams - Another problem using sets/maps for grouping
- Contains Duplicate - Set-based membership checking
- Subsets - Understanding subset enumeration
This problem is a clean example of how the right data structure turns an expensive operation into a cheap one. Converting lists to sets is the entire insight. Once you have sets, Python's built-in operators do the heavy lifting. If you want to make sure this pattern sticks, practice it with spaced repetition so the set conversion and pairwise comparison become second nature.