Insert Delete GetRandom O(1) - Duplicates Allowed
LeetCode 381 asks you to design a collection that supports three operations, all in average O(1) time:
insert(val)- insertsvalinto the collection. Returnstrueifvalwas not already present.remove(val)- removes one instance ofvalfrom the collection. Returnstrueifvalwas present.getRandom()- returns a random element. Each element must have a probability proportional to the number of copies in the collection.
This is the follow-up to LeetCode 380 (no duplicates). The twist is that duplicates are allowed, and getRandom must reflect duplicate counts. If you have three copies of 1 and one copy of 2, getRandom should return 1 with probability 3/4.
The key insight
In the simpler version (#380), a hash map from value to index gives you O(1) lookup and O(1) removal via swap-and-pop. But that breaks with duplicates because a single value can live at multiple array positions.
The fix: change the hash map so it stores each value mapped to a set of indices. When you insert a duplicate, you append to the array and add the new index to that value's set. When you remove, you pick any index from the value's set, swap the element at that index with the last array element, update both sets, and pop from the end.
The array gives you O(1) random access for getRandom. The hash map of sets gives you O(1) existence checks and O(1) index lookups for removals. Together they handle duplicates without losing constant-time guarantees.
The approach
Maintain two data structures side by side:
- A list
numsthat stores every inserted element (duplicates included). - A dict
idx_mapthat maps each value to a set of indices where it appears innums.
For insert(val):
- Check if
valis already inidx_map(and its set is non-empty). Save this for the return value. - Append
valtonums. - Add the new index (
len(nums) - 1) toidx_map[val]. - Return
trueifvalwas not already present,falseotherwise.
For remove(val):
- If
valis not inidx_mapor its set is empty, returnfalse. - Pop any index
ifromidx_map[val]. - Let
lastbe the last element innumsandlast_idxbelen(nums) - 1. - If
iis notlast_idx, swapnums[i]withnums[last_idx]. Updateidx_map[last]: removelast_idx, addi. - Pop the last element from
nums. - If the set for
valis now empty, optionally clean up the key. - Return
true.
For getRandom():
- Return
nums[random index from 0 to len(nums) - 1].
The swap-and-pop trick is the same one from #380. The only added complexity is maintaining sets of indices instead of single indices.
The code
from collections import defaultdict
import random
class RandomizedCollection:
def __init__(self):
self.nums = []
self.idx_map = defaultdict(set)
def insert(self, val):
was_absent = val not in self.idx_map or len(self.idx_map[val]) == 0
self.nums.append(val)
self.idx_map[val].add(len(self.nums) - 1)
return was_absent
def remove(self, val):
if val not in self.idx_map or len(self.idx_map[val]) == 0:
return False
i = self.idx_map[val].pop()
last_idx = len(self.nums) - 1
last = self.nums[last_idx]
if i != last_idx:
self.nums[i] = last
self.idx_map[last].discard(last_idx)
self.idx_map[last].add(i)
self.nums.pop()
return True
def getRandom(self):
return random.choice(self.nums)
A few points worth noting:
defaultdict(set)saves you from manually initializing sets for new keys.- We use
discardinstead ofremoveon the last element's set to avoid errors ifi == last_idxedge cases overlap. - The order of operations in
removematters. You must update the last element's set before popping from the array, and you must handle the case where the removed index and the last index are the same.
Step 0: Initial state
The collection starts empty.
Array
(empty)
HashMap (val → indices)
(empty)
Both the array and hash map are empty. getRandom() would not be called on an empty collection.
Step 1: insert(1)
Value 1 is not present. Append 1 to the array at index 0. Add index 0 to the set for key 1 in the map.
Array
HashMap (val → indices)
- 1 → {0}
Returns: true
First insert of value 1. Returns true because 1 was not in the collection.
Step 2: insert(1)
Value 1 is already present. Append another 1 to the array at index 1. Add index 1 to the existing set for key 1.
Array
HashMap (val → indices)
- 1 → {0, 1}
Returns: false
Duplicate insert of value 1. Returns false because 1 was already in the collection. The set now holds two indices.
Step 3: insert(2)
Value 2 is not present. Append 2 to the array at index 2. Create a new set for key 2 with index 2.
Array
HashMap (val → indices)
- 1 → {0, 1}
- 2 → {2}
Returns: true
First insert of value 2. Returns true. The map now has two keys.
Step 4: remove(1)
Value 1 is present. Pick any index from its set, say index 1. Swap arr[1] with the last element arr[2]. Update the map: move index 2 from key 2's set to reflect its new position (index 1). Remove index 1 from key 1's set. Pop the last element.
Array
HashMap (val → indices)
- 1 → {0}
- 2 → {1}
Returns: true
Swap-and-pop: arr[1] (value 1) swapped with arr[2] (value 2), then pop. Returns true. Key 1 still has one index left.
Step 5: getRandom()
Pick a random index from 0 to len(arr)-1. The array is [1, 2], so each element has 1/2 probability.
Array
HashMap (val → indices)
- 1 → {0}
- 2 → {1}
Returns: 1 or 2 (equal probability)
With array [1, 2], the probability of returning 1 is 1/2 and the probability of returning 2 is 1/2. Uniform over all copies in the array.
Complexity
| Time | Space | |
|---|---|---|
| insert | O(1) average | O(1) amortized |
| remove | O(1) average | O(1) |
| getRandom | O(1) | O(1) |
| Total (n ops) | O(n) average | O(n) |
All three operations are O(1) on average. The hash set operations (add, pop, discard) are O(1) amortized. The array append and pop are O(1) amortized. random.choice on a list is O(1). The overall space is O(n) where n is the total number of elements stored.
Building blocks
The core pattern here is combining an array with a hash map to get O(1) for operations that normally require one or the other. The array gives you random access by index (needed for getRandom). The hash map gives you O(1) lookup by value (needed for remove). Neither alone supports all three operations in constant time.
The swap-and-pop removal trick is the glue. Instead of removing from the middle of the array (which is O(n)), you swap the target with the last element and pop from the end. This preserves O(1) removal at the cost of the array being unordered, which is fine because getRandom does not care about order.
This exact pattern of "array + hash map + swap-and-pop" shows up in many design problems. LRU Cache is another example where combining two data structures (hash map + doubly linked list) gives you O(1) for operations that neither structure handles alone.
Edge cases
- Removing the last element. When
i == last_idx, the swap is a no-op. The code handles this correctly becausediscard(last_idx)on the same set followed byadd(i)would re-add the same index. That is why we skip the swap entirely wheni == last_idx. - Removing the only copy. When the set for a value becomes empty after removal,
getRandomwill never pick that value because it is no longer in the array. You do not strictly need to delete the empty set from the map, but you can to save memory. - All duplicates. If the array is
[5, 5, 5, 5], every call togetRandomreturns 5. Removal works the same: pick any index from the set, swap with last, pop. - Insert then immediately remove. Returns
truefor insert (value not present), thentruefor remove (value was present). The collection returns to its previous state.
From understanding to recall
Think of the array as a bag you can reach into blindfolded (that is getRandom). The hash map is a lookup table that tells you exactly where each value sits in the bag. The sets handle the fact that the same value can sit in multiple positions.
When you need to remove a value, the lookup table gives you a position. You pull that element out by swapping it with the one on top (the last element) and discarding the top. Then you update the lookup table for both values involved in the swap.
If you can picture a bag with labeled positions and a side table tracking which labels sit where, the full implementation follows from that image.
Related posts
- Insert Delete GetRandom O(1) - the simpler version without duplicates
- LRU Cache - combining data structures for O(1) operations
- Contains Duplicate - hash set for duplicate detection