Insert Delete GetRandom O(1) - Duplicates Allowed

May 18, 20265 min read

leetcodeproblemhardarrayshash-mapdesign

LeetCode 381 asks you to design a collection that supports three operations, all in average O(1) time:

insert(val) - inserts val into the collection. Returns true if val was not already present.
remove(val) - removes one instance of val from the collection. Returns true if val was present.
getRandom() - returns a random element. Each element must have a probability proportional to the number of copies in the collection.

This is the follow-up to LeetCode 380 (no duplicates). The twist is that duplicates are allowed, and getRandom must reflect duplicate counts. If you have three copies of 1 and one copy of 2, getRandom should return 1 with probability 3/4.

The key insight

In the simpler version (#380), a hash map from value to index gives you O(1) lookup and O(1) removal via swap-and-pop. But that breaks with duplicates because a single value can live at multiple array positions.

The fix: change the hash map so it stores each value mapped to a set of indices. When you insert a duplicate, you append to the array and add the new index to that value's set. When you remove, you pick any index from the value's set, swap the element at that index with the last array element, update both sets, and pop from the end.

The array gives you O(1) random access for getRandom. The hash map of sets gives you O(1) existence checks and O(1) index lookups for removals. Together they handle duplicates without losing constant-time guarantees.

Array elementMap entry (value to index set)

After insert(1), insert(1), insert(2): the array holds [1, 1, 2] and the map stores each value with the set of indices where it appears.

The approach

Maintain two data structures side by side:

A list nums that stores every inserted element (duplicates included).
A dict idx_map that maps each value to a set of indices where it appears in nums.

For insert(val):

Check if val is already in idx_map (and its set is non-empty). Save this for the return value.
Append val to nums.
Add the new index (len(nums) - 1) to idx_map[val].
Return true if val was not already present, false otherwise.

For remove(val):

If val is not in idx_map or its set is empty, return false.
Pop any index i from idx_map[val].
Let last be the last element in nums and last_idx be len(nums) - 1.
If i is not last_idx, swap nums[i] with nums[last_idx]. Update idx_map[last]: remove last_idx, add i.
Pop the last element from nums.
If the set for val is now empty, optionally clean up the key.
Return true.

For getRandom():

Return nums[random index from 0 to len(nums) - 1].

The swap-and-pop trick is the same one from #380. The only added complexity is maintaining sets of indices instead of single indices.

The code

from collections import defaultdict
import random

class RandomizedCollection:
    def __init__(self):
        self.nums = []
        self.idx_map = defaultdict(set)

    def insert(self, val):
        was_absent = val not in self.idx_map or len(self.idx_map[val]) == 0
        self.nums.append(val)
        self.idx_map[val].add(len(self.nums) - 1)
        return was_absent

    def remove(self, val):
        if val not in self.idx_map or len(self.idx_map[val]) == 0:
            return False
        i = self.idx_map[val].pop()
        last_idx = len(self.nums) - 1
        last = self.nums[last_idx]
        if i != last_idx:
            self.nums[i] = last
            self.idx_map[last].discard(last_idx)
            self.idx_map[last].add(i)
        self.nums.pop()
        return True

    def getRandom(self):
        return random.choice(self.nums)

A few points worth noting:

defaultdict(set) saves you from manually initializing sets for new keys.
We use discard instead of remove on the last element's set to avoid errors if i == last_idx edge cases overlap.
The order of operations in remove matters. You must update the last element's set before popping from the array, and you must handle the case where the removed index and the last index are the same.

Practice the swap-and-pop pattern

Step 0: Initial state

The collection starts empty.

Array

(empty)

HashMap (val → indices)

(empty)

Both the array and hash map are empty. getRandom() would not be called on an empty collection.

Step 1: insert(1)

Value 1 is not present. Append 1 to the array at index 0. Add index 0 to the set for key 1 in the map.

Array

HashMap (val → indices)

1 → {0}

Returns: true

First insert of value 1. Returns true because 1 was not in the collection.

Step 2: insert(1)

Value 1 is already present. Append another 1 to the array at index 1. Add index 1 to the existing set for key 1.

Array

HashMap (val → indices)

1 → {0, 1}

Returns: false

Duplicate insert of value 1. Returns false because 1 was already in the collection. The set now holds two indices.

Step 3: insert(2)

Value 2 is not present. Append 2 to the array at index 2. Create a new set for key 2 with index 2.

Array

HashMap (val → indices)

1 → {0, 1}
2 → {2}

Returns: true

First insert of value 2. Returns true. The map now has two keys.

Step 4: remove(1)

Value 1 is present. Pick any index from its set, say index 1. Swap arr[1] with the last element arr[2]. Update the map: move index 2 from key 2's set to reflect its new position (index 1). Remove index 1 from key 1's set. Pop the last element.

Array

HashMap (val → indices)

1 → {0}
2 → {1}

Returns: true

Swap-and-pop: arr[1] (value 1) swapped with arr[2] (value 2), then pop. Returns true. Key 1 still has one index left.

Step 5: getRandom()

Pick a random index from 0 to len(arr)-1. The array is [1, 2], so each element has 1/2 probability.

Array

HashMap (val → indices)

1 → {0}
2 → {1}

Returns: 1 or 2 (equal probability)

With array [1, 2], the probability of returning 1 is 1/2 and the probability of returning 2 is 1/2. Uniform over all copies in the array.

Drill this from memory

Complexity

	Time	Space
insert	O(1) average	O(1) amortized
remove	O(1) average	O(1)
getRandom	O(1)	O(1)
Total (n ops)	O(n) average	O(n)

All three operations are O(1) on average. The hash set operations (add, pop, discard) are O(1) amortized. The array append and pop are O(1) amortized. random.choice on a list is O(1). The overall space is O(n) where n is the total number of elements stored.

Building blocks

The core pattern here is combining an array with a hash map to get O(1) for operations that normally require one or the other. The array gives you random access by index (needed for getRandom). The hash map gives you O(1) lookup by value (needed for remove). Neither alone supports all three operations in constant time.

The swap-and-pop removal trick is the glue. Instead of removing from the middle of the array (which is O(n)), you swap the target with the last element and pop from the end. This preserves O(1) removal at the cost of the array being unordered, which is fine because getRandom does not care about order.

This exact pattern of "array + hash map + swap-and-pop" shows up in many design problems. LRU Cache is another example where combining two data structures (hash map + doubly linked list) gives you O(1) for operations that neither structure handles alone.

Edge cases

Removing the last element. When i == last_idx, the swap is a no-op. The code handles this correctly because discard(last_idx) on the same set followed by add(i) would re-add the same index. That is why we skip the swap entirely when i == last_idx.
Removing the only copy. When the set for a value becomes empty after removal, getRandom will never pick that value because it is no longer in the array. You do not strictly need to delete the empty set from the map, but you can to save memory.
All duplicates. If the array is [5, 5, 5, 5], every call to getRandom returns 5. Removal works the same: pick any index from the set, swap with last, pop.
Insert then immediately remove. Returns true for insert (value not present), then true for remove (value was present). The collection returns to its previous state.

From understanding to recall

Think of the array as a bag you can reach into blindfolded (that is getRandom). The hash map is a lookup table that tells you exactly where each value sits in the bag. The sets handle the fact that the same value can sit in multiple positions.

When you need to remove a value, the lookup table gives you a position. You pull that element out by swapping it with the one on top (the last element) and discarding the top. Then you update the lookup table for both values involved in the swap.

If you can picture a bag with labeled positions and a side table tracking which labels sit where, the full implementation follows from that image.

Insert Delete GetRandom O(1) - the simpler version without duplicates
LRU Cache - combining data structures for O(1) operations
Contains Duplicate - hash set for duplicate detection

Try CodeBricks free

The key insight

The approach

The code

Complexity

Building blocks

Edge cases

From understanding to recall

Related posts