Design HashMap: Implement a Hash Map From Scratch
Design HashMap is LeetCode problem 706. You need to build a hash map that supports put, get, and remove operations without using any built-in hash table library. The core idea is a bucket array where each bucket holds a chain of key-value pairs. A hash function decides which bucket a key belongs to, and chaining handles collisions when multiple keys land in the same bucket.
The problem
Implement the MyHashMap class with these operations:
MyHashMap()initializes an empty mapput(key, value)inserts or updates the key-value pairget(key)returns the value mapped to the key, or -1 if the key does not existremove(key)removes the key and its value if present
Keys and values are integers in the range [0, 10^6]. At most 10^4 calls will be made to put, get, and remove.
The approach
You need two things: a fixed-size bucket array and a hash function that maps keys to bucket indices.
Hash function. For integer keys, key % num_buckets works well. Choosing a prime number for the bucket count (like 1009) distributes keys more evenly and reduces clustering.
Chaining. When two keys hash to the same bucket, you store both key-value pairs in a list at that index. On put, you walk the chain to see if the key already exists (update it) or append a new pair. On get, you walk the chain looking for the key. On remove, you find and delete the matching pair.
This gives you O(1) average time per operation as long as the chains stay short. With a reasonable bucket count, collisions are rare enough that each chain has only a handful of entries.
The solution
class MyHashMap:
def __init__(self):
self.size = 1009
self.buckets = [[] for _ in range(self.size)]
def _hash(self, key):
return key % self.size
def put(self, key, value):
idx = self._hash(key)
for i, (k, v) in enumerate(self.buckets[idx]):
if k == key:
self.buckets[idx][i] = (key, value)
return
self.buckets[idx].append((key, value))
def get(self, key):
idx = self._hash(key)
for k, v in self.buckets[idx]:
if k == key:
return v
return -1
def remove(self, key):
idx = self._hash(key)
for i, (k, v) in enumerate(self.buckets[idx]):
if k == key:
self.buckets[idx].pop(i)
return
The __init__ method creates 1009 empty lists. Each list is one bucket. The _hash method returns the bucket index for a given key.
In put, you first check whether the key already exists in the bucket. If it does, you overwrite the value in place. If not, you append a new (key, value) tuple to the end of the chain.
In get, you scan the bucket chain for a matching key. If found, return the value. Otherwise return -1.
In remove, you scan for the key and pop the matching entry from the list. If the key is not present, nothing happens.
Step-by-step walkthrough
Let's trace through a sequence of put, get, and remove calls to see how keys flow through the hash function and land in buckets.
Step 1: put(2, 20)
Step 2: put(3, 30)
Step 3: put(12, 120)
Step 4: get(2)
Step 5: remove(2)
Step 6: get(2)
Notice how put(12, 120) in Step 3 creates a collision with key 2 in bucket 2. Both keys coexist in the same chain. When you later remove(2) in Step 5, only that specific node is unlinked. Key 12 stays untouched in the same bucket.
Complexity analysis
| Operation | Time (average) | Time (worst) | Space |
|---|---|---|---|
| put | O(1) | O(n) | O(n) |
| get | O(1) | O(n) | O(n) |
| remove | O(1) | O(n) | O(n) |
Time: on average each bucket has n/B entries where B is the number of buckets. With B = 1009 and at most 10,000 keys, each chain averages about 10 entries. Scanning 10 entries is effectively constant. The worst case is O(n) if every key hashes to the same bucket, but a reasonable bucket count prevents that in practice.
Space: O(n + B) where n is the number of stored key-value pairs and B is the bucket count.
Building blocks
Modular hashing
The expression key % num_buckets is the simplest hash function for integers. It maps any key to a valid bucket index in the range [0, num_buckets). Choosing a prime for the bucket count reduces patterns in the input from clustering into the same few buckets. You will reuse this modular hashing idea in problems like Design HashSet, LRU Cache, and any problem that needs fast key lookups.
Chaining with linear scan
Each bucket is a small list that you scan linearly to find, update, or delete a key. This "chain of pairs" pattern appears whenever you need to handle collisions in a hash-based structure. The same scan-and-match logic shows up when implementing ordered maps, caches, and adjacency lists for graphs.
Edge cases
- Overwriting an existing key. Calling
put(1, 10)thenput(1, 99)should update the value to 99, not create a duplicate entry. The for-loop inputmust check for the key before appending. - Removing a key that does not exist.
remove(999)on an empty map should do nothing. The loop simply finds no match and returns without error. - get after remove. After
put(5, 50)andremove(5), callingget(5)must return -1. Make sureremoveactually deletes the pair rather than just marking it. - Key 0. Zero is a valid key. Your hash function handles it correctly since
0 % 1009 = 0, which is a valid bucket index. - Large keys. Keys up to 1,000,000 are valid. The modulo operation maps them into the bucket range without overflow issues in Python.
From understanding to recall
The Design HashMap problem tests whether you can implement the mechanics behind a data structure you use every day. The code itself is short, roughly 20 lines. But each piece matters: the prime bucket count, the scan-before-append logic in put, the early return in remove. Miss any one detail and specific test cases will fail.
Reading the solution once gives you the concept. Implementing it from scratch under timed conditions, at increasing intervals, is what builds the muscle memory. Spaced repetition drilling on CodeBricks turns "I know how hash maps work" into "I can write one correctly in five minutes."
Related posts
- Hash Maps: Fast Key-Value Lookups - Deep dive into the hash map data structure and why O(1) average lookups work
- Design HashSet - The companion problem where you build a set (keys only, no values) with the same bucket-and-chaining technique
- Two Sum - The classic problem that uses a hash map for O(n) complement lookups
New to algorithms? Understand why O(1) operations matter with our Big O Notation Guide.