Skip to content
← All posts

Design HashMap: Implement a Hash Map From Scratch

5 min read
leetcodeproblemeasydesignhash-map

Design HashMap is LeetCode problem 706. You need to build a hash map that supports put, get, and remove operations without using any built-in hash table library. The core idea is a bucket array where each bucket holds a chain of key-value pairs. A hash function decides which bucket a key belongs to, and chaining handles collisions when multiple keys land in the same bucket.

The problem

Implement the MyHashMap class with these operations:

  • MyHashMap() initializes an empty map
  • put(key, value) inserts or updates the key-value pair
  • get(key) returns the value mapped to the key, or -1 if the key does not exist
  • remove(key) removes the key and its value if present

Keys and values are integers in the range [0, 10^6]. At most 10^4 calls will be made to put, get, and remove.

hash(key) % 5 maps each key to a bucket index10 % 5 = 02 % 5 = 23 % 5 = 312 % 5 = 2buckets0101001empty22201212033304empty
A HashMap with 5 buckets. Keys 2 and 12 both map to bucket 2, resolved by chaining linked-list nodes.

The approach

You need two things: a fixed-size bucket array and a hash function that maps keys to bucket indices.

Hash function. For integer keys, key % num_buckets works well. Choosing a prime number for the bucket count (like 1009) distributes keys more evenly and reduces clustering.

Chaining. When two keys hash to the same bucket, you store both key-value pairs in a list at that index. On put, you walk the chain to see if the key already exists (update it) or append a new pair. On get, you walk the chain looking for the key. On remove, you find and delete the matching pair.

This gives you O(1) average time per operation as long as the chains stay short. With a reasonable bucket count, collisions are rare enough that each chain has only a handful of entries.

The solution

class MyHashMap:
    def __init__(self):
        self.size = 1009
        self.buckets = [[] for _ in range(self.size)]

    def _hash(self, key):
        return key % self.size

    def put(self, key, value):
        idx = self._hash(key)
        for i, (k, v) in enumerate(self.buckets[idx]):
            if k == key:
                self.buckets[idx][i] = (key, value)
                return
        self.buckets[idx].append((key, value))

    def get(self, key):
        idx = self._hash(key)
        for k, v in self.buckets[idx]:
            if k == key:
                return v
        return -1

    def remove(self, key):
        idx = self._hash(key)
        for i, (k, v) in enumerate(self.buckets[idx]):
            if k == key:
                self.buckets[idx].pop(i)
                return

The __init__ method creates 1009 empty lists. Each list is one bucket. The _hash method returns the bucket index for a given key.

In put, you first check whether the key already exists in the bucket. If it does, you overwrite the value in place. If not, you append a new (key, value) tuple to the end of the chain.

In get, you scan the bucket chain for a matching key. If found, return the value. Otherwise return -1.

In remove, you scan for the key and pop the matching entry from the list. If the key is not present, nothing happens.

Step-by-step walkthrough

Let's trace through a sequence of put, get, and remove calls to see how keys flow through the hash function and land in buckets.

Step 1: put(2, 20)

put(2, 20)2hash2 % 5 = 2[2]storedBucket 2 is empty. Create a new node [2, 20] in that chain.

Step 2: put(3, 30)

put(3, 30)3hash3 % 5 = 3[3]storedBucket 3 is empty. Create a new node [3, 30].

Step 3: put(12, 120)

put(12, 120)12hash12 % 5 = 2[2]chainedBucket 2 already has key 2. Append node [12, 120] to the chain.

Step 4: get(2)

get(2)2hash2 % 5 = 2[2]20Walk bucket 2 chain. First node matches key 2. Return value 20.

Step 5: remove(2)

remove(2)2hash2 % 5 = 2[2]deletedFind key 2 in bucket 2 chain and unlink the node. Key 12 remains.

Step 6: get(2)

get(2)2hash2 % 5 = 2[2]-1Walk bucket 2 chain. Key 2 not found. Return -1.

Notice how put(12, 120) in Step 3 creates a collision with key 2 in bucket 2. Both keys coexist in the same chain. When you later remove(2) in Step 5, only that specific node is unlinked. Key 12 stays untouched in the same bucket.

Complexity analysis

OperationTime (average)Time (worst)Space
putO(1)O(n)O(n)
getO(1)O(n)O(n)
removeO(1)O(n)O(n)

Time: on average each bucket has n/B entries where B is the number of buckets. With B = 1009 and at most 10,000 keys, each chain averages about 10 entries. Scanning 10 entries is effectively constant. The worst case is O(n) if every key hashes to the same bucket, but a reasonable bucket count prevents that in practice.

Space: O(n + B) where n is the number of stored key-value pairs and B is the bucket count.

Building blocks

Modular hashing

The expression key % num_buckets is the simplest hash function for integers. It maps any key to a valid bucket index in the range [0, num_buckets). Choosing a prime for the bucket count reduces patterns in the input from clustering into the same few buckets. You will reuse this modular hashing idea in problems like Design HashSet, LRU Cache, and any problem that needs fast key lookups.

Chaining with linear scan

Each bucket is a small list that you scan linearly to find, update, or delete a key. This "chain of pairs" pattern appears whenever you need to handle collisions in a hash-based structure. The same scan-and-match logic shows up when implementing ordered maps, caches, and adjacency lists for graphs.

Edge cases

  • Overwriting an existing key. Calling put(1, 10) then put(1, 99) should update the value to 99, not create a duplicate entry. The for-loop in put must check for the key before appending.
  • Removing a key that does not exist. remove(999) on an empty map should do nothing. The loop simply finds no match and returns without error.
  • get after remove. After put(5, 50) and remove(5), calling get(5) must return -1. Make sure remove actually deletes the pair rather than just marking it.
  • Key 0. Zero is a valid key. Your hash function handles it correctly since 0 % 1009 = 0, which is a valid bucket index.
  • Large keys. Keys up to 1,000,000 are valid. The modulo operation maps them into the bucket range without overflow issues in Python.

From understanding to recall

The Design HashMap problem tests whether you can implement the mechanics behind a data structure you use every day. The code itself is short, roughly 20 lines. But each piece matters: the prime bucket count, the scan-before-append logic in put, the early return in remove. Miss any one detail and specific test cases will fail.

Reading the solution once gives you the concept. Implementing it from scratch under timed conditions, at increasing intervals, is what builds the muscle memory. Spaced repetition drilling on CodeBricks turns "I know how hash maps work" into "I can write one correctly in five minutes."

Related posts

  • Hash Maps: Fast Key-Value Lookups - Deep dive into the hash map data structure and why O(1) average lookups work
  • Design HashSet - The companion problem where you build a set (keys only, no values) with the same bucket-and-chaining technique
  • Two Sum - The classic problem that uses a hash map for O(n) complement lookups

New to algorithms? Understand why O(1) operations matter with our Big O Notation Guide.