Hash Maps: O(1) Lookup Power
If you have ever written a nested loop to search for a matching value in an array, you have felt the pain that hash maps were built to eliminate. That inner loop, the one scanning through n elements to find a single value, is the bottleneck in a huge number of brute force solutions. Hash maps delete it.
A hash map data structure gives you O(1) average-time lookups. Instead of scanning, you compute an index directly from the key. One operation instead of n. That single capability turns O(n^2) algorithms into O(n) ones, and it shows up in more interview problems than probably any other data structure.
What is a hash map?
A hash map (also called a hash table) is a data structure that stores key-value pairs and lets you look up any value by its key in O(1) average time. It works by using a hash function to convert each key into an array index, then storing the value at that index.
The idea is straightforward: instead of searching through a collection to find what you need, you compute exactly where it should be.
In Python, dictionaries (dict) are hash maps. In JavaScript, plain objects and Map are hash maps. In Java, HashMap. In C++, unordered_map. Every major language gives you one because they are that useful.
How hash functions work
A hash function takes a key (like a string or number) and returns an integer. That integer gets reduced to a valid array index using the modulo operator: index = hash(key) % array_size.
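Here is a minimal sketch of that index computation in Python, using the built-in hash() and an arbitrary bucket count of 8 (both choices are just for illustration):

```python
# Map a key to a bucket index: hash the key, then reduce modulo the bucket count.
num_buckets = 8

def bucket_index(key):
    return hash(key) % num_buckets

print(bucket_index("name"))  # some index in 0..7, stable within one program run
print(bucket_index("city"))  # may or may not land in the same bucket as "name"
```

Note that Python randomizes string hashing per process, so the exact numbers change between runs, but within one run the same key always maps to the same bucket.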
A good hash function has two goals:
- Deterministic: The same key always produces the same hash. If hash("name") returns 4578 today, it must return 4578 tomorrow.
- Uniform distribution: Different keys should spread evenly across the available buckets. If every key hashes to the same index, your hash map degrades into a linked list.
You almost never need to write your own hash function. Languages handle this for you. But understanding what happens under the hood helps you reason about performance and debug weird edge cases.
Here is what happens when you do map["name"] = "Jo":
- The hash function computes hash("name"), producing some large integer (say, 4578).
- The modulo operation reduces it to a bucket index: 4578 % 4 = 2.
- The key-value pair ("name", "Jo") gets stored in bucket 2.
When you later read map["name"], the same process repeats: hash the key, find the bucket, return the value. No scanning required.
Operations and time complexity
Every core hash map operation follows the same pattern: hash the key, go to the bucket, do the work.
put("name", "Jo")
get("name")
delete("name")
| Operation | Average | Worst Case | Why |
|---|---|---|---|
| Get (lookup) | O(1) | O(n) | Hash to bucket, return value |
| Put (insert) | O(1) | O(n) | Hash to bucket, store pair |
| Delete | O(1) | O(n) | Hash to bucket, remove pair |
| Contains key | O(1) | O(n) | Hash to bucket, check existence |
| Iteration | O(n) | O(n) | Must visit every entry |
The "average O(1)" is the important part. In practice, with a decent hash function and a reasonable load factor, lookups are constant time. The O(n) worst case happens when every key collides into the same bucket, which brings us to the next topic.
Collision handling
A collision happens when two different keys hash to the same bucket index. With a finite number of buckets and a potentially infinite number of keys, collisions are inevitable. The question is how you deal with them.
There are two main strategies.
Chaining (separate chaining)
Each bucket holds a linked list (or another collection) of all key-value pairs that hashed to that index. When a collision happens, you just append to the list.
To look up a key, you hash it to find the bucket, then walk the list comparing keys until you find a match. If the hash function distributes keys well, most lists are short (length 0 or 1), and lookups stay O(1) on average.
For example, if the keys "name" and "city" both hash to bucket 2, they end up chained together in that bucket.
Pros: Simple to implement. Performance degrades gracefully.
Cons: Extra memory for list node pointers. Cache-unfriendly because list nodes can be scattered in memory.
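A toy sketch of separate chaining in Python follows. The class and method names, the fixed bucket count, and the use of plain Python lists as chains are all illustrative assumptions; a real implementation would also resize and support deletion.

```python
class ChainedHashMap:
    """Toy hash map using separate chaining: each bucket is a list of (key, value) pairs."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present: overwrite in place
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # empty bucket or collision: append to the chain

    def get(self, key):
        for k, v in self._bucket(key):   # walk the (usually very short) chain
            if k == key:
                return v
        raise KeyError(key)

m = ChainedHashMap()
m.put("name", "Jo")
m.put("city", "Oslo")
print(m.get("name"), m.get("city"))
```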
Open addressing (linear probing)
Instead of chaining, you store everything directly in the array. When a collision happens, you probe forward through the array until you find an empty slot.
For example, if bucket 2 is full, you check bucket 3. If that is full too, you check bucket 4. When looking up a key, you follow the same probe sequence until you find the key or hit an empty slot (meaning it does not exist).
Pros: Better cache performance since everything is in one contiguous array. No extra pointer overhead.
Cons: Clustering. Full buckets create runs of occupied slots, which slows down both insertions and lookups. Deletions are trickier (you usually mark slots as "deleted" rather than truly emptying them).
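Here is a minimal sketch of linear probing with tombstones, again with made-up names and a fixed table size. It assumes the table never fills up completely, which a real implementation guarantees by resizing.

```python
_EMPTY = object()    # slot never used
_DELETED = object()  # tombstone: slot once held a key that was deleted

class ProbingHashMap:
    """Toy open-addressing map with linear probing (no resizing)."""

    def __init__(self, size=8):
        self.keys = [_EMPTY] * size
        self.values = [None] * size

    def put(self, key, value):
        i = hash(key) % len(self.keys)
        first_free = None
        while self.keys[i] is not _EMPTY:
            if self.keys[i] == key:                  # key already stored: overwrite
                self.values[i] = value
                return
            if self.keys[i] is _DELETED and first_free is None:
                first_free = i                       # remember first reusable tombstone
            i = (i + 1) % len(self.keys)             # probe forward
        slot = i if first_free is None else first_free
        self.keys[slot], self.values[slot] = key, value

    def get(self, key):
        i = hash(key) % len(self.keys)
        while self.keys[i] is not _EMPTY:            # an empty slot ends the probe chain
            if self.keys[i] == key:
                return self.values[i]
            i = (i + 1) % len(self.keys)
        raise KeyError(key)

    def delete(self, key):
        i = hash(key) % len(self.keys)
        while self.keys[i] is not _EMPTY:
            if self.keys[i] == key:
                self.keys[i] = _DELETED              # mark, don't empty, so probing still works
                self.values[i] = None
                return
            i = (i + 1) % len(self.keys)
        raise KeyError(key)
```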
Python's dict uses open addressing. Java's HashMap uses chaining. Both give you O(1) average lookups. For hash table interview questions, you rarely need to specify which strategy you would use. Just know that both exist and why.
Load factor and resizing
The load factor is the ratio of entries to buckets. When it gets too high (Python resizes at about 2/3 full, Java at 0.75), the hash map allocates a bigger array and rehashes every entry. This resize is O(n), but it happens rarely enough that the amortized cost of insertions stays O(1).
This is the same trick dynamic arrays use. You pay a big cost occasionally, but averaged over all operations, it is constant.
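As a rough sketch of the resize step, continuing the chained-bucket idea from earlier (the function name and the 0.75 threshold are just examples):

```python
def maybe_resize(buckets, num_entries, max_load=0.75):
    """If the load factor exceeds the threshold, rebuild with twice as many buckets."""
    if num_entries / len(buckets) <= max_load:
        return buckets
    new_buckets = [[] for _ in range(len(buckets) * 2)]
    for bucket in buckets:
        for key, value in bucket:
            # Every entry must be rehashed: its index depends on the new bucket count.
            new_buckets[hash(key) % len(new_buckets)].append((key, value))
    return new_buckets
```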
Hash sets vs hash maps
A hash set is just a hash map without the values. It stores keys only, and its main operation is membership testing: "is this element in the set?"
In Python, set is a hash set. In Java, HashSet. In C++, unordered_set.
| Use case | Data structure | Example |
|---|---|---|
| Need key-value pairs | Hash map (dict) | Storing word frequencies |
| Just need membership testing | Hash set (set) | Checking for duplicates |
| Need to track indices | Hash map (dict) | Two Sum (store value to index mapping) |
| Need counts | Hash map (Counter) | Frequency counting |
Both give O(1) average-time lookups. The choice is just about whether you need to associate values with your keys.
If you only need to answer "have I seen this before?" then use a set. If you need to answer "have I seen this before, and if so, what was associated with it?" then use a map. Getting this distinction right makes your interview code cleaner and communicates intent.
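Here is the same "seen before?" check in both forms, as a small Python sketch (the variable names are illustrative):

```python
nums = [3, 1, 4, 1, 5]

# "Have I seen this before?" -> hash set
seen = set()
for n in nums:
    if n in seen:
        print("duplicate:", n)
    seen.add(n)

# "Have I seen this before, and where?" -> hash map (value -> index)
seen_at = {}
for i, n in enumerate(nums):
    if n in seen_at:
        print("duplicate:", n, "first seen at index", seen_at[n])
    else:
        seen_at[n] = i
```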
When to use a hash map vs other structures
Hash maps are the right choice when:
- You need fast lookups by key on unsorted data
- You are turning an O(n^2) brute force into O(n) by remembering what you have seen
- You need to count frequencies of elements
- You need to group elements by some computed property
Consider something else when:
- You need elements in sorted order (use a balanced BST or sorted array)
- The keys are dense integers in a known range (a plain array works and uses less memory)
- You need range queries like "all keys between 10 and 20" (balanced BST or segment tree)
- Memory is very tight (hash maps have overhead from hashing, pointers, and unused buckets)
Hash map interview problems: the essential list
Hash maps appear in more coding interview problems than you might expect. Here are the ones every developer should know, grouped by the technique they use.
Frequency counting
These problems use a hash map to count how many times each element appears, then use those counts to answer the question.
- Valid Anagram - Counts character frequencies in two strings and compares them. The classic "are these the same letters?" check.
- Top K Frequent Elements - Counts element frequencies, then uses bucket sort or a heap to extract the K most common.
- Contains Duplicate - Uses a hash set to check whether any element appears more than once. The simplest hash-based problem.
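To make the frequency-counting idea concrete, here is one possible sketch of Valid Anagram using collections.Counter (comparing two manually built count dicts works just as well):

```python
from collections import Counter

def is_anagram(s, t):
    # Two strings are anagrams if every character appears the same number of times.
    return Counter(s) == Counter(t)

print(is_anagram("listen", "silent"))  # True
print(is_anagram("rat", "car"))        # False
```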
Complement lookup
These problems store what you have seen so far and check whether the complement you need is already there. The core trick behind Two Sum and many variations.
- Two Sum - For each number, check if target - number is in the map. If yes, you have found your pair. If not, store the current number and move on (sketched below).
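A sketch of that complement lookup in Python, returning indices and assuming exactly one solution exists, as in the usual problem statement:

```python
def two_sum(nums, target):
    seen = {}  # value -> index where we saw it
    for i, n in enumerate(nums):
        complement = target - n
        if complement in seen:         # the pair is complete
            return [seen[complement], i]
        seen[n] = i                    # remember this number for later lookups
    return []                          # no pair found

print(two_sum([2, 7, 11, 15], 9))  # [0, 1]
```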
Grouping
These problems use a hash map to bucket elements by some shared property, where the key is a canonical form and the value is a list of matching elements.
- Group Anagrams - Sorts each word's characters to create a canonical key, then groups all words with the same key together.
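One way to sketch that grouping in Python, using the sorted word as the canonical key:

```python
from collections import defaultdict

def group_anagrams(words):
    groups = defaultdict(list)
    for word in words:
        key = "".join(sorted(word))   # anagrams share the same sorted form
        groups[key].append(word)
    return list(groups.values())

print(group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
```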
Hash set with clever iteration
Some problems use hash sets not just for membership testing, but as the foundation of a clever traversal strategy.
- Longest Consecutive Sequence - Puts all numbers in a set, then for each sequence start (a number where n - 1 is not in the set), counts how far the sequence extends.
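A sketch of that traversal in Python. Each number is visited a constant number of times, which keeps the whole pass O(n):

```python
def longest_consecutive(nums):
    num_set = set(nums)
    best = 0
    for n in num_set:
        if n - 1 in num_set:          # not a sequence start, skip it
            continue
        length = 1
        while n + length in num_set:  # extend the run from its start
            length += 1
        best = max(best, length)
    return best

print(longest_consecutive([100, 4, 200, 1, 3, 2]))  # 4 (the run 1, 2, 3, 4)
```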
Hash map as auxiliary structure
In these problems, the hash map is not the main data structure but plays a supporting role in a more involved design.
- LRU Cache - Combines a hash map (for O(1) key lookup) with a doubly linked list (for O(1) eviction ordering) to build a cache with O(1) get and put.
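In Python, one common sketch leans on collections.OrderedDict, which already pairs a hash map with an internal doubly linked list, rather than hand-rolling both pieces:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # keys kept in recency order, oldest first

    def get(self, key):
        if key not in self.cache:
            return -1
        self.cache.move_to_end(key)         # mark as most recently used
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
```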
The patterns behind these problems
If you want to go deeper into the recurring techniques that connect these problems, our patterns guide covers all three core hash map patterns in detail: frequency counting, complement lookup, and grouping.
- Hash Map Patterns - The three core patterns behind every hash map interview problem, with code examples and walkthrough diagrams.
Wrapping up
The hash map data structure is one of the most important tools in your toolkit. The ability to turn any key into an O(1) lookup is what makes it possible to eliminate nested loops, count frequencies in a single pass, and group elements without sorting.
Under the hood, it is an array with a hash function on top. The hash function converts keys to indices. Collisions are handled by chaining (linked lists in each bucket) or open addressing (probing for the next empty slot). Resizing keeps the load factor low enough that operations stay O(1) on average.
The problems listed above are not random. Each one teaches a specific way to leverage hash maps in interviews. Master them, and you will have covered the vast majority of hash table interview questions you will encounter.