Time Based Key-Value Store: Binary Search Meets Hash Maps
Time Based Key-Value Store is LeetCode problem 981, and it is one of the cleanest problems for learning how to combine two fundamental data structures. You need a hash map for fast key lookups and binary search on sorted timestamps for versioned retrieval. If you have solved basic hash map and binary search problems before, this one ties them together in a design context that mirrors real-world versioned storage systems.
The problem
Design a data structure that stores key-value pairs where each value is associated with a timestamp. It supports two operations:
set(key, value, timestamp): stores the key with the given value at the given timestamp.get(key, timestamp): returns the value associated with the key at the largest timestamp that is less than or equal to the given timestamp. If no such timestamp exists, return"".
All timestamps for set calls are strictly increasing. This means each key's timestamps are always in sorted order, which is the critical property that makes this problem efficient.
Example:
set("foo", "bar", 1)
set("foo", "bar2", 4)
get("foo", 1) -> "bar"
get("foo", 3) -> "bar" (timestamp 1 is the largest <= 3)
get("foo", 4) -> "bar2"
get("foo", 5) -> "bar2" (timestamp 4 is the largest <= 5)
The brute force approach
The simplest approach for get is to store all (timestamp, value) pairs in a list and scan backwards from the end until you find a timestamp that does not exceed the target.
class TimeMapBrute:
def __init__(self):
self.store = {}
def set(self, key, value, timestamp):
if key not in self.store:
self.store[key] = []
self.store[key].append((timestamp, value))
def get(self, key, timestamp):
if key not in self.store:
return ""
for ts, val in reversed(self.store[key]):
if ts <= timestamp:
return val
return ""
This works, but get is O(n) in the worst case where n is the number of entries for that key. If you call set a million times for the same key, every get call could scan all million entries. That is too slow when the problem allows up to 2 * 10^7 total operations.
The key insight
The problem guarantees that timestamps for set are strictly increasing. That means each key's list of (timestamp, value) pairs is already sorted by timestamp, with no extra work. Sorted data plus a search problem is the classic setup for binary search. You can find the right timestamp in O(log n) instead of O(n).
You do not need to sort anything. You do not need a balanced BST or a sorted container. Just append to a list on set, and binary search on get. The problem gives you the sorted invariant for free.
Step-by-step walkthrough
Step 1: The data structure - HashMap of key to sorted list
Each key in the hash map points to a list of (timestamp, value) pairs. Since timestamps for set() calls are strictly increasing, the list stays sorted automatically. No extra sorting needed.
Step 2: set() appends to the list
When you call set(key, value, timestamp), look up the key in the hash map. If it does not exist, create an empty list. Then append (timestamp, value) to the end. Because timestamps are strictly increasing, appending maintains sorted order. This is O(1).
Step 3: get() uses binary search on timestamps
When you call get(key, timestamp), look up the key in the hash map. If it is missing, return "". Otherwise, binary search the list for the largest timestamp that does not exceed the target. This is O(log n) where n is the number of entries for that key.
Step 4: Binary search finds the largest timestamp that does not exceed the target
Standard binary search with a twist. When values[mid].timestamp is within range (at or below the target), save the value as a candidate and search right for something closer. When it exceeds the target, search left. The last saved candidate is the answer.
Step 5: No valid timestamp? Return ""
If the target timestamp is smaller than every stored timestamp for that key, binary search finds no candidate. Return an empty string. For example, get("foo", 0) returns "" because no timestamp is 0 or less.
The code
Here is the clean Python solution for LeetCode 981:
class TimeMap:
def __init__(self):
self.store = {}
def set(self, key, value, timestamp):
if key not in self.store:
self.store[key] = []
self.store[key].append((timestamp, value))
def get(self, key, timestamp):
if key not in self.store:
return ""
values = self.store[key]
lo, hi = 0, len(values) - 1
result = ""
while lo <= hi:
mid = (lo + hi) // 2
if values[mid][0] <= timestamp:
result = values[mid][1]
lo = mid + 1
else:
hi = mid - 1
return result
Let's walk through each piece.
self.store = {} is a hash map where each key maps to a list of (timestamp, value) tuples.
set checks if the key exists. If not, it creates an empty list. Then it appends the (timestamp, value) pair. Since timestamps are strictly increasing, the list stays sorted. This is O(1).
get first checks if the key exists. If not, return "". Otherwise, it runs binary search on the list. The variable result tracks the best candidate found so far. When values[mid][0] (the timestamp at index mid) is at or below the target, that entry is a valid candidate, so we save its value and search right for something closer. When it exceeds the target, we search left. After the loop, result holds the value at the largest valid timestamp, or "" if none was found.
Python's bisect module could also work here. bisect.bisect_right(timestamps, target) gives you the insertion point, and the entry just before it (if it exists) is your answer. The manual binary search above is more explicit and interview-friendly, since it shows exactly how the search works.
Complexity analysis
| Metric | Value |
|---|---|
| Time (set) | O(1) per call, just a hash map lookup and list append |
| Time (get) | O(log n) per call, where n is the number of entries for that key |
| Space | O(total entries), every (key, timestamp, value) triple is stored once |
This is optimal. You cannot do better than O(1) for set (you must store the data), and O(log n) for get is the best you can achieve with comparison-based search on sorted data.
The building blocks
This problem combines two reusable patterns that show up across many LeetCode problems.
1. Hash map as a top-level index
The hash map gives you O(1) access to the right key's data. Without it, you would need to scan all keys to find the one you want. This same pattern appears in problems like Design HashMap and LRU Cache, where a hash map provides the fast lookup layer and another data structure handles the details.
if key not in self.store:
self.store[key] = []
self.store[key].append(item)
2. Binary search for bounded lookup
The binary search in get is not looking for an exact match. It is looking for the largest value that does not exceed a bound. This "upper bound" style of binary search appears in problems like Koko Eating Bananas (search for the smallest valid speed) and Find First and Last Position (search for boundaries). The template is the same: track a candidate, decide which half to search, and return the best candidate after the loop.
lo, hi = 0, len(values) - 1
result = default
while lo <= hi:
mid = (lo + hi) // 2
if values[mid] <= target:
result = values[mid]
lo = mid + 1
else:
hi = mid - 1
return result
Edge cases
Before submitting, make sure your solution handles these:
- Key does not exist:
get("missing", 5)should return"". Check for key membership before searching. - Timestamp smaller than all stored timestamps:
get("foo", 0)when the earliest timestamp is 1. Binary search finds no valid candidate, soresultstays as"". - Exact timestamp match:
get("foo", 4)when timestamp 4 exists. The binary search finds it becausevalues[mid][0] <= timestampis true. - Timestamp larger than all stored timestamps:
get("foo", 100)should return the value at the latest timestamp. Binary search will keep moving right and save the last entry as the candidate. - Single entry for a key:
set("x", "y", 1)thenget("x", 1). Binary search with one element.lo == hi == mid == 0, the timestamp matches, result is set, loop ends. - Multiple keys with different histories: each key's list is independent. Binary search only touches the entries for the requested key.
From understanding to recall
The logic is clean. A hash map for keys, a sorted list for timestamps, binary search for lookups. You probably get it right now. But writing it correctly in an interview means getting the binary search details right: using <= to track candidates, moving lo = mid + 1 when the timestamp is valid, and initializing result to "".
These are small details, but they matter. If you set hi = mid instead of hi = mid - 1, you might loop forever. If you forget to update result when the timestamp is valid, you will return "" for queries that should succeed. Spaced repetition drills these details until they are automatic.
LeetCode 981 is an excellent SRS candidate because it combines two patterns (hash map indexing and bounded binary search) in about 15 lines of code. Once you can write it from memory, you have both building blocks ready for harder design problems.
Related posts
- Binary Search: The Foundation - The core binary search template that powers the
getoperation in this problem - Design HashMap - Building a hash map from scratch, the same structure used as the top-level index here
- Koko Eating Bananas - Another binary search variant where you search for a bounded answer rather than an exact match