LFU Cache: Frequency-Based Eviction in O(1)
LFU Cache is LeetCode problem 460. If you have solved LRU Cache (problem 146), you already know the trick of pairing a hash map with a doubly linked list. LFU Cache takes that idea further. Instead of evicting the least recently used item, you evict the least frequently used item. And when two items have the same frequency, you break the tie by evicting the least recently used one among them.
That extra layer of complexity turns a medium problem into a hard one. You need to track how many times each key has been accessed, group keys by frequency, and still keep everything O(1). It is one of the best data structure design problems out there.
The problem
Design a data structure that follows the Least Frequently Used (LFU) eviction policy. It should support two operations:
get(key)- Return the value if the key exists, otherwise return -1. This counts as one use of the key.put(key, value)- Insert or update the key-value pair. If inserting causes the cache to exceed its capacity, evict the least frequently used key. If there is a tie in frequency, evict the least recently used key among the tied keys.
Both operations must run in O(1) average time.
Why this problem matters
LFU caches show up in real systems. CDNs, database buffer pools, and operating system page caches all use frequency-based eviction (or variants of it). The idea is that items accessed many times are more valuable than items accessed once, so you keep the popular items around longer.
The interview version forces you to think carefully about data structure composition. You cannot just bolt a counter onto an LRU cache and call it done. The O(1) constraint means you need a way to find and remove the minimum-frequency item instantly, bump items between frequency groups instantly, and track the current minimum frequency without scanning.
The key insight
You need two hash maps and a collection of doubly linked lists, one per frequency:
key_to_nodemaps each key to its node. This gives O(1) lookup for get and put.freq_to_dllmaps each frequency count to a doubly linked list of all nodes with that frequency. Within each list, nodes are ordered by recency (the tail end is most recent).min_freqis a single integer tracking the smallest frequency currently in the cache. When you need to evict, you go straight tofreq_to_dll[min_freq]and remove the head of that list (the least recently used item at the lowest frequency).
When you access a key (get or put on an existing key), you remove the node from its current frequency list, increment its frequency, and add it to the list for the new frequency. If the old frequency list becomes empty and it was the min_freq, you bump min_freq up by one.
When you insert a brand new key, its frequency starts at 1, so min_freq resets to 1.
The code
class Node:
def __init__(self, key=0, val=0):
self.key = key
self.val = val
self.freq = 1
self.prev = None
self.next = None
class DLL:
def __init__(self):
self.head = Node()
self.tail = Node()
self.head.next = self.tail
self.tail.prev = self.head
self.size = 0
def add(self, node):
prev = self.tail.prev
prev.next = node
node.prev = prev
node.next = self.tail
self.tail.prev = node
self.size += 1
def remove(self, node):
node.prev.next = node.next
node.next.prev = node.prev
self.size -= 1
def pop_lru(self):
node = self.head.next
self.remove(node)
return node
class LFUCache:
def __init__(self, capacity):
self.cap = capacity
self.min_freq = 0
self.key_to_node = {}
self.freq_to_dll = {}
def _update(self, node):
freq = node.freq
self.freq_to_dll[freq].remove(node)
if self.freq_to_dll[freq].size == 0:
del self.freq_to_dll[freq]
if self.min_freq == freq:
self.min_freq += 1
node.freq += 1
if node.freq not in self.freq_to_dll:
self.freq_to_dll[node.freq] = DLL()
self.freq_to_dll[node.freq].add(node)
def get(self, key):
if key not in self.key_to_node:
return -1
node = self.key_to_node[key]
self._update(node)
return node.val
def put(self, key, value):
if self.cap == 0:
return
if key in self.key_to_node:
node = self.key_to_node[key]
node.val = value
self._update(node)
return
if len(self.key_to_node) >= self.cap:
lru = self.freq_to_dll[self.min_freq].pop_lru()
del self.key_to_node[lru.key]
node = Node(key, value)
self.key_to_node[key] = node
self.min_freq = 1
if 1 not in self.freq_to_dll:
self.freq_to_dll[1] = DLL()
self.freq_to_dll[1].add(node)
Why this works
Every piece of the O(1) puzzle has a dedicated structure handling it:
- Finding a key:
key_to_node[key]is a hash map lookup. O(1). - Removing a node from its frequency list: The node has
prevandnextpointers. Unlinking is O(1). - Adding a node to a new frequency list: Insert before the tail sentinel of the target DLL. O(1).
- Finding the eviction candidate:
freq_to_dll[min_freq]gives you the right list. The head of that list is the LRU item at that frequency. O(1). - Tracking min_freq: When you bump a node's frequency from
ftof+1, check if the old list atfis now empty. If it is andfwasmin_freq, incrementmin_freq. When you insert a brand new key,min_freqresets to 1. No scanning required.
The _update helper does the heavy lifting. It handles the remove-from-old-list, potentially-adjust-min_freq, create-new-list-if-needed, add-to-new-list dance. Both get and put (for existing keys) call it, so the logic is not duplicated.
Visual walkthrough
Let's trace through a cache with capacity 2. Pay attention to how nodes move between frequency buckets and how min_freq changes.
Step 1: put(1, A)
New key 1 inserted with freq=1. It goes into the freq=1 bucket. min_freq is set to 1.
Step 2: put(2, B)
New key 2 inserted with freq=1. Both keys sit in the freq=1 bucket. Key 1 is the LRU within this frequency group (inserted earlier).
Step 3: get(1) returns A
Returns A
Accessing key 1 bumps its freq from 1 to 2. It moves out of the freq=1 bucket into a new freq=2 bucket. min_freq stays 1 because key 2 is still there.
Step 4: put(3, C) -- cache full, evict LFU
Cache is full (capacity=2). The LFU item is key 2 (freq=1, and it is the LRU within that frequency). Evict key 2, insert key 3 with freq=1. min_freq resets to 1.
Step 5: get(2) -- key was evicted
Returns -1
Key 2 was evicted in the previous step. It is not in the hash map, so we return -1. The cache is unchanged.
The critical moment is Step 4. The cache is full and we need to evict. We look at freq_to_dll[min_freq], which is freq_to_dll[1]. That list contains key 2 (it was never accessed after insertion, so its frequency stayed at 1). Key 1 was bumped to freq=2 by the get call, so it survives.
Complexity analysis
| Operation | Time | Space |
|---|---|---|
| get | O(1) | O(capacity) total |
| put | O(1) | O(capacity) total |
Time: Every operation does a constant number of hash map lookups, linked list insertions, and linked list removals. All O(1).
Space: The key_to_node map stores at most capacity entries. The freq_to_dll map stores at most capacity linked lists (in the worst case, every key has a unique frequency). Each node appears in exactly one list. Total space is O(capacity).
Building blocks
Hash map plus doubly linked list. This is the same pairing you see in LRU Cache. The hash map gives O(1) key access. The doubly linked list gives O(1) ordered insertion and removal. What makes LFU harder is that you need multiple linked lists, one per frequency, instead of a single one.
LRU within each frequency group. Within a single frequency bucket, ties are broken by recency. The node closest to the head of the list was added to that frequency group earliest, making it the LRU within that group. This is exactly the same ordering logic as a plain LRU cache, just applied independently at each frequency level.
min_freq tracking. This integer is what makes eviction O(1). Without it, you would have to scan all frequency buckets to find the lowest non-empty one. The key insight is that min_freq can only change in two ways: it increments by 1 when the last node leaves the current min_freq bucket, or it resets to 1 when a brand new key is inserted. It never jumps by more than 1 or decreases to an arbitrary value.
Edge cases
- Capacity of 0. The problem allows capacity=0. Every put should be a no-op and every get should return -1. Guard against this at the top of put.
- Updating an existing key. When you put a key that already exists, you update the value and bump the frequency. You do not insert a second node or evict anything.
- All keys at the same frequency. When every key has the same frequency, eviction picks the LRU among them. This is why each frequency list must maintain insertion order.
- Frequency grows without bound. There is no cap on frequency. A key accessed 1000 times has freq=1001. The
freq_to_dllmap can have many entries, but at mostcapacityof them are non-empty at any time, and the total number of nodes across all lists is at mostcapacity.
From understanding to recall
LFU Cache has even more moving parts than LRU Cache. Two hash maps, a DLL class, a Node class with a frequency counter, and the min_freq variable that requires careful reasoning about when it changes. You will understand all of it on first read. Reproducing it from memory a week later is a different challenge entirely.
The failure mode is predictable: you remember "multiple linked lists by frequency" but forget the details. How does min_freq update? Do you create new frequency lists lazily or eagerly? What happens to the old list when the last node leaves? Where do new keys go?
The only reliable fix is repetition. Write the full solution from scratch, verify it, then do it again after a gap. After a few rounds, the _update helper and the min_freq logic become muscle memory.
Related posts
- LRU Cache - The simpler recency-based eviction cache
- Design Twitter - Another system design problem using hash maps
- All O'one Data Structure - Another frequency tracking data structure