Design Underground System: Two Hash Maps for Travel Time Tracking
Design Underground System is LeetCode problem 1396. You need to build a class that tracks passengers checking in and out of subway stations, then answers queries about average travel times between any two stations. It is a clean design problem where the challenge is not algorithmic complexity but choosing the right data structures to support all three operations efficiently.
Why this problem matters
Design problems test whether you can translate a real-world scenario into the right data structures. There is no trick here, no clever algorithm. The question is: given three operations (check in, check out, get average), what do you store and how do you organize it so every operation runs in constant time?
This is exactly the kind of thinking you do when building real systems. A ride-sharing app, a logistics tracker, or a monitoring dashboard all face the same core problem: record events as they happen, then answer aggregate queries later. Getting comfortable with this pattern makes you faster in both interviews and production code.
The key insight
Use two hash maps:
check_ins: maps passenger id to (station, time) for currently active tripsroute_data: maps (start, end) route tuple to (total_time, trip_count) for completed trips
When a passenger checks in, store their start station and time in check_ins. When they check out, look up their check-in record, compute the trip duration, and fold it into the running totals in route_data. When someone asks for the average, just divide total time by trip count.
The beauty is that each operation touches exactly one hash map (check in and check out touch check_ins, check out also updates route_data, and get average reads route_data). Nothing requires scanning a list or sorting anything.
The solution
class UndergroundSystem:
def __init__(self):
self.check_ins = {}
self.route_data = {}
def check_in(self, id: int, station_name: str, t: int) -> None:
self.check_ins[id] = (station_name, t)
def check_out(self, id: int, station_name: str, t: int) -> None:
start_station, start_time = self.check_ins.pop(id)
route = (start_station, station_name)
if route in self.route_data:
total, count = self.route_data[route]
self.route_data[route] = (total + t - start_time, count + 1)
else:
self.route_data[route] = (t - start_time, 1)
def get_average_time(self, start_station: str, end_station: str) -> float:
total, count = self.route_data[(start_station, end_station)]
return total / count
Using pop in check_out removes the passenger from check_ins in one step. This keeps the active-trip map lean and avoids stale entries. The problem guarantees each passenger has at most one active trip at a time, so there is no risk of collision.
Visual walkthrough
Let's trace through five operations and watch how the two hash maps evolve. Two passengers both travel from station A to station B, and then we query the average.
Step 1: checkIn(1, "A", 3). Passenger 1 enters station A at time 3.
Store passenger 1 with their start station and check-in time. No completed trips yet.
Step 2: checkIn(2, "A", 8). Passenger 2 enters station A at time 8.
Both passengers are now in transit from station A. Still no completed trips.
Step 3: checkOut(1, "B", 10). Passenger 1 exits at station B at time 10.
Trip time = 10 - 3 = 7. Remove passenger 1 from check_ins. Add route A->B with total 7 and count 1.
Step 4: checkOut(2, "B", 12). Passenger 2 exits at station B at time 12.
Trip time = 12 - 8 = 4. Update route A->B: total = 7 + 4 = 11, count = 1 + 1 = 2.
Step 5: getAverageTime("A", "B"). Query the average travel time from A to B.
Look up route ("A","B") in route_data. Average = 11 / 2 = 5.5.
The final average is 5.5 because the two trips took 7 and 4 time units respectively, and (7 + 4) / 2 = 5.5. Every operation was a single hash map lookup or update.
Complexity analysis
| Operation | Time | Space |
|---|---|---|
| checkIn | O(1) | O(n) |
| checkOut | O(1) | O(n) |
| getAverageTime | O(1) | O(n) |
Time: Every operation is O(1). Check in is a single dictionary insert. Check out is a dictionary pop plus a dictionary update. Get average is a dictionary lookup plus one division.
Space: O(n) overall, where n is the number of passengers and routes. The check_ins map holds at most one entry per active passenger, and route_data holds one entry per unique route pair.
The building blocks
1. Hash map for O(1) record storage
The check_ins map is the simplest possible use of a hash map: store a value by key, retrieve it later, delete it when done. This same pattern appears in Two Sum, Group Anagrams, and any problem that needs fast key-value access.
2. Tuple keys for composite lookups
Using a tuple (start_station, end_station) as a dictionary key is a common Python pattern that many people forget about. It lets you look up data by a pair of values in O(1) without building a nested dictionary. You will see this same technique in problems like grid-based DP or graph edge lookups.
3. Running aggregates instead of storing raw data
Instead of keeping a list of every trip time and computing the average from scratch each time, you store just two numbers per route: the running total and the count. This is the same idea behind online algorithms for mean, variance, and other statistics. It keeps space constant per route and makes the query O(1).
Edge cases
- Single trip on a route. The average is just that one trip's duration. Make sure your division works when count is 1.
- Multiple passengers on the same route. The running total and count correctly accumulate across all passengers.
- Passenger checks in but has not checked out yet. They remain in
check_ins. Queries only reflect completed trips, so this does not affect averages. - Same passenger making multiple trips. The problem guarantees a passenger will not check in again before checking out. After checkout, their entry is removed from
check_ins, so the next check-in creates a fresh entry. - Many different routes. Each unique (start, end) pair gets its own entry in
route_data. The map grows with the number of distinct routes, not the number of total trips.
From understanding to recall
This problem feels easy once you see the two-map structure. That is the danger. You will read the solution, nod along, and then blank when you try to reproduce it next week. The stumbling point is usually the check_out method: remembering to pop the check-in record, compute the duration, and update the route aggregate in the right order.
The details matter. Do you pop or just read from check_ins? Do you store the duration or the absolute times in route_data? Do you initialize the route entry or update it? These small decisions trip you up under pressure. The only fix is to write it out from scratch a few times until the flow becomes automatic.
Related posts
- LRU Cache - Another design problem that pairs hash maps with a second data structure for O(1) operations
- Design HashMap - Build the hash map itself from scratch using chaining or open addressing
- Design Add and Search Words - A design problem using a trie where the data structure choice drives the entire solution