Remove Duplicate Letters: Greedy Monotonic Stack
316. Remove Duplicate Letters gives you a string s and asks you to remove duplicate letters so that every letter appears exactly once, and the result is the lexicographically smallest string among all valid results. You cannot reorder the remaining letters; you can only delete some of them.
For example: "bcabc" becomes "abc", and "cbacdcbc" becomes "acdb". The second example is trickier than it looks — "abcd" seems smaller, but you cannot achieve it because after choosing 'a' and 'b' early, the only 'd' appears before the only remaining 'c', so you are forced into "acdb".
The approach: greedy monotonic stack
The key insight is this: when you encounter a character that is smaller than the top of your current result stack, you should pop the top — but only if that top character appears again later in the string. If it does not appear later, you can never recover it, so you must keep it.
That gives you two conditions that must both be true before you pop:
- The top of the stack is greater than the current character.
- The top character still appears later in the string (it will get another chance).
If either condition fails, you stop popping and push the current character instead. You also skip any character that is already in your result stack, because adding it again would create a duplicate.
Before iterating, build a last_occurrence map that stores the last index where each character appears. This lets you check condition 2 in O(1).
Here is the full algorithm:
- Build
last = {c: last index of c in s}for all characters. - Maintain a
stack(your result so far) and aseenset for O(1) membership checks. - For each character
cat indexi:- If
cis already inseen, skip it. - Otherwise, while the stack is non-empty and the top is greater than
candlast[top] > i, pop the top and remove it fromseen. - Push
cand add it toseen.
- If
- Join the stack into a string.
def removeDuplicateLetters(s: str) -> str:
last = {c: i for i, c in enumerate(s)}
stack = []
seen = set()
for i, c in enumerate(s):
if c in seen:
continue
while stack and stack[-1] > c and last[stack[-1]] > i:
seen.discard(stack.pop())
stack.append(c)
seen.add(c)
return "".join(stack)
i=0, c='b': stack is empty, push 'b'
stack (bottom to top)
last_occurrence
Stack is empty so there is nothing to compare against. Push 'b' directly.
i=1, c='c': 'c' > 'b' so no pop. push 'c'
stack (bottom to top)
last_occurrence
'c' is greater than 'b', so removing 'b' now would not improve lexicographic order. Push 'c'.
i=2, c='a': top='c'>'a' AND last['c']=4>2, pop 'c'. top='b'>'a' AND last['b']=3>2, pop 'b'. push 'a'
stack (bottom to top)
last_occurrence
'a' is smaller than both 'c' and 'b', and both letters reappear later, so we pop them for a better result.
i=3, c='b': 'b' not in stack yet, top='a'<'b' so no pop. push 'b'
stack (bottom to top)
last_occurrence
Top is 'a' which is less than 'b', so we cannot pop. Push 'b'.
i=4, c='c': 'c' not in stack yet, top='b'<'c' so no pop. push 'c'
stack (bottom to top)
last_occurrence
Top is 'b' which is less than 'c'. Push 'c'. Stack is now the final answer: ['a','b','c'].
Complexity
| Time | Space | |
|---|---|---|
| Greedy monotonic stack | O(n) | O(1) |
Time is O(n) because each character is pushed onto the stack at most once and popped at most once, so the total number of push and pop operations across the entire loop is bounded by 2n. The last map and the seen set both hold at most 26 entries (the lowercase alphabet), so space is O(1) regardless of input length.
The building blocks
Monotonic stack pattern. The stack maintains a specific order invariant. Here, the invariant is that the stack stays in increasing lexicographic order as long as popping is allowed. This is the same mechanical idea as Daily Temperatures, where you pop indices off the stack when you find a warmer day. The difference is that Daily Temperatures uses a strictly decreasing stack on temperatures, while this problem uses a strictly increasing stack on characters.
Greedy invariant. The reason you can pop is that replacing a larger character with a smaller one always produces a lexicographically better result at that position, and the greedy choice is locally and globally optimal here. The condition last[top] > i is what makes the greedy safe: it guarantees you will still be able to include that character in the result later.
Seen set for deduplication. The seen set is what prevents you from adding the same character twice. Without it, you would push duplicate characters and violate the problem constraint. It mirrors the same role that a visited set plays in graph traversal.
Edge cases
- Single character: The loop runs once, pushes the character, and returns it immediately.
- All same character (e.g.,
"aaaa"): Every character after the first is already inseen, so they are all skipped. You return just"a". - Already sorted (e.g.,
"abc"): No character is ever greater than the one that follows it, so no pops occur. The stack fills up in order and the output equals the first occurrence of each letter. - Reverse sorted, each appearing once (e.g.,
"cba"): No character reappears, solast[top] > iis never true. No pops occur. The output is the full input"cba"— you cannot improve on it.
From understanding to recall
The logic is easy to follow when you trace through it step by step. What is easy to forget under pressure is the exact pair of conditions that must hold simultaneously before you pop.
The two conditions are:
stack[-1] > c— the top is lexicographically larger than the current character.last[stack[-1]] > i— the top will appear again later (so we are not losing it forever).
Miss the first condition and you destroy the increasing order of your result. Miss the second condition and you silently drop a character that never comes back, making the output invalid.
Practicing this problem with spaced repetition trains you to write both conditions automatically, without having to reconstruct the reasoning from scratch each time.
Related posts
- Daily Temperatures - The canonical monotonic stack problem; same pop-when-better mechanics applied to temperatures
- Largest Rectangle in Histogram - Another monotonic stack problem where the condition to pop is crossing a boundary height
- Stack Patterns - A guide to recognizing when a monotonic stack is the right tool