Number of Atoms: Stack-Based Formula Parsing
Number of Atoms is LeetCode problem 726. Given a chemical formula as a string, return the count of each atom sorted alphabetically.
The problem
Parse a chemical formula like "K4(ON(SO3)2)2" and return the atom counts as a sorted string: "K4N2O14S4". The formula can contain:
- Element names (uppercase letter optionally followed by lowercase letters)
- Numbers after an element or closing parenthesis
- Parentheses for grouping, which multiply everything inside
The approach: stack of hash maps
Use a stack where each level holds a hash map of atom counts. When you see an opening parenthesis, push a new empty map. When you see a closing parenthesis followed by a number, pop the top map, multiply all its counts, and merge into the map below.
- Start with one empty map on the stack
- Parse the formula character by character
- Element name: read the full name and its count, add to the current (top) map
(: push a new map): pop the top map, read the multiplier, multiply all counts, merge into the new top- At the end, the stack has one map with final counts
Visual walkthrough
Let's trace through "Mg(OH)2" which should produce "H2MgO".
Step 1: Initialize
Start with one empty map on the stack. Begin scanning from index 0.
Step 2: Read "Mg" (element, count=1)
Read uppercase 'M' then lowercase 'g'. No digit follows, so count defaults to 1. Add Mg:1 to the top map.
Step 3: Read "(" (open paren)
Opening parenthesis pushes a new empty map onto the stack. The stack now has 2 levels.
Step 4: Read "O" (element, count=1)
Read element 'O' with no digit after it. Add O:1 to the top map (level 2).
Step 5: Read "H" (element, count=1)
Read element 'H' with no digit. Add H:1 to the top map. Level 2 now has {O:1, H:1}.
Step 6: Read ")2" (close paren, multiplier=2)
Pop top map {O:1, H:1}, multiply each count by 2, merge into level below. Stack: [{Mg:1, O:2, H:2}].
Step 7: Done. Sort and format.
Sort keys alphabetically: H, Mg, O. Format: H2 + Mg + O = "H2MgO". Counts of 1 are omitted in output.
The code
def count_of_atoms(formula):
stack = [{}]
i = 0
n = len(formula)
while i < n:
if formula[i] == "(":
stack.append({})
i += 1
elif formula[i] == ")":
i += 1
j = i
while i < n and formula[i].isdigit():
i += 1
mult = int(formula[j:i]) if j < i else 1
top = stack.pop()
for atom, count in top.items():
stack[-1][atom] = stack[-1].get(atom, 0) + count * mult
else:
j = i
i += 1
while i < n and formula[i].islower():
i += 1
name = formula[j:i]
j = i
while i < n and formula[i].isdigit():
i += 1
count = int(formula[j:i]) if j < i else 1
stack[-1][name] = stack[-1].get(name, 0) + count
result = []
for atom in sorted(stack[-1]):
count = stack[-1][atom]
result.append(atom + (str(count) if count > 1 else ""))
return "".join(result)
Key details to notice:
- The stack always has at least one map (the base level)
- Element names start with an uppercase letter followed by zero or more lowercase letters
- Numbers default to 1 when absent (both after elements and after closing parentheses)
- After processing all characters, you sort the atoms alphabetically and format the output
The multiplier after ) applies to everything inside that group. When you pop the top map and merge it down, you multiply each count by the multiplier. This naturally handles nested parentheses because inner groups were already resolved when they were popped.
Complexity analysis
| Metric | Value | Reasoning |
|---|---|---|
| Time | O(n + k log k) | O(n) to parse the formula where n is the string length. O(k log k) to sort the k distinct atoms. |
| Space | O(n) | The stack can hold up to O(n) maps in deeply nested formulas. Each map holds atom counts. |
The building blocks
Stack for nested structures
Pushing on ( and popping on ) is the standard stack pattern for handling nested groupings. This same technique appears in:
- Decode String (LeetCode 394) for nested bracket expansion
- Basic Calculator (LeetCode 224) for nested arithmetic expressions
- Evaluate Reverse Polish Notation (LeetCode 150) for stack-based expression evaluation
Character-by-character parsing
Reading element names (uppercase + lowercase*) and numbers (digit*) is a common parsing primitive. You advance the index i while characters match the expected pattern, then extract the substring.
Edge cases to watch for
- No parentheses.
"H2O"parses directly into{"H": 2, "O": 1}. - Nested parentheses.
"K4(ON(SO3)2)2"requires multiple levels of stack nesting. - No number after element.
"Mg"means count is 1. - No number after
)."(OH)"means the multiplier is 1. - Single atom.
"O"returns"O". - Large multipliers.
"H1000"produces"H1000".
From understanding to recall
Number of Atoms combines string parsing with nested structure handling. The stack of hash maps is the key insight: each parenthesis level gets its own counting space, and closing a group merges counts upward with a multiplier. The parsing itself is methodical: read the type of character (letter, digit, paren) and handle accordingly.
Understanding the approach is one thing. Reproducing the exact parsing logic under pressure, knowing where to advance i, when to extract substrings, and how to handle default counts of 1, requires practice. Spaced repetition builds the muscle memory for these parsing details.
Related posts
- Decode String - Stack-based parsing for nested bracket expansion, LeetCode 394
- Basic Calculator - Parsing arithmetic with parentheses using a stack, LeetCode 224
- Evaluate Reverse Polish Notation - Stack-based expression evaluation, LeetCode 150
Visual Learner? Explore how stack-based algorithms work with our Stack Interactive Visualization.