Skip to content
← All posts

Number of Atoms: Stack-Based Formula Parsing

4 min read
leetcodeproblemhardstringsstackshash-map

Number of Atoms is LeetCode problem 726. Given a chemical formula as a string, return the count of each atom sorted alphabetically.

The problem

Parse a chemical formula like "K4(ON(SO3)2)2" and return the atom counts as a sorted string: "K4N2O14S4". The formula can contain:

  • Element names (uppercase letter optionally followed by lowercase letters)
  • Numbers after an element or closing parenthesis
  • Parentheses for grouping, which multiply everything inside
formulaMg(OH)2parse + countsorted result: "H2MgO"H: 2Mg: 1O: 1
Mg(OH)2 is parsed into atom counts: H=2, Mg=1, O=1. Output sorted alphabetically: "H2MgO".

The approach: stack of hash maps

Use a stack where each level holds a hash map of atom counts. When you see an opening parenthesis, push a new empty map. When you see a closing parenthesis followed by a number, pop the top map, multiply all its counts, and merge into the map below.

  1. Start with one empty map on the stack
  2. Parse the formula character by character
  3. Element name: read the full name and its count, add to the current (top) map
  4. (: push a new map
  5. ): pop the top map, read the multiplier, multiply all counts, merge into the new top
  6. At the end, the stack has one map with final counts

Visual walkthrough

Let's trace through "Mg(OH)2" which should produce "H2MgO".

Step 1: Initialize

formulaMg(OH)20123456stack (top at top)L0{}

Start with one empty map on the stack. Begin scanning from index 0.

Step 2: Read "Mg" (element, count=1)

formulaMg(OH)20123456stack (top at top)L0{ Mg:1 }

Read uppercase 'M' then lowercase 'g'. No digit follows, so count defaults to 1. Add Mg:1 to the top map.

Step 3: Read "(" (open paren)

formulaMg(OH)20123456stack (top at top)L1{}L0{ Mg:1 }

Opening parenthesis pushes a new empty map onto the stack. The stack now has 2 levels.

Step 4: Read "O" (element, count=1)

formulaMg(OH)20123456stack (top at top)L1{ O:1 }L0{ Mg:1 }

Read element 'O' with no digit after it. Add O:1 to the top map (level 2).

Step 5: Read "H" (element, count=1)

formulaMg(OH)20123456stack (top at top)L1{ O:1, H:1 }L0{ Mg:1 }

Read element 'H' with no digit. Add H:1 to the top map. Level 2 now has {O:1, H:1}.

Step 6: Read ")2" (close paren, multiplier=2)

formulaMg(OH)20123456stack (top at top)L0{ Mg:1, O:2, H:2 }

Pop top map {O:1, H:1}, multiply each count by 2, merge into level below. Stack: [{Mg:1, O:2, H:2}].

Step 7: Done. Sort and format.

formulaMg(OH)20123456stack (top at top)L0{ H:2, Mg:1, O:1 }

Sort keys alphabetically: H, Mg, O. Format: H2 + Mg + O = "H2MgO". Counts of 1 are omitted in output.

The code

def count_of_atoms(formula):
    stack = [{}]
    i = 0
    n = len(formula)

    while i < n:
        if formula[i] == "(":
            stack.append({})
            i += 1
        elif formula[i] == ")":
            i += 1
            j = i
            while i < n and formula[i].isdigit():
                i += 1
            mult = int(formula[j:i]) if j < i else 1
            top = stack.pop()
            for atom, count in top.items():
                stack[-1][atom] = stack[-1].get(atom, 0) + count * mult
        else:
            j = i
            i += 1
            while i < n and formula[i].islower():
                i += 1
            name = formula[j:i]
            j = i
            while i < n and formula[i].isdigit():
                i += 1
            count = int(formula[j:i]) if j < i else 1
            stack[-1][name] = stack[-1].get(name, 0) + count

    result = []
    for atom in sorted(stack[-1]):
        count = stack[-1][atom]
        result.append(atom + (str(count) if count > 1 else ""))
    return "".join(result)

Key details to notice:

  • The stack always has at least one map (the base level)
  • Element names start with an uppercase letter followed by zero or more lowercase letters
  • Numbers default to 1 when absent (both after elements and after closing parentheses)
  • After processing all characters, you sort the atoms alphabetically and format the output

The multiplier after ) applies to everything inside that group. When you pop the top map and merge it down, you multiply each count by the multiplier. This naturally handles nested parentheses because inner groups were already resolved when they were popped.

Complexity analysis

MetricValueReasoning
TimeO(n + k log k)O(n) to parse the formula where n is the string length. O(k log k) to sort the k distinct atoms.
SpaceO(n)The stack can hold up to O(n) maps in deeply nested formulas. Each map holds atom counts.

The building blocks

Stack for nested structures

Pushing on ( and popping on ) is the standard stack pattern for handling nested groupings. This same technique appears in:

Character-by-character parsing

Reading element names (uppercase + lowercase*) and numbers (digit*) is a common parsing primitive. You advance the index i while characters match the expected pattern, then extract the substring.

Edge cases to watch for

  • No parentheses. "H2O" parses directly into {"H": 2, "O": 1}.
  • Nested parentheses. "K4(ON(SO3)2)2" requires multiple levels of stack nesting.
  • No number after element. "Mg" means count is 1.
  • No number after ). "(OH)" means the multiplier is 1.
  • Single atom. "O" returns "O".
  • Large multipliers. "H1000" produces "H1000".

From understanding to recall

Number of Atoms combines string parsing with nested structure handling. The stack of hash maps is the key insight: each parenthesis level gets its own counting space, and closing a group merges counts upward with a multiplier. The parsing itself is methodical: read the type of character (letter, digit, paren) and handle accordingly.

Understanding the approach is one thing. Reproducing the exact parsing logic under pressure, knowing where to advance i, when to extract substrings, and how to handle default counts of 1, requires practice. Spaced repetition builds the muscle memory for these parsing details.

Related posts


Visual Learner? Explore how stack-based algorithms work with our Stack Interactive Visualization.