Roman to Integer: Left-to-Right with Subtraction Rule
Roman to Integer is LeetCode 13. Given a string representing a roman numeral, convert it to an integer. The seven symbols are: I=1, V=5, X=10, L=50, C=100, D=500, M=1000.
Most characters simply add their value to the total. The only twist is the subtraction rule: when a smaller value appears immediately before a larger one, you subtract the smaller value instead of adding it. The six subtraction cases are IV=4, IX=9, XL=40, XC=90, CD=400, and CM=900.
For example, "MCMXCIV" breaks down as M (1000) + CM (900) + XC (90) + IV (4) = 1994.
The approach
Scan the string from left to right, one character at a time. At each position, look up the current character's value and compare it to the next character's value:
- If the current value is less than the next value, subtract the current value from the running total. This handles subtractive pairs like CM, XC, and IV.
- Otherwise, add the current value to the running total.
That is the entire algorithm. One pass through the string with a single lookahead comparison at each step.
Why does this work? In a subtractive pair like CM, the C appears before M. When you see C and peek at the next character M, you know C (100) is less than M (1000), so you subtract 100. On the next step, you add 1000. The net effect is +900, which is exactly what CM represents.
Python solution
def roman_to_int(s):
values = {"I": 1, "V": 5, "X": 10, "L": 50,
"C": 100, "D": 500, "M": 1000}
total = 0
for i in range(len(s)):
if i + 1 < len(s) and values[s[i]] < values[s[i + 1]]:
total -= values[s[i]]
else:
total += values[s[i]]
return total
Visual walkthrough
Here is the left-to-right scan traced step by step on the input "MCMXCIV". Each step shows which character is current, which is next, and whether the algorithm adds or subtracts.
Step 1: index 0, char "M"
M (1000) is not less than C (100), so add 1000
Step 2: index 1, char "C"
C (100) is less than M (1000), so subtract 100
Step 3: index 2, char "M"
M (1000) is not less than X (10), so add 1000
Step 4: index 3, char "X"
X (10) is less than C (100), so subtract 10
Step 5: index 4, char "C"
C (100) is not less than I (1), so add 100
Step 6: index 5, char "I"
I (1) is less than V (5), so subtract 1
Step 7: index 6, char "V"
V (5) is the last character, so add 5
Done. All characters processed.
Notice the pattern: positions 1 (C before M), 3 (X before C), and 5 (I before V) all trigger subtraction because the current value is less than the next value. Every other position triggers addition.
Complexity analysis
| Metric | Value | Why |
|---|---|---|
| Time | O(n) | Single pass through the string, constant work per character. |
| Space | O(1) | The value lookup map has exactly 7 entries, which is constant. Only a few integer variables are used. |
The algorithm processes each character exactly once with an O(1) lookup, so the total time is linear in the length of the input string.
Building blocks
Roman to Integer is built from one fundamental brick that shows up across many problems involving sequential decision-making.
Lookahead comparison
The core technique is comparing the current element to the next element to decide what action to take. At each index i, you peek at index i + 1 and branch based on the comparison.
This same lookahead pattern appears in:
- Integer to Roman (LeetCode 12) uses a greedy approach that repeatedly subtracts the largest possible value, which is the inverse of this problem.
- Best Time to Buy and Sell Stock (LeetCode 121) compares today's price to the running minimum to decide whether to update the best profit.
- Longest Common Prefix (LeetCode 14) compares characters at the same position across strings, stopping at the first mismatch.
The lookahead comparison is a tiny building block: look at the current element, peek at the next, and decide add vs. subtract. If you can write this loop from scratch, you unlock Roman to Integer and internalize a pattern that transfers to any problem where the meaning of an element depends on its neighbor.
Edge cases
Single character. "V" is just 5. The loop runs once, there is no next character, so it adds the value. This works because the i + 1 < len(s) check fails, falling through to the add branch.
All additive. "VIII" is 5 + 1 + 1 + 1 = 8. No character is smaller than its successor, so every step adds.
All subtractive pairs. "MCMXCIX" has M (add 1000), then CM (subtract 100, add 1000), then XC (subtract 10, add 100), then IX (subtract 1, add 10). Every pair uses the subtraction rule.
Maximum value. "MMMCMXCIX" is 3999, the largest number representable in standard roman numerals. The algorithm handles it with no special case.
Minimum value. "I" is 1. Single character, single addition.
From understanding to recall
The subtraction rule is easy to understand once you see the pattern: smaller before larger means subtract. But knowing the rule and being able to write the solution from scratch under time pressure are two different things.
The entire solution is a for loop with one if/else inside it. The condition checks whether the current value is less than the next value. That is it. If you drill writing this loop a few times with spaced repetition, it becomes automatic. You will not need to re-derive the logic during an interview. You will just write it.
This problem also reinforces a transferable habit: when the meaning of an element depends on what comes after it, peek ahead. That one-line lookahead check is a brick you will reach for again and again.
Related posts
- Integer to Roman - The inverse problem: convert an integer to a roman numeral string using a greedy subtraction approach
- Longest Common Prefix - Another string scanning problem that uses character-by-character comparison with early termination