Skip to content
← All posts

Minimum Deletions to Make String Balanced: Counting B's as You Go

5 min read
leetcodeproblemmediumstringsdynamic-programmingstacks

LeetCode Minimum Deletions to Make String Balanced (Problem 1653) asks you to find the fewest characters to remove from a string of 'a's and 'b's so that no 'a' appears after a 'b'. In other words, the result should have all 'a's on the left and all 'b's on the right.

The problem

You are given a string s containing only the characters 'a' and 'b'. A string is balanced if there is no pair of indices (i, j) where i < j, s[i] == 'b', and s[j] == 'a'. You can delete any number of characters from s. Return the minimum number of deletions to make s balanced.

Examples:

  • "aababbab" returns 2. Delete the two 'a's that appear after 'b's to get "aabbbb".
  • "bbaaaaab" returns 2. Delete the two leading 'b's to get "aaaaab".
  • "aaaaaa" returns 0. Already balanced.
01234567aababbabdeletedeleteResult:aabbbball a's | all b's
For "aababbab", deleting the two highlighted 'a' characters at indices 3 and 6 produces "aabbbb", which is balanced. Minimum deletions = 2.

The brute force approach

The naive idea is to try every possible subset of characters to delete. For each subset, check whether the remaining string is balanced and track the smallest subset that works. Since each character can either be kept or deleted, this produces 2^n subsets. Checking each one for balance takes O(n) time, so the total is O(n * 2^n). That blows up fast and will not pass for strings up to 100,000 characters long.

A slightly better brute force is to try every possible "split point." Imagine drawing a line somewhere in the string. Everything to the left of the line should be 'a', and everything to the right should be 'b'. The cost at each split is: count of 'b's on the left plus count of 'a's on the right. You try all n + 1 split positions and take the minimum. This runs in O(n) per split and O(n) splits, giving O(n^2), which is better but still not ideal. With prefix sums you can bring it to O(n), but there is an even cleaner approach.

The key insight

You do not need to precompute prefix sums or try every split point explicitly. Instead, you can make the decision at each character as you scan left to right, maintaining just two integers:

  • b_count: how many 'b's you have seen so far
  • deletions: the minimum deletions needed to make the string balanced up to the current position

When you see a 'b', just increment b_count. A 'b' at the end of the processed portion never breaks the balance.

When you see an 'a', you have a choice. If there are 'b's before this 'a', the string is no longer balanced. You can either delete this 'a' (cost: deletions + 1) or conceptually delete all the 'b's you have seen (cost: b_count). You pick whichever is cheaper.

The transition deletions = min(deletions + 1, b_count) is the entire algorithm. It is the same "running minimum DP" pattern used in Flip String to Monotone Increasing. Each character either leaves the state unchanged or forces a one-line decision.

Step-by-step walkthrough

Let's trace through s = "aababbab" and watch b_count and deletions update at each position.

Init: b_count = 0, deletions = 0

01234567aababbabb_count = 0del = 0

We scan left to right. For each character, we decide whether the running cost should change.

Step 1 (i=0): 'a' with b_count = 0. No b's seen yet, so this 'a' is fine.

01234567aababbabb_count = 0del = 0

No b's before this 'a', nothing to delete. deletions stays 0.

Step 2 (i=1): 'a' with b_count = 0. Still no b's before it.

01234567aababbabb_count = 0del = 0

Another 'a' with no preceding 'b'. Still balanced so far.

Step 3 (i=2): 'b'. Increment b_count to 1.

01234567aababbabb_count = 1del = 0

A 'b' never causes a problem by itself. We just record that we have seen one.

Step 4 (i=3): 'a' with b_count = 1. This 'a' comes after a 'b'!

01234567aababbabb_count = 1del = 1

Option A: delete this 'a' (deletions + 1 = 1). Option B: delete all preceding b's (b_count = 1). min(1, 1) = 1.

Step 5 (i=4): 'b'. Increment b_count to 2.

01234567aababbabb_count = 2del = 1

Another 'b' seen. b_count goes to 2. deletions unchanged.

Step 6 (i=5): 'b'. Increment b_count to 3.

01234567aababbabb_count = 3del = 1

b_count = 3. Still no new conflict from a 'b'.

Step 7 (i=6): 'a' with b_count = 3. Another 'a' after b's.

01234567aababbabb_count = 3del = 2

Option A: delete this 'a' (1 + 1 = 2). Option B: delete all b's (b_count = 3). min(2, 3) = 2.

Step 8 (i=7): 'b'. Increment b_count to 4. Done!

01234567aababbabb_count = 4del = 2

Final 'b' is fine. Answer: deletions = 2.

After processing all eight characters, deletions = 2. That is the answer.

The code

def minimumDeletions(s: str) -> int:
    b_count = 0
    deletions = 0
    for ch in s:
        if ch == 'b':
            b_count += 1
        else:
            deletions = min(deletions + 1, b_count)
    return deletions

Two variables, one loop, and a single min call per 'a'. Here is what each piece does:

  1. b_count tracks how many 'b's you have seen so far. This represents the cost of "delete all preceding b's" if you ever need it.
  2. deletions is the running answer: the minimum deletions to make everything up to the current index balanced.
  3. When you encounter a 'b', it cannot create a violation (a 'b' followed by more 'b's is fine). You just record it.
  4. When you encounter an 'a', you compare two options: absorb one more deletion for this 'a' (deletions + 1), or reset by deleting all previous 'b's (b_count). Take the cheaper path.

This is the same structure as LeetCode 926 (Flip String to Monotone Increasing), just phrased with 'a'/'b' instead of '0'/'1'. If you have solved one, the other is almost free.

Complexity analysis

ApproachTimeSpace
Brute force (all subsets)O(n * 2^n)O(n)
Split point with prefix sumsO(n)O(n)
DP / countingO(n)O(1)

The counting approach is optimal. You must examine every character at least once, so O(n) time is the lower bound. And O(1) space means no auxiliary arrays; just two integers.

The building blocks

1. Prefix counting

The pattern of maintaining a running count as you scan through a sequence:

count = 0
for item in sequence:
    if condition(item):
        count += 1
    use_count_in_decision()

Here, b_count is the prefix count. The same idea shows up in problems like counting inversions, tracking frequencies, and computing prefix sums. The key property is that you accumulate information left-to-right and use it to make O(1) decisions at each step.

2. Running-state DP

The pattern of maintaining a DP value that updates in constant time:

dp = 0
for item in sequence:
    dp = transition(dp, item, auxiliary)
return dp

In this problem, deletions is the DP state and the transition is min(deletions + 1, b_count) when you see an 'a'. This same skeleton powers Kadane's algorithm (Maximum Subarray), Best Time to Buy and Sell Stock, and Flip String to Monotone Increasing. Recognizing this pattern lets you reduce seemingly complex problems to a single pass with constant space.

Edge cases

  • Already balanced ("aaabbb"): no 'a' ever appears after a 'b', so deletions stays at 0 the entire scan.
  • All one character ("aaaa" or "bbbb"): trivially balanced. The loop runs but never triggers a deletion.
  • Completely reversed ("bbbaaa"): every 'a' comes after every 'b'. The answer is min(count_a, count_b), which the running state naturally converges to.
  • Single character ("a" or "b"): the loop runs once and returns 0. No special-case code needed.

From understanding to recall

The solution is just two variables and a min call inside a loop. It fits in five lines of Python. But under interview pressure, the details matter. Is it deletions + 1 or deletions - 1? Do you increment b_count for 'a' or 'b'? Do you update deletions for every character or only for 'a'?

Those are the kinds of small mistakes that cost you in a timed setting. Reading the solution and understanding why it works is recognition. Writing it correctly from scratch in two minutes is recall. Spaced repetition bridges the gap: you reconstruct the loop from memory at increasing intervals until the transition formula and the variable roles are automatic.

Related posts