Skip to content
← All posts

Maximum Number of Removable Characters: Binary Search on the Answer

5 min read
leetcodeproblemmediumstringsbinary-search

Some problems look like they need brute force, but the answer space has a hidden structure that makes binary search possible. Maximum Number of Removable Characters is a perfect example. You are not searching through an array for a target. You are searching for the largest number of characters you can remove from a string before a subsequence relationship breaks.

The problem

LeetCode 1898: Maximum Number of Removable Characters. You are given a string s, a string p that is a subsequence of s, and an array removable containing distinct indices of s. You need to find the maximum number k such that p is still a subsequence of s after removing the characters at the first k indices listed in removable.

In other words, you remove the characters at removable[0], removable[1], ..., removable[k-1] from s. After those removals, p must still appear as a subsequence of the remaining characters.

s (k=2, removed indices 3,1)a0×1c2×3c4b5p (subsequence to match)abp is still a subsequence after removing 2 characters
s = "abcacb", p = "ab", removable = [3, 1, 0]. After removing the first k=2 indices (3 and 1), p is still a subsequence. Removing k=3 would also remove index 0, making "a" unavailable.

The key insight: monotonic feasibility

If p is a subsequence of s after removing k characters, then p is also a subsequence after removing fewer than k characters. Removing fewer characters only adds characters back to s, which can never break a subsequence that was already present. On the other hand, if p is not a subsequence after removing k characters, it will not be a subsequence after removing k + 1 either, because you are removing everything from the k case plus one more.

This gives you a monotonic property. There is some boundary value where everything at or below it is feasible, and everything above it is not. That is exactly the structure binary search needs.

Instead of trying every possible k from 0 to len(removable), you binary search for the boundary. At each step, you pick a candidate k, mark those indices as removed, and check whether p is still a subsequence of the remaining characters. If yes, try a larger k. If no, try a smaller one.

Visual walkthrough

Step 1: lo=0, hi=3, mid=2. Remove first 2 indices from removable (indices 3, 1). Is "ab" a subsequence of the remaining string?

k=0k=1k=2k=3lomidhis:a×c×cbp:ab

Yes. "a" matches at position 0, "b" matches at position 5. p is still a subsequence. Try more removals: set lo = mid = 2.

Step 2: lo=2, hi=3, mid=3. Remove first 3 indices from removable (indices 3, 1, 0). Is "ab" a subsequence?

k=0k=1k=2k=3lomidhis:××c×cbp:ab

No. After removing index 0, there is no "a" left in s. Too many removals. Set hi = mid - 1 = 2.

Step 3: lo=2, hi=2. Search complete. The maximum k is 2.

k=0k=1k=2k=3lomidhis:a×c×cbp:ab

lo equals hi. We can remove at most 2 characters and p remains a subsequence of s.

The code

def maximum_removals(s, p, removable):
    lo, hi = 0, len(removable)

    while lo < hi:
        mid = lo + (hi - lo + 1) // 2
        removed = set(removable[:mid])
        filtered = [c for i, c in enumerate(s) if i not in removed]

        j = 0
        for c in filtered:
            if j < len(p) and c == p[j]:
                j += 1

        if j == len(p):
            lo = mid
        else:
            hi = mid - 1

    return lo

The outer loop is a standard binary search for the maximum feasible value. We use the upper-mid formula lo + (hi - lo + 1) // 2 to avoid infinite loops when lo and hi differ by exactly 1. When the subsequence check passes, we try a bigger k by setting lo = mid. When it fails, we reduce the search space with hi = mid - 1.

Inside each iteration, we build a set of the first mid indices from removable, then construct the filtered version of s by skipping those positions. The subsequence check is a simple greedy scan: walk through the filtered characters with a pointer j into p, advancing j whenever we match. If j reaches len(p), every character of p was found in order.

The reason we build a set rather than modifying s in place is efficiency. Checking membership in a set is O(1), so filtering s takes O(n) time. Rebuilding from scratch at each binary search step is cleaner than trying to incrementally add or undo removals.

Complexity analysis

ComplexityWhy
TimeO(n * log m)Binary search runs O(log m) iterations where m = len(removable). Each iteration builds a set in O(m) and scans s in O(n). Since m is at most n, this simplifies to O(n log n).
SpaceO(n)The set of removed indices and the filtered string each use at most O(n) space.

Where n is the length of s and m is the length of removable. The brute force approach of trying every k from 0 to m would be O(n * m), which is O(n^2) in the worst case. Binary search brings the number of iterations down from m to log m.

The building blocks

  1. Binary search on the answer. Instead of searching a sorted array for a value, you search a range of candidate answers [0, len(removable)] for the largest one satisfying a condition. The same template appears in Koko Eating Bananas, Split Array Largest Sum, and many other problems.

  2. Greedy subsequence check. Given two strings, verify whether one is a subsequence of the other by scanning left to right with two pointers. This is the same technique from Is Subsequence. You advance the pointer on the shorter string only when you find a matching character.

  3. Set-based filtering. Using a hash set to mark removed positions gives O(1) lookups, making the filtering step linear instead of quadratic.

Edge cases

  • removable is empty. The answer is 0, and the binary search never enters the loop because lo and hi are both 0.
  • p is a single character. The subsequence check becomes a simple containment test. You can remove characters until the last copy of that character in s is gone.
  • All of s can be removed. If p is empty, it is a subsequence of any string, so the answer is len(removable).
  • p equals s. Every character matters. You can only remove characters whose positions are not needed for the subsequence match, which might be 0.
  • removable covers every index of s. The answer depends on how many removals p can survive before losing a required character.

From understanding to recall

The logic here is two separate skills stitched together: binary search on the answer and the greedy subsequence check. Both are patterns you will see in many other problems. The tricky part is recognizing that "maximum k" has a monotonic structure, which is the signal to reach for binary search instead of brute force.

Try writing the solution from scratch without looking at this page. The common mistakes are using the wrong mid formula (lower-mid instead of upper-mid causes an infinite loop here), forgetting to rebuild the removed set at each step, and getting the subsequence check pointer logic backwards. These are exactly the details that slip under interview pressure.

Spaced repetition locks in both the recognition ("this is binary search on the answer") and the implementation details ("use upper-mid for max-feasible, rebuild the set each time"). After a few reps, you write the solution in under five minutes without second-guessing the boundary conditions.

Related posts