Number of Good Ways to Split a String: Prefix-Suffix Distinct Counting

May 31, 20265 min read

leetcodeproblemmediumstringshash-map

LeetCode 1525, Number of Good Ways to Split a String, asks you to count how many ways you can split a string s into two non-empty parts where both parts have the same number of distinct characters.

For example, given s = "aacaba", there are 2 good splits: "aac" | "aba" (2 distinct on each side) and "aaca" | "ba" (2 distinct on each side).

This problem is a clean example of the prefix-suffix precomputation pattern. Instead of recounting distinct characters from scratch at every split point, you precompute the answer for all prefixes and all suffixes in two linear passes. Then a single scan across split positions gives you the answer.

Splitting "aacaba" after index 2 gives left = "aac" (2 distinct) and right = "aba" (2 distinct). Counts match, so this is a good split.

Why this problem matters

The prefix-suffix pattern shows up in many string and array problems. You have already seen it in Product of Array Except Self where you build prefix products and suffix products. Here, the same idea applies to distinct character counts. Learning to spot when a problem decomposes into "what happened on the left" and "what happened on the right" is one of the most useful instincts you can develop.

The key question to ask yourself: "Can I precompute some property for every prefix and every suffix, then combine them at each split point?" If yes, you almost always get an O(n) solution.

The approach

A brute force solution would try every split position and count distinct characters on each side using a set. That gives you O(n^2) because for each of the n-1 split points, you scan up to n characters.

The insight is that you do not need to rebuild the set from scratch at each split. Instead, build two arrays:

left_distinct[i] = number of distinct characters in s[0..i] (inclusive)
right_distinct[i] = number of distinct characters in s[i..n-1] (inclusive)

You build left_distinct with a single left-to-right pass, tracking a running set. You build right_distinct with a single right-to-left pass. Then, for each split position i (splitting after index i, so left = s[0..i] and right = s[i+1..n-1]), check if left_distinct[i] == right_distinct[i+1].

This is the same prefix-suffix decomposition pattern used in Product of Array Except Self. There, you precompute prefix and suffix products. Here, you precompute prefix and suffix distinct counts. The structure is identical: two passes to precompute, one pass to combine.

The solution

def numSplits(s: str) -> int:
    n = len(s)
    left_distinct = [0] * n
    right_distinct = [0] * n

    seen = set()
    for i in range(n):
        seen.add(s[i])
        left_distinct[i] = len(seen)

    seen = set()
    for i in range(n - 1, -1, -1):
        seen.add(s[i])
        right_distinct[i] = len(seen)

    count = 0
    for i in range(n - 1):
        if left_distinct[i] == right_distinct[i + 1]:
            count += 1

    return count

Let's walk through the logic:

Build left_distinct by scanning left to right. At each index, add the character to a set and record the set's size. This gives the cumulative distinct count for the prefix ending at that index.
Build right_distinct by scanning right to left. Same idea, but now you are building the distinct count for the suffix starting at each index.
For every valid split position (after index i, where i goes from 0 to n-2), compare left_distinct[i] with right_distinct[i+1]. If they are equal, it is a good split.

The final loop runs from 0 to n-2 because both sides must be non-empty: the left side needs at least one character (index 0) and the right side needs at least one character (index n-1).

Practice this pattern

Visual walkthrough

Let's trace through s = "aacaba" at every split position and see which ones produce matching distinct counts.

Step 1: Split after index 0. Left = "a", Right = "acaba".

Left has 1 distinct character, right has 3. Not a match.

Step 2: Split after index 1. Left = "aa", Right = "caba".

Left still only has 1 distinct character (just a), right has 3. Not a match.

Step 3: Split after index 2. Left = "aac", Right = "aba".

Both sides have 2 distinct characters. This is a good split. Count = 1.

Step 4: Split after index 3. Left = "aaca", Right = "ba".

Both sides have 2 distinct characters again. Count = 2.

Step 5: Split after index 4. Left = "aacab", Right = "a".

Left has 3 distinct, right has 1. Not a match. Final answer: 2 good splits.

After checking all 5 possible split positions, we found 2 good splits (after index 2 and after index 3). Both give 2 distinct characters on each side.

Complexity analysis

Approach	Time	Space
Brute force (rebuild set each split)	O(n^2)	O(n)
Prefix-suffix distinct arrays	O(n)	O(n)

Time: O(n) where n is the length of the string. Three passes through the string: one left-to-right, one right-to-left, and one final scan of split positions. Each pass is O(n).

Space: O(n). Two arrays of length n store the prefix and suffix distinct counts. The sets used during construction hold at most 26 entries (for lowercase English letters), which is O(1), but the arrays themselves are O(n).

You can optimize space to O(1) by noting that left_distinct is monotonically non-decreasing and right_distinct is monotonically non-decreasing (when read right to left). This means you could use two pointers or binary search. But the O(n) space version is cleaner and easier to get right in an interview.

Practice this pattern

The building blocks

This problem decomposes into two reusable pieces that CodeBricks drills independently:

1. Prefix distinct count

The technique of scanning left to right while maintaining a running set, recording the size at each position. This gives you the number of distinct elements in every prefix of the array or string.

seen = set()
for i in range(n):
    seen.add(s[i])
    left_distinct[i] = len(seen)

This same building block appears in problems like First Unique Character in a String and Longest Substring Without Repeating Characters, where you need to track what you have seen so far from the left.

2. Suffix distinct count

The mirror image: scanning right to left with a running set. This gives you the distinct count for every suffix.

seen = set()
for i in range(n - 1, -1, -1):
    seen.add(s[i])
    right_distinct[i] = len(seen)

Pairing a prefix pass with a suffix pass is the core of the prefix-suffix decomposition pattern. Once you have both arrays, you can answer any "split the array/string at position i" question in O(1) per position.

Edge cases

Before submitting, verify your solution handles:

Two characters, same: s = "aa". Only one split: "a" | "a". Both have 1 distinct. Return 1.
Two characters, different: s = "ab". Only one split: "a" | "b". Both have 1 distinct. Return 1.
All identical characters: s = "aaaa". Every split gives 1 distinct on each side. Return 3 (there are n-1 = 3 split positions).
All unique characters: s = "abcde". The prefix distinct counts grow from 1 to 5. The suffix distinct counts shrink from 5 to 1. They cross in the middle. Check each position.
Length 2 (minimum): The smallest valid input. One split position, one comparison.

From understanding to recall

Reading through this solution, the prefix-suffix pattern feels obvious. But under time pressure, the details matter: remembering to use range(n - 1) (not range(n)) for the split check, remembering to compare left_distinct[i] with right_distinct[i + 1] (not right_distinct[i]), and remembering to reset the set between the two passes. These are the kinds of details that slip away without repetition. Spaced repetition locks them in by having you write the code from scratch at increasing intervals until the prefix-suffix decomposition is second nature.

Find All Anagrams in a String - Another string problem using character frequency tracking, with a sliding window instead of prefix-suffix arrays
Permutation in String - Fixed-size window frequency matching, a different angle on counting character properties across substrings
Longest Substring Without Repeating Characters - Uses a running set to track distinct characters, the same building block used in the prefix pass here

Practice this pattern

Why this problem matters

The approach

The solution

Visual walkthrough

Complexity analysis

The building blocks

1. Prefix distinct count

2. Suffix distinct count

Edge cases

From understanding to recall

Related posts