Skip to content
← All posts

Split a String Into the Max Number of Unique Substrings: Backtracking with a Set

6 min read
leetcodeproblemmediumstringsbacktracking

You are given a string s and need to split it into as many substrings as possible, with one constraint: every substring in the split must be unique. No two pieces can be the same. This is LeetCode 1593: Split a String Into the Max Number of Unique Substrings.

The problem

Given a string s, split it into the maximum number of non-empty substrings such that all the substrings are unique. You can split the string at any position, but every piece must be different from every other piece. Return the maximum number of unique substrings you can achieve.

Examples:

  • "ababccc" returns 5. One valid split: ["a", "b", "ab", "c", "cc"].
  • "aba" returns 2. Splitting into ["a", "b", "a"] is invalid because "a" appears twice. One valid split: ["a", "ba"].
  • "aa" returns 1. You cannot split into ["a", "a"] since the substrings are not unique. The best you can do is ["aa"].
s = "abab"abab5:"a""b""a"dup!4:"a""ba""b"
For s = "abab", splitting into ["a", "b", "a", "b"] fails because "a" and "b" each appear twice. Splitting into ["a", "ba", "b"] produces 3 unique substrings, which is the maximum.

The brute force approach

A naive approach would generate every possible way to partition the string and then check each partition for uniqueness. A string of length n has 2^(n-1) possible partitions (you can place or skip a divider between each pair of adjacent characters). For each partition you would check whether all pieces are distinct. This works, but it considers many partitions that are clearly invalid. You might place a divider that creates a duplicate substring early on, then waste time exploring every possible arrangement of the remaining characters.

The key insight

You do not need to generate all partitions and filter. Instead, you can build partitions one piece at a time using backtracking. At each position, try every possible next substring. Before adding it, check whether it already exists in a set of substrings you have used so far. If it does, skip it. If it does not, add it to the set, recurse on the remainder of the string, then remove it from the set when you backtrack.

This prunes the search tree early. The moment a substring is already in the set, you skip that entire branch without exploring any of its descendants.

The set of used substrings acts as both your constraint checker and your pruning mechanism. If a prefix at position i is already in the set, you do not recurse. This saves you from exploring all downstream partitions of the remaining string.

Step-by-step walkthrough

Let's trace the algorithm on s = "abab". At each step, you pick a prefix, check the set, and either recurse or skip.

Step 1: Try prefix "a" from index 0

abab"a"

Set: {"a"}

"a" is not in the set. Add it and recurse from index 1.

Step 2: Try prefix "b" from index 1

abab"a""b"

Set: {"a", "b"}

"b" is not in the set. Add it and recurse from index 2.

Step 3: Try prefix "a" from index 2. Duplicate found.

abab"a""b"

Set: {"a", "b"}

"a" is already in the set. Skip this split. Try "ab" instead.

Step 4: Try prefix "ab" from index 2

abab"a""b""ab"

Set: {"a", "b", "ab"}

"ab" is not in the set. Add it. Index reaches end. Record count = 3.

Step 5: Backtrack. Remove "b". Try "ba" from index 1.

abab"a""ba"

Set: {"a", "ba"}

"ba" is not in the set. Add it and recurse from index 3.

Step 6: Try prefix "b" from index 3

abab"a""ba""b"

Set: {"a", "ba", "b"}

"b" is not in the set. Add it. Index reaches end. Record count = 3. Max stays at 3.

Result: Maximum unique substrings = 3

abab"a""b""ab"

Two splits reach count 3: ["a", "b", "ab"] and ["a", "ba", "b"]. Both are valid answers.

The code

def maxUniqueSplit(s: str) -> int:
    n = len(s)
    best = [1]
    seen = set()

    def backtrack(start: int):
        if start == n:
            best[0] = max(best[0], len(seen))
            return
        if len(seen) + (n - start) <= best[0]:
            return
        for end in range(start + 1, n + 1):
            sub = s[start:end]
            if sub not in seen:
                seen.add(sub)
                backtrack(end)
                seen.remove(sub)

    backtrack(0)
    return best[0]

The function starts at index 0 with an empty set. At each call to backtrack(start), the loop tries every substring from s[start:end] where end ranges from start + 1 to n. If the substring is not in seen, it gets added and the function recurses from end. After returning, the substring is removed from seen (the "unchoose" step).

The base case fires when start == n, meaning the entire string has been consumed. At that point, the size of seen is a valid answer, so you update best if it is larger.

The pruning line if len(seen) + (n - start) <= best[0] is an optimization. Even if you split every remaining character into its own single-character substring, you could add at most n - start more pieces. If that cannot beat the current best, you stop early. This upper-bound pruning dramatically reduces the number of branches explored in practice.

The pruning check len(seen) + (n - start) <= best[0] is what makes this solution practical. Without it, the worst case is still exponential, but with it you cut off most branches early. This is a common technique in backtracking problems where you are maximizing or minimizing a count.

Complexity analysis

ApproachTimeSpace
Brute force (all partitions)O(n * 2^n)O(n)
Backtracking with pruningO(2^n) worst caseO(n)

Time: In the worst case, backtracking explores O(2^n) branches because each of the n-1 gaps between characters can either be a split point or not. However, the set-based duplicate check and the upper-bound pruning cut off many branches in practice. For typical inputs (strings with repeated characters), the actual number of branches explored is far smaller.

Space: The recursion stack goes at most n levels deep (one level per character). The set holds at most n substrings. Each substring can be up to length n, so the total space for the set contents is O(n^2) in the worst case. The dominant practical cost is O(n) for the stack depth.

The building blocks

Substring partitioning backtracking

The recursive structure here is the same choose-explore-unchoose skeleton used in Palindrome Partitioning, Word Break, and similar string splitting problems. At each position, you try every possible prefix, check a condition, and recurse on the remainder:

def backtrack(start, path):
    if start == n:
        record(path)
        return
    for end in range(start + 1, n + 1):
        piece = s[start:end]
        if valid(piece):
            choose(piece)
            backtrack(end)
            unchoose(piece)

What changes between problems is the valid check and the choose/unchoose logic. In Palindrome Partitioning, valid checks for palindromes. In this problem, valid checks set membership. The skeleton is identical.

Set-based constraint tracking

Using a set to track which substrings have been used is the same pattern you see in Subsets, permutations, and combination problems. The set replaces the "visited" array or "used" bitmask from other backtracking problems. You add an element before recursing and remove it after. This ensures that when you explore different branches of the recursion tree, the set reflects only the choices made along the current path.

Edge cases

Single character: "a" returns 1. There is only one possible split: the string itself.

All identical characters: "aaaa" returns 1 if you are forced into ["aaaa"], but actually you can split into ["a", "aa", "aaa"] ... wait, that is only 3 characters. For "aaaa", try ["a", "aa", "a"], but "a" repeats. The best is ["a", "aaa"] or ["a", "aa", "a"] (invalid). So the answer is 2: ["a", "aaa"]. The more identical characters you have, the fewer unique substrings you can create because each new piece must be a different run length.

All distinct characters: "abcdef" returns 6. Every single character is unique, so splitting each character individually gives the maximum. When all characters are different, single-character splits always win.

Length 1: The smallest input is a single character. The answer is always 1.

From understanding to recall

You can see how the backtracking search works. You understand the role of the set, the pruning optimization, and the choose-explore-unchoose pattern. But in an interview, can you write the backtrack function, the set operations, and the pruning check without hesitation?

The pieces here are not complicated individually. The substring partitioning skeleton is the same one from Palindrome Partitioning and Word Break. The set-based tracking is the same as what you do in Subsets and permutations. The upper-bound pruning is a one-line optimization. But combining all three correctly under pressure takes practice.

This is exactly the kind of problem where spaced repetition shines. You do not need to memorize the solution. You need to have drilled each building block enough times that the combination is automatic. When you see "split a string" and "unique pieces," the backtracking skeleton and the set constraint should surface immediately, without you having to reconstruct them from scratch.

Related posts

  • Palindrome Partitioning - Same substring partitioning backtracking skeleton, with palindrome checking as the constraint instead of set uniqueness.
  • Word Break - Another string splitting problem where you try every prefix and check a condition (dictionary membership) before recursing.
  • Subsets - The foundational backtracking exploration pattern. Compare the include/skip decision tree to the prefix partitioning pattern here.