Longest Nice Substring: Divide and Conquer Pattern
A string is "nice" if, for every letter it contains, both the uppercase and lowercase versions of that letter also appear in the string. Given a string s, return the longest substring of s that is nice. If there are multiple answers of the same length, return the one that occurs first. If there are no nice substrings, return an empty string.
This is LeetCode 1763: Longest Nice Substring, and it is an excellent introduction to the divide-and-conquer pattern applied to string problems.
Why this problem matters
Most "longest substring" problems use sliding window. This one does not. It uses divide and conquer, which makes it a valuable pattern-breaker in your problem-solving toolkit. You learn to identify when a character cannot belong to any valid answer, use that character as a natural split point, and recursively solve smaller subproblems. That same reasoning shows up in problems like sorting (quicksort partitions around a pivot), segment trees, and other recursive string decompositions.
The problem also exercises set-based character analysis. You need to efficiently determine which characters have both their upper and lower case forms present. This combines well with bit manipulation or simple set operations.
The key insight
If a character in the string does not have its counterpart (its upper or lower case version), then that character cannot be part of any nice substring. It acts as a wall. No nice substring can cross it.
This gives you a natural divide-and-conquer strategy: scan the string for the first "bad" character (one missing its counterpart), split the string at that position, and recursively find the longest nice substring in each half. Then return whichever half produced the longer result.
The base case is when every character in the substring has its counterpart present. In that case, the entire substring is nice, and you return it. The other base case is when the string has length less than 2, which cannot be nice.
The solution
def longest_nice_substring(s: str) -> str:
if len(s) < 2:
return ""
char_set = set(s)
for i, ch in enumerate(s):
if ch.swapcase() not in char_set:
left = longest_nice_substring(s[:i])
right = longest_nice_substring(s[i + 1:])
return left if len(left) >= len(right) else right
return s
Let's walk through what each piece does.
The base case checks if the string has fewer than 2 characters. A single character can never be nice because it cannot contain both its upper and lower case forms. An empty string is trivially not nice either.
Next, you build a set of all characters in the current substring. This lets you check in O(1) whether a character's counterpart exists.
The loop scans through the string looking for the first "bad" character, one whose swapcase() result is not in the set. When you find it, you split the string into everything before that character and everything after it. You recursively solve both halves and return whichever result is longer. The >= comparison ensures that when both halves tie in length, you return the leftmost one (the one that occurs first).
If the loop completes without finding any bad character, every character has its counterpart. The entire string is nice, so you return it as-is.
The swapcase() method converts a lowercase letter to uppercase and vice versa. Using ch.swapcase() not in char_set is a clean way to check if a character is missing its counterpart without writing separate upper/lower case logic.
Visual walkthrough
Let's trace through the divide-and-conquer process on the string "YazaAay" step by step.
Step 1: Check "YazaAay". Find a character without its counterpart.
"Y" (index 0) has no lowercase "y" at index 6 in the set check... wait, "y" is at index 6. But "z" has no uppercase "Z". Split at "z" (index 2).
Step 2: Recurse on left half "Ya" (indices 0-1).
"Y" has no "y", and "a" has no "A". No substring of length >= 2 is nice. Left half returns "".
Step 3: Recurse on right half "aAay" (indices 3-6). Find a bad character.
"y" has no "Y" in this substring. Split at "y" (index 3 of this substring).
Step 4: Recurse on "aAa". Check every character.
"a" has "A", and "A" has "a". Every character has both cases. The entire substring is nice!
Step 5: Recurse on "y". Single character, not nice. Return "".
A single character can never be nice. Base case returns empty string.
Step 6: Compare results. Left = "", Right "aAa" vs "". Return "aAa".
The longest nice substring found across all recursive calls is "aAa" with length 3. Final answer: "aAa".
Notice how each split removes a problematic character and reduces the problem size. The recursion bottoms out either when the entire substring is nice or when it is too short to be nice. The longest nice substring found across all branches is the final answer.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Divide and conquer | O(n * 26) | O(n) |
Time is O(n * 26) in the worst case. At each level of recursion, you do O(n) work scanning the string and building the character set. Each split removes at least one distinct character from consideration (the "bad" character that caused the split). Since there are at most 26 distinct letters (52 if you count both cases, but each split eliminates a letter pair), the recursion depth is bounded by 26. So the total work is O(n) per level times at most 26 levels.
Space is O(n) for the recursion stack and the substring copies. Each recursive call creates substrings, and the maximum depth of recursion is 26. In Python, string slicing creates copies, so each level uses space proportional to the substring length.
The building blocks
1. Character set analysis
The pattern of checking whether both cases exist for every character in a string:
char_set = set(s)
for ch in s:
if ch.swapcase() not in char_set:
return False
return True
This is the core validity check. You build a set once, then verify each character in O(1). This same technique appears in any problem that asks you to validate character coverage, like checking if a string contains all vowels, if a pangram is complete, or if a ransom note can be constructed from a magazine.
2. Divide and conquer splitting
The pattern of finding a disqualifying element and recursing on both sides:
for i, ch in enumerate(s):
if is_bad(ch):
left = solve(s[:i])
right = solve(s[i + 1:])
return best(left, right)
return s
This is the same structure used in quicksort (partition around a pivot, recurse on both sides) and in problems like "Different Ways to Add Parentheses" (LeetCode 241), where you split an expression at each operator. The key idea is that you find a natural boundary, something that cannot be part of the answer, and use it to decompose the problem.
Edge cases
Before submitting, think through these scenarios:
- Single character:
"a"returns"". One character can never be nice. - All same case:
"aaa"returns"". No uppercase counterpart for any character. - Already nice:
"aAbB"returns"aAbB". Every character has its counterpart, so the full string is the answer. - Empty string: returns
"". Nothing to check. - Multiple nice substrings of equal length:
"aAbBcCxXdD"is entirely nice and returns the full string. But"aAxbBb"would split at"x"and return the longer nice substring. - Nested nice substrings: the recursion naturally finds the longest one because it always compares left and right results by length.
From understanding to recall
You now know how to spot the divide-and-conquer opportunity: if a character cannot belong to any valid answer, use it as a split point. The code is short and the logic is clean. But under interview pressure, the details matter. Do you check swapcase or manually toggle case? Do you return left or right when they tie? Do you split at the first bad character or all of them?
Spaced repetition closes these gaps. You practice writing the character set check, the recursive split, and the base case from memory at increasing intervals. After a few rounds, you see "find the longest substring satisfying some property" and your mind immediately evaluates whether sliding window or divide and conquer is the right tool. For this problem, the answer is divide and conquer, and the template flows out automatically.
Related posts
- Longest Substring Without Repeating Characters - Another "longest substring" problem that uses a sliding window instead of divide and conquer
- Reverse String - Foundational string manipulation
- Valid Palindrome - Case-sensitive string analysis
CodeBricks breaks Longest Nice Substring into its character set analysis and divide-and-conquer splitting building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a divide-and-conquer string problem shows up in your interview, you do not think about it. You just write it.