Sum of Beauty of All Substrings
You are given a string s consisting of lowercase English letters. The beauty of a string is defined as the difference between the maximum frequency and the minimum frequency of any character in that string. Your task is to return the sum of beauty of all substrings of s.
This is LeetCode 1781: Sum of Beauty of All Substrings, a medium problem that tests whether you can enumerate substrings efficiently while maintaining character frequency counts incrementally. The naive approach recomputes frequencies from scratch for every substring. The key insight is that extending a substring by one character only requires a single frequency update.
Why this problem matters
Frequency counting on strings is one of the most common building blocks in coding interviews. Problems like Group Anagrams, Find All Anagrams in a String, and Minimum Window Substring all rely on maintaining and querying a frequency array. This problem combines that skill with substring enumeration, which is another fundamental technique.
The real test here is not whether you can compute beauty for a single string. That part is easy. The challenge is doing it for all O(n^2) substrings without blowing up the time complexity. If you recompute frequencies from scratch for every substring, you pay O(n^3) total work. By updating the frequency array incrementally as you extend each substring, you bring it down to O(n^2).
The key insight
Fix a starting index i. As you slide the ending index j from i to n - 1, each new character s[j] extends the current substring by exactly one character. You only need to increment one entry in your frequency array. Then you scan the 26-letter frequency array to find the current max and min (among characters that appear at least once). Since the frequency array has a fixed size of 26, that scan is O(1).
This means each substring costs O(1) to process after the previous one. The total number of substrings is O(n^2), so the overall time is O(n^2 * 26), which simplifies to O(n^2).
The solution
def beautySum(s: str) -> int:
total = 0
n = len(s)
for i in range(n):
freq = [0] * 26
for j in range(i, n):
freq[ord(s[j]) - ord('a')] += 1
max_freq = max(freq)
min_freq = min(f for f in freq if f > 0)
total += max_freq - min_freq
return total
Here is what each piece does:
- The outer loop picks the starting index
ifor each family of substrings. - A fresh frequency array
freqof size 26 is created for each starting index, since a new family of substrings begins. - The inner loop extends the substring one character at a time. Each iteration increments the count for
s[j]in the frequency array. max(freq)finds the highest frequency across all 26 slots. Since most slots are zero, this is dominated by the active characters.min(f for f in freq if f > 0)finds the lowest frequency among characters that actually appear. You must filter out zeros because characters not in the substring should not count as having frequency zero.- The difference (max minus min) is the beauty of the current substring. It gets added to the running total.
The frequency array always has exactly 26 entries, so scanning it for max and min is constant time. This is why the overall complexity stays at O(n^2) even though you scan the array for every substring. If the alphabet were unbounded, you would need a different approach to track the max and min efficiently.
Visual walkthrough
Let's trace through a small example with s = "aab". There are six substrings total: three single characters, two length-2 substrings, and one length-3 substring. Single characters always have beauty 0, so the interesting work happens on the longer substrings.
Substring "aab" (i=0, j=2): extend from "aa" by adding 'b'
Frequencies: a=2, b=1. Max is 2, min is 1. Beauty = 2 - 1 = 1. Total so far: 0 + 1 = 1.
Substring "aa" (i=0, j=1): two characters, same letter
Only one distinct character (a=2). Max and min are both 2. Beauty = 0. Total stays at 1.
Substring "ab" (i=1, j=2): new starting index
Frequencies: a=1, b=1. Both characters appear once. Beauty = 1 - 1 = 0. Total stays at 1.
Single-character substrings: beauty is always 0
Substrings "a", "a", "b" each have one character. Max = min = 1. Beauty = 0. Final answer for "aab" = 1.
The final answer for "aab" is 1. Only the substring "aab" contributes a nonzero beauty. Every other substring has either one distinct character or characters with equal frequencies.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Nested loop with frequency array | O(n^2 * 26) = O(n^2) | O(26) = O(1) |
Time: The outer loop runs n times. For each starting index, the inner loop runs up to n times. Inside the inner loop, you do one frequency increment (O(1)) and one scan of the 26-element array (O(26) = O(1)). Total: O(n^2 * 26), which simplifies to O(n^2).
Space: The only extra storage is the 26-element frequency array, which is constant regardless of input size. So the space complexity is O(1).
The building blocks
1. Incremental frequency counting
freq = [0] * 26
for j in range(i, n):
freq[ord(s[j]) - ord('a')] += 1
Instead of rebuilding the frequency array from scratch for every substring, you extend the previous one by incrementing a single counter. This pattern appears everywhere: sliding window problems, anagram detection, and any time you enumerate substrings or subarrays while tracking element counts. The core idea is always the same. When you extend a range by one element, update the data structure by that one element rather than reprocessing the entire range.
2. Min and max over a fixed-size frequency array
max_freq = max(freq)
min_freq = min(f for f in freq if f > 0)
Computing the beauty requires finding the maximum and minimum nonzero frequencies. Because the alphabet is fixed at 26 characters, this scan is O(1). The filter if f > 0 is critical. Without it, you would always get a minimum of 0 for any character not in the substring. This small detail is a common source of bugs when solving this problem.
Edge cases
- All identical characters like
"aaaa": every substring has only one distinct character, so max equals min for every substring. The answer is 0. - All distinct characters like
"abcd": single-character substrings contribute 0. Longer substrings have at least two characters, and the beauty depends on whether frequencies diverge. For length-2 substrings, all characters appear once, so beauty is still 0. Longer substrings start producing nonzero values. - Two-character string like
"ab": three substrings total. The single characters contribute 0. The full string has each character appearing once, so beauty is 0. Answer is 0. - Length 1 like
"a": only one substring, which is a single character. Beauty is 0. Answer is 0.
From understanding to recall
You have read the nested-loop approach and it makes sense. Fix the start, extend the end, update one frequency, scan for max and min, accumulate. Clean and efficient.
But in an interview, the details matter. Forgetting to filter out zero-frequency characters when computing the minimum is a common mistake that produces wrong answers. Accidentally resetting the frequency array inside the inner loop instead of the outer loop is another trap. These are not conceptual gaps. They are recall failures under pressure.
Spaced repetition locks in the mechanics. You practice writing the nested loop with incremental frequency updates from scratch at increasing intervals. After a few rounds, the if f > 0 filter is automatic. You do not think about it. You just write it. That is the difference between understanding a solution and being able to produce it reliably.
Related posts
- Group Anagrams - frequency counting for character analysis
- Find All Anagrams in a String - sliding window with frequency arrays
- Longest Substring Without Repeating Characters - substring enumeration with character tracking
CodeBricks breaks the sum of beauty of all substrings problem into its incremental frequency counting and fixed-alphabet scanning building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a frequency-based substring question shows up in your interview, you do not hesitate. You just write it.