Maximum Number of Occurrences of a Substring: Sliding Window with Frequency Map
Given a string s, return the maximum number of occurrences of any substring that satisfies both of these conditions: the number of unique characters in the substring is at most maxLetters, and the substring length is between minSize and maxSize inclusive. This is LeetCode 1297: Maximum Number of Occurrences of a Substring.
For example, given s = "aababcaab", maxLetters = 2, minSize = 3, maxSize = 4, the answer is 2. The substring "aab" has at most 2 unique characters and appears twice.
Why this problem matters
This problem combines two fundamental patterns: fixed-size sliding window and frequency counting with a hash map. On the surface, it looks like you need to check all valid lengths from minSize to maxSize, which could be expensive. But a single observation collapses the search space down to just one length, making the solution clean and efficient.
Problems that disguise a simpler solution behind an intimidating parameter space show up often in interviews. Recognizing when a constraint is redundant is the kind of insight that separates a brute-force attempt from an elegant O(n) pass.
The key insight
You only need to check substrings of length minSize. You can completely ignore maxSize.
Why? Think about it this way. If a substring of length k (where k is greater than minSize) appears t times in the string, then the first minSize characters of that substring also appear at least t times, because they show up inside every occurrence of the longer substring. The shorter substring might also appear in other positions on its own, so its count is greater than or equal to the count of the longer one. And since the shorter substring uses a subset of the characters, it has at most as many unique characters as the longer one. So if the longer one satisfies maxLetters, the shorter one does too.
This means the answer at length minSize is always at least as large as the answer at any longer length. Checking longer substrings can never improve the result. The maxSize parameter is a red herring.
The solution
from collections import Counter
def max_freq(s: str, max_letters: int, min_size: int, max_size: int) -> int:
count = Counter()
for i in range(len(s) - min_size + 1):
sub = s[i:i + min_size]
if len(set(sub)) <= max_letters:
count[sub] += 1
return max(count.values()) if count else 0
The solution slides a fixed-size window of length minSize across the string. At each position, it extracts the substring, checks whether the number of unique characters is within maxLetters, and if so, increments that substring's count in the frequency map. At the end, the answer is the maximum value in the map.
There is no need to expand or shrink the window dynamically. The window size is fixed at minSize, so you simply slide it one position at a time from left to right.
The maxSize parameter is irrelevant to the optimal solution. Any valid longer substring contains a valid shorter one with equal or greater frequency. This is one of those problems where recognizing what you can ignore is more important than the code itself.
Visual walkthrough
Step 1: Key insight. Only check substrings of length minSize.
Any valid substring of length greater than minSize contains a shorter valid substring of length minSize. That shorter substring appears at least as often (it shows up inside every occurrence of the longer one, plus possibly on its own). So the maximum frequency can always be achieved at length minSize. We can ignore maxSize entirely.
Step 2: Window at index 0. Substring = "aab".
freq: "aab":1
2 unique chars, valid. Record "aab" with count 1.
Step 3: Window at index 1. Substring = "aba".
freq: "aab":1, "aba":1
2 unique chars, valid. Record "aba" with count 1.
Step 4: Window at index 2. Substring = "bab".
freq: "aab":1, "aba":1, "bab":1
2 unique chars, valid. Record "bab" with count 1.
Step 5: Window at index 3. Substring = "abc".
freq: "aab":1, "aba":1, "bab":1
3 unique chars, exceeds maxLetters. Skip this substring.
Step 6: Window at index 4. Substring = "bca".
freq: "aab":1, "aba":1, "bab":1
3 unique chars, exceeds maxLetters. Skip again.
Step 7: Window at index 5. Substring = "caa".
freq: "aab":1, "aba":1, "bab":1, "caa":1
2 unique chars, valid. Record "caa" with count 1.
Step 8: Window at index 6. Substring = "aab". Found it again!
freq: "aab":2, "aba":1, "bab":1, "caa":1
2 unique chars, valid. "aab" now has count 2. This is the maximum. Answer: 2.
Notice how the window slides one position at a time, always of size minSize = 3. When the substring has too many unique characters (like "abc" with 3 unique), we skip it. When it is valid, we add it to the frequency map. The substring "aab" shows up at index 0 and again at index 6, giving us the maximum count of 2.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Sliding window + hash map | O(n * minSize) | O(n * minSize) |
Time is O(n * minSize). We slide the window across n positions. At each position, extracting the substring and computing the unique character set both take O(minSize) work. Since minSize is at most 26 in practice (bounded by the alphabet), this is effectively O(n).
Space is O(n * minSize). The frequency map can hold up to n distinct substrings, each of length minSize. Again, with a small minSize, this behaves like O(n) in practice.
The building blocks
1. Fixed-size sliding window
The pattern of sliding a window of constant size across a string, processing each position:
for i in range(len(s) - window_size + 1):
window = s[i:i + window_size]
process(window)
This is simpler than the variable-size sliding window used in problems like Longest Substring Without Repeating Characters. The window never grows or shrinks. You just slide it one step at a time. The same pattern appears in Find All Anagrams in a String, Minimum Size Subarray Sum with a fixed target, and any problem where the window length is given.
2. Substring frequency counting
The pattern of counting how many times each substring appears using a hash map:
count = Counter()
for each valid_substring:
count[valid_substring] += 1
best = max(count.values())
This is a direct application of the frequency map building block. You accumulate counts and then query for the maximum. The same idea shows up in Group Anagrams (grouping by sorted key), Repeated DNA Sequences (counting 10-letter windows), and any problem that asks "which pattern appears most often."
Edge cases
- All characters the same:
s = "aaaa",minSize = 2,maxLetters = 1. Every window is"aa", so the answer islen(s) - minSize + 1. maxLetters = 1: only substrings made of a single repeated character are valid. Check that your unique character counting handles this correctly.- No valid substrings: if every window of size
minSizehas more thanmaxLettersunique characters, the frequency map is empty. Return0. minSizeequals the string length: only one window exists. The answer is either0or1.minSizeequalsmaxSize: this does not change anything since we already only checkminSize.
From understanding to recall
You have seen the key insight: ignore maxSize and only check minSize. The code is short, just a loop with a substring extraction, a uniqueness check, and a frequency counter. But in an interview, the hard part is not the code. It is convincing yourself (and your interviewer) that checking only minSize is sufficient.
Spaced repetition helps you internalize that reasoning. You practice explaining why shorter substrings always have equal or greater frequency, why they have fewer or equal unique characters, and why that makes maxSize redundant. After a few rounds, the argument flows naturally. You see a problem with a suspicious extra parameter, and you immediately ask: "Can I reduce this to a simpler case?"
Related posts
- Longest Substring Without Repeating Characters - Classic sliding window with character tracking
- Find All Anagrams in a String - Fixed-size sliding window with frequency comparison
- Minimum Window Substring - Variable-size sliding window, the harder version of this pattern
CodeBricks breaks this problem into its fixed-size sliding window and frequency counting building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a sliding window problem shows up in your interview, you recognize the shape immediately and write the solution without hesitation.