Subarrays with K Different Integers: The At-Most Trick
LeetCode's Subarrays with K Different Integers (problem 992) asks you to count the number of contiguous subarrays that contain exactly k different integers. It is rated hard, but the solution becomes clean once you see the trick that converts "exactly k" into a difference of two "at most" problems.
The problem
You are given an integer array nums and an integer k. Return the number of subarrays where the count of distinct integers is exactly k.
For example, with nums = [1, 2, 1, 2, 3] and k = 2, the answer is 7. The valid subarrays (by index range) are: [0,1], [0,2], [0,3], [1,2], [1,3], [2,3], and [3,4]. Each of these contains exactly 2 distinct integers.
The brute force approach
The most direct approach is to enumerate every subarray and count its distinct elements. For each starting index i, expand the subarray to the right, maintaining a set of elements seen so far. Whenever the set has exactly k elements, increment the count.
This runs in O(n^2) time because there are O(n^2) subarrays and each one requires at most O(n) work to check, though with an incremental set you can bring each check down to O(1). The overall time is still O(n^2), which is too slow when n can be up to 20,000.
The key insight
Counting subarrays with exactly k distinct integers directly with a sliding window is tricky. When you have too many distinct values, you know to shrink the window. But when you have exactly k, you cannot tell whether to expand or shrink, because both could yield more valid subarrays.
The trick is to reframe the problem:
exactly(k) = atMost(k) - atMost(k - 1)
If you can count the number of subarrays with at most k distinct integers, then subtracting the count for at most k - 1 gives you exactly those subarrays with precisely k distinct values.
The "at most" version is easy to solve with a standard sliding window. The window expands to the right, and whenever the number of distinct values exceeds the limit, you shrink from the left. At each position of the right pointer, every subarray from left to right is valid, so you add (right - left + 1) to the count.
Step-by-step walkthrough
Let's trace through nums = [1, 2, 1, 2, 3] with k = 2. We run the atMost helper twice: once with k = 2 and once with k = 1.
Step 1: atMost(k=2) - expand right to index 3
Window [0..3] = [1,2,1,2] has 2 distinct values. Every subarray ending at each right position with at most 2 distinct values is counted: subarrays from left..right contribute (right - left + 1) each step.
Step 2: atMost(k=2) - expand right to index 4
Adding nums[4]=3 gives 3 distinct. Shrink left until distinct is at most 2. Remove 1 then 2. Window becomes [2..4] = [1,2,3]. Still 3 distinct, shrink again to [3..4] = [2,3]. Now 2 distinct, add (4-3+1)=2. Total atMost(2) = 12.
Step 3: atMost(k-1=1) - full pass
Run the same sliding window with k=1. Only single-value windows survive: [1], [2], [1], [2], [3]. atMost(1) = 5.
Step 4: Final answer = atMost(2) - atMost(1)
exactly(2) = atMost(2) - atMost(1) = 12 - 5 = 7. The 7 subarrays with exactly 2 distinct integers.
The code
def subarraysWithKDistinct(nums, k):
def atMost(k):
count = {}
left = 0
result = 0
for right in range(len(nums)):
count[nums[right]] = count.get(nums[right], 0) + 1
while len(count) > k:
count[nums[left]] -= 1
if count[nums[left]] == 0:
del count[nums[left]]
left += 1
result += right - left + 1
return result
return atMost(k) - atMost(k - 1)
The atMost helper maintains a hash map count that tracks the frequency of each integer in the current window. When the number of distinct keys exceeds k, the left pointer advances, decrementing frequencies and removing keys that drop to zero. At each step, right - left + 1 counts every valid subarray ending at right.
The outer function calls atMost twice and returns the difference, which gives the count of subarrays with exactly k distinct values.
The atMost helper is reusable. Any time you need to count subarrays with a constraint on the number of distinct elements, you can use this same function. Problems like "Fruit Into Baskets" (at most 2 distinct) are just special cases of this pattern.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Brute force (check all subarrays) | O(n^2) | O(n) |
| At-most sliding window (two passes) | O(n) | O(k) |
Each call to atMost does a single pass through the array. The left pointer only moves forward, so each element is added and removed from the hash map at most once per call. Two calls to atMost means two passes, which is still O(n). The hash map holds at most k + 1 keys before shrinking, so space is O(k).
The building blocks
The at-most transformation
Many counting problems become easier when you replace "exactly x" with "at most x" minus "at most x - 1". This works because the set of subarrays with at most k distinct values is a superset of those with at most k - 1. The difference is precisely the subarrays with exactly k.
This same idea applies beyond distinct-element counting. Whenever a sliding window constraint is monotonic (adding elements can only increase the count of something), the at-most version is solvable with a two-pointer approach, and the "exactly" version follows by subtraction.
Frequency map with cleanup
The hash map in this solution tracks element frequencies, and you delete keys when their count hits zero. This cleanup step is essential because len(count) tells you the number of distinct elements in the window. If you left zero-count keys in the map, the distinct count would be wrong.
This pattern appears in many sliding window problems: Minimum Window Substring, Longest Substring with At Most K Distinct Characters, and Fruit Into Baskets all use a frequency map with the same add/remove/cleanup logic.
Edge cases
- k equals 1: every subarray of identical consecutive elements counts. The
atMost(0)call returns 0 (no valid subarrays with 0 distinct values), so the answer is justatMost(1). - k equals the number of distinct values in the entire array: the answer includes every subarray that uses all distinct values.
atMost(k)counts all subarrays (since every subarray has at mostkdistinct), andatMost(k-1)removes those missing at least one value. - All elements are the same:
[5, 5, 5]withk = 1returnsn * (n + 1) / 2because every subarray has exactly 1 distinct element. Withk = 2, the answer is 0. - Array of length 1: a single element
[x]withk = 1returns 1. With anyk > 1, it returns 0.
From understanding to recall
The core of this problem is one idea: transform "exactly k" into "atMost(k) - atMost(k-1)". Once you internalize that, the implementation is a standard sliding window with a frequency map. The tricky part is not the code itself but remembering to reach for the at-most trick when you see an "exactly" constraint on distinct elements.
The building blocks here, the at-most transformation and the frequency-map sliding window, show up repeatedly in array problems. Being able to write the atMost helper from memory means you can solve this problem and several related ones in a few minutes.
Related posts
- Fruit Into Baskets - A sliding window problem counting at most 2 distinct elements, which is the k=2 case of this pattern
- Longest Substring Without Repeating Characters - Classic sliding window with a distinct character constraint
- Minimum Window Substring - Another sliding window that tracks character frequencies with a hash map