Contains Duplicate III: Bucket Sort for Value Ranges
LeetCode Contains Duplicate III is the hardest problem in the "Contains Duplicate" family. You are given an integer array nums, an integer indexDiff (often called k), and an integer valueDiff (often called t). Return true if there exist two distinct indices i and j such that abs(i - j) is at most indexDiff and abs(nums[i] - nums[j]) is at most valueDiff.
In other words, you need to find two elements that are close to each other both in position and in value.
The brute force approach checks every pair within a window of size k, giving O(n * k) time. A sorted structure (like a balanced BST or SortedList) can bring that down to O(n log k). But the bucket sort approach gets you to O(n) time, and it is one of the cleverest tricks in the LeetCode universe.
The bucket sort insight
Here is the key idea. If you create buckets of width valueDiff + 1, then two values that land in the same bucket are guaranteed to differ by at most valueDiff. You do not even need to check. The bucket size ensures it automatically.
What about values in adjacent buckets? They might differ by at most valueDiff, or they might not. So you check the adjacent buckets explicitly. But values that are two or more buckets apart? Their difference is guaranteed to be larger than valueDiff, so you can skip them entirely.
To handle the index constraint, you maintain a sliding window of size indexDiff. As you process each element, you remove the element that fell out of the window from its bucket. This keeps the bucket dictionary small and ensures you only compare elements within the allowed index range.
The bucket ID for a value is computed as value // w where w = valueDiff + 1. For negative numbers, you need a slight adjustment to handle the floor division correctly.
The solution
def containsNearbyAlmostDuplicate(nums: list[int], indexDiff: int, valueDiff: int) -> bool:
if valueDiff < 0:
return False
buckets = {}
w = valueDiff + 1
for i, num in enumerate(nums):
bucket_id = num // w
if num < 0:
bucket_id = (num + 1) // w - 1
if bucket_id in buckets:
return True
if bucket_id - 1 in buckets and abs(num - buckets[bucket_id - 1]) <= valueDiff:
return True
if bucket_id + 1 in buckets and abs(num - buckets[bucket_id + 1]) <= valueDiff:
return True
buckets[bucket_id] = num
if i >= indexDiff:
del buckets[nums[i - indexDiff] // w if nums[i - indexDiff] >= 0 else (nums[i - indexDiff] + 1) // w - 1]
return False
Let's break down what each part does:
- Early exit: if
valueDiffis negative, no two values can satisfy the condition, so returnFalseimmediately. - Bucket width:
w = valueDiff + 1. This ensures any two values in the same bucket differ by at mostvalueDiff. - Bucket ID: for non-negative numbers,
num // w. For negative numbers, the formula(num + 1) // w - 1corrects the floor division behavior. - Three checks: same bucket (guaranteed match), left adjacent bucket (explicit check), right adjacent bucket (explicit check).
- Window maintenance: after processing index
i, ifi >= indexDiff, remove the element ati - indexDifffrom the bucket dictionary.
Step-by-step walkthrough
Let's trace through nums = [1, 5, 9, 1, 5, 9] with indexDiff = 2 and valueDiff = 3. The bucket width is w = 4.
Step 1: i=0, nums[0]=1. Bucket ID = 1 // 4 = 0
Bucket 0 is empty. No adjacent buckets to check. Insert 1 into bucket 0.
Step 2: i=1, nums[1]=5. Bucket ID = 5 // 4 = 1
Bucket 1 is empty. Check adjacent bucket 0: |5 - 1| = 4, which is greater than valueDiff=3. No match. Insert 5 into bucket 1.
Step 3: i=2, nums[2]=9. Bucket ID = 9 // 4 = 2. Remove i=0 (outside window).
Window slides: remove nums[0]=1 from bucket 0 (i - indexDiff = 0). Bucket 2 is empty. Check adjacent bucket 1: |9 - 5| = 4 > 3. No match. Insert 9.
Step 4: i=3, nums[3]=1. Bucket ID = 1 // 4 = 0. Remove i=1.
Remove nums[1]=5 from bucket 1. Bucket 0 is empty. No adjacent buckets with values. Insert 1 into bucket 0.
Step 5: i=4, nums[4]=5. Bucket ID = 5 // 4 = 1. Check adjacent bucket 0.
Remove nums[2]=9 from bucket 2. Bucket 1 is empty. Check adjacent bucket 0: |5 - 1| = 4 > 3. No match. Insert 5 into bucket 1. (Continue scanning; no duplicate found within constraints for this example.)
The bucket dictionary never holds more than indexDiff entries at any time. Each insertion, deletion, and lookup is O(1) in a hash map. That is what makes the entire algorithm O(n).
Complexity analysis
| Metric | Value | Why |
|---|---|---|
| Time | O(n) | Single pass through the array. Each element involves O(1) bucket operations. |
| Space | O(min(n, k)) | The bucket dictionary holds at most indexDiff entries at any time. |
This is a significant improvement over the O(n log k) sorted container approach and the O(n * k) brute force.
Building blocks
This problem combines three reusable techniques:
1. Bucket sort mapping
The trick of dividing values by a bucket width to group nearby values together. This same idea appears in Maximum Gap and other problems where you need to reason about value ranges rather than exact matches.
bucket_id = num // w
2. Sliding window with a dictionary
Instead of a set or list, you maintain a dictionary keyed by bucket ID. The window slides forward by adding the new element's bucket and removing the oldest element's bucket. This is the same sliding window pattern you see in substring problems, but applied to a hash map instead of a frequency counter.
buckets[bucket_id] = num
if i >= indexDiff:
del buckets[old_bucket_id]
3. Adjacent bucket checking
After placing a value in its bucket, you check the two neighboring buckets for potential matches. This "check neighbors" pattern shows up in grid problems, interval problems, and anywhere you partition a space into regions.
if bucket_id - 1 in buckets and abs(num - buckets[bucket_id - 1]) <= valueDiff:
return True
Edge cases
Watch out for these in interviews:
valueDiff = 0: the bucket width becomes 1, and you are effectively looking for exact duplicates within a window. The algorithm still works correctly since same-bucket means same value.- Negative numbers: floor division in Python rounds toward negative infinity, which shifts bucket boundaries. The
(num + 1) // w - 1formula corrects for this so that values near zero end up in the right buckets. indexDiff = 0: you can never have two distinct indices withabs(i - j)equal to 0 that are actually different indices. The window is empty, so the function returnsFalse. (Actually,indexDiff >= 1is guaranteed by the constraints, but it is good to think about.)- Single element array: with fewer than 2 elements, you cannot find a pair. The loop simply ends without returning
True.
The trickiest part of this problem is handling negative numbers. If you forget the adjustment, values like -1 and 0 might end up in the wrong buckets. Test your solution with negative inputs to catch this.
From understanding to recall
Reading through the bucket sort trick once is not enough. A week from now, if you see a problem that requires grouping values into ranges with a sliding window, will the bucket ID formula come to mind automatically?
That is the gap between understanding and recall. You understand the approach right now, but recalling it under pressure requires repetition. Spaced repetition drills the bucket mapping, the adjacent check, and the window eviction as separate building blocks. After a few review cycles, the pattern fires on its own when you see the right signals: "value range constraint" plus "index distance constraint."
Related problems
- Contains Duplicate covers the basic set-based duplicate detection that forms the foundation for this problem.
- Contains Duplicate II adds the index distance constraint with a simpler sliding window set approach.
- The Sliding Window Pattern explains the general technique of maintaining a window as you iterate through an array.