Contains Duplicate III: Bucket Sort for Value Ranges

May 14, 20265 min read

leetcodeproblemhardarrayssliding-window

LeetCode Contains Duplicate III is the hardest problem in the "Contains Duplicate" family. You are given an integer array nums, an integer indexDiff (often called k), and an integer valueDiff (often called t). Return true if there exist two distinct indices i and j such that abs(i - j) is at most indexDiff and abs(nums[i] - nums[j]) is at most valueDiff.

In other words, you need to find two elements that are close to each other both in position and in value.

The brute force approach checks every pair within a window of size k, giving O(n * k) time. A sorted structure (like a balanced BST or SortedList) can bring that down to O(n log k). But the bucket sort approach gets you to O(n) time, and it is one of the cleverest tricks in the LeetCode universe.

Values map to buckets of width t+1. Two values in the same bucket are guaranteed to differ by at most t. Adjacent buckets need an explicit check.

The bucket sort insight

Here is the key idea. If you create buckets of width valueDiff + 1, then two values that land in the same bucket are guaranteed to differ by at most valueDiff. You do not even need to check. The bucket size ensures it automatically.

What about values in adjacent buckets? They might differ by at most valueDiff, or they might not. So you check the adjacent buckets explicitly. But values that are two or more buckets apart? Their difference is guaranteed to be larger than valueDiff, so you can skip them entirely.

To handle the index constraint, you maintain a sliding window of size indexDiff. As you process each element, you remove the element that fell out of the window from its bucket. This keeps the bucket dictionary small and ensures you only compare elements within the allowed index range.

The bucket ID for a value is computed as value // w where w = valueDiff + 1. For negative numbers, you need a slight adjustment to handle the floor division correctly.

Practice the bucket sort pattern

The solution

def containsNearbyAlmostDuplicate(nums: list[int], indexDiff: int, valueDiff: int) -> bool:
    if valueDiff < 0:
        return False
    buckets = {}
    w = valueDiff + 1

    for i, num in enumerate(nums):
        bucket_id = num // w
        if num < 0:
            bucket_id = (num + 1) // w - 1

        if bucket_id in buckets:
            return True
        if bucket_id - 1 in buckets and abs(num - buckets[bucket_id - 1]) <= valueDiff:
            return True
        if bucket_id + 1 in buckets and abs(num - buckets[bucket_id + 1]) <= valueDiff:
            return True

        buckets[bucket_id] = num
        if i >= indexDiff:
            del buckets[nums[i - indexDiff] // w if nums[i - indexDiff] >= 0 else (nums[i - indexDiff] + 1) // w - 1]

    return False

Let's break down what each part does:

Early exit: if valueDiff is negative, no two values can satisfy the condition, so return False immediately.
Bucket width: w = valueDiff + 1. This ensures any two values in the same bucket differ by at most valueDiff.
Bucket ID: for non-negative numbers, num // w. For negative numbers, the formula (num + 1) // w - 1 corrects the floor division behavior.
Three checks: same bucket (guaranteed match), left adjacent bucket (explicit check), right adjacent bucket (explicit check).
Window maintenance: after processing index i, if i >= indexDiff, remove the element at i - indexDiff from the bucket dictionary.

Step-by-step walkthrough

Let's trace through nums = [1, 5, 9, 1, 5, 9] with indexDiff = 2 and valueDiff = 3. The bucket width is w = 4.

Step 1: i=0, nums[0]=1. Bucket ID = 1 // 4 = 0

Bucket 0 is empty. No adjacent buckets to check. Insert 1 into bucket 0.

Step 2: i=1, nums[1]=5. Bucket ID = 5 // 4 = 1

Bucket 1 is empty. Check adjacent bucket 0: |5 - 1| = 4, which is greater than valueDiff=3. No match. Insert 5 into bucket 1.

Step 3: i=2, nums[2]=9. Bucket ID = 9 // 4 = 2. Remove i=0 (outside window).

Window slides: remove nums[0]=1 from bucket 0 (i - indexDiff = 0). Bucket 2 is empty. Check adjacent bucket 1: |9 - 5| = 4 > 3. No match. Insert 9.

Step 4: i=3, nums[3]=1. Bucket ID = 1 // 4 = 0. Remove i=1.

Remove nums[1]=5 from bucket 1. Bucket 0 is empty. No adjacent buckets with values. Insert 1 into bucket 0.

Step 5: i=4, nums[4]=5. Bucket ID = 5 // 4 = 1. Check adjacent bucket 0.

Remove nums[2]=9 from bucket 2. Bucket 1 is empty. Check adjacent bucket 0: |5 - 1| = 4 > 3. No match. Insert 5 into bucket 1. (Continue scanning; no duplicate found within constraints for this example.)

The bucket dictionary never holds more than indexDiff entries at any time. Each insertion, deletion, and lookup is O(1) in a hash map. That is what makes the entire algorithm O(n).

Drill this with spaced repetition

Complexity analysis

Metric	Value	Why
Time	O(n)	Single pass through the array. Each element involves O(1) bucket operations.
Space	O(min(n, k))	The bucket dictionary holds at most `indexDiff` entries at any time.

This is a significant improvement over the O(n log k) sorted container approach and the O(n * k) brute force.

Building blocks

This problem combines three reusable techniques:

1. Bucket sort mapping

The trick of dividing values by a bucket width to group nearby values together. This same idea appears in Maximum Gap and other problems where you need to reason about value ranges rather than exact matches.

bucket_id = num // w

2. Sliding window with a dictionary

Instead of a set or list, you maintain a dictionary keyed by bucket ID. The window slides forward by adding the new element's bucket and removing the oldest element's bucket. This is the same sliding window pattern you see in substring problems, but applied to a hash map instead of a frequency counter.

buckets[bucket_id] = num
if i >= indexDiff:
    del buckets[old_bucket_id]

3. Adjacent bucket checking

After placing a value in its bucket, you check the two neighboring buckets for potential matches. This "check neighbors" pattern shows up in grid problems, interval problems, and anywhere you partition a space into regions.

if bucket_id - 1 in buckets and abs(num - buckets[bucket_id - 1]) <= valueDiff:
    return True

Edge cases

Watch out for these in interviews:

valueDiff = 0: the bucket width becomes 1, and you are effectively looking for exact duplicates within a window. The algorithm still works correctly since same-bucket means same value.
Negative numbers: floor division in Python rounds toward negative infinity, which shifts bucket boundaries. The (num + 1) // w - 1 formula corrects for this so that values near zero end up in the right buckets.
indexDiff = 0: you can never have two distinct indices with abs(i - j) equal to 0 that are actually different indices. The window is empty, so the function returns False. (Actually, indexDiff >= 1 is guaranteed by the constraints, but it is good to think about.)
Single element array: with fewer than 2 elements, you cannot find a pair. The loop simply ends without returning True.

The trickiest part of this problem is handling negative numbers. If you forget the adjustment, values like -1 and 0 might end up in the wrong buckets. Test your solution with negative inputs to catch this.

From understanding to recall

Reading through the bucket sort trick once is not enough. A week from now, if you see a problem that requires grouping values into ranges with a sliding window, will the bucket ID formula come to mind automatically?

That is the gap between understanding and recall. You understand the approach right now, but recalling it under pressure requires repetition. Spaced repetition drills the bucket mapping, the adjacent check, and the window eviction as separate building blocks. After a few review cycles, the pattern fires on its own when you see the right signals: "value range constraint" plus "index distance constraint."

Build recall with CodeBricks