Find All Duplicates in an Array: Index as Hash Map

May 19, 20265 min read

leetcodeproblemmediumarrays

LeetCode Find All Duplicates in an Array (problem 442) gives you an integer array of length n where 1 <= a[i] <= n. Each element appears either once or twice. Your job is to find all the elements that appear twice, and you need to do it in O(n) time without using extra space beyond the output list.

The constraint 1 <= a[i] <= n is not just a detail. It is the entire solution.

Each value maps to an index (value - 1). When you visit a value, negate the element at that index. If it is already negative, the value is a duplicate. Values 2 and 3 each appear twice.

Why this problem matters

Most duplicate-detection problems let you throw everything into a hash set and call it a day. This one explicitly asks for O(1) extra space. That rules out sets, maps, and sorting (which would need O(log n) stack space at minimum). You need to encode "I have seen this value before" directly inside the input array.

The critical observation: since every value falls between 1 and n, each value can serve as a valid index into the array (after subtracting 1). That means you can use the array itself as a hash map where the sign of each element acts as a visited flag.

The key insight

For each value v you encounter, look at position v - 1 in the array. If nums[v - 1] is positive, flip it to negative. This marks that you have seen value v. If nums[v - 1] is already negative, then you have seen v before, which means v is a duplicate.

This works because:

Values range from 1 to n, so v - 1 is always a valid index (0 to n - 1)
You never lose information, because the absolute value of any element still tells you what was originally there
Sign acts as a one-bit visited marker at each index

Practice this pattern with spaced repetition

The solution

def find_duplicates(nums):
    result = []

    for i in range(len(nums)):
        val = abs(nums[i])
        idx = val - 1

        if nums[idx] < 0:
            result.append(val)
        else:
            nums[idx] = -nums[idx]

    return result

Let's walk through each line:

val = abs(nums[i]): Take the absolute value because previous iterations may have negated this cell. The original value is what matters.
idx = val - 1: Map the value to an index. Value 4 maps to index 3, value 1 maps to index 0, and so on.
if nums[idx] < 0: If the element at the target index is already negative, we have visited this value before. It is a duplicate, so add it to the result.
else: nums[idx] = -nums[idx]: Otherwise, negate the element at the target index to mark this value as seen.

This technique only works when values are bounded by the array length. The constraint 1 <= a[i] <= n guarantees every value maps to a valid index. Without that guarantee, you could index out of bounds or collide with unrelated positions.

Drill the index negation pattern

Visual walkthrough

Let's trace through nums = [4, 3, 2, 7, 8, 2, 3, 1] step by step. Watch how each visit negates an element at the corresponding index, and how a second visit to the same index reveals a duplicate.

Start: [4, 3, 2, 7, 8, 2, 3, 1]

Visit nums[0] = 4. Go to index 4-1 = 3. nums[3] = 7 (positive), so negate it.

After marking index 3: nums[3] negated

Visit nums[1] = 3. Go to index 3-1 = 2. nums[2] = 2 (positive), so negate it.

After marking index 2: nums[2] negated

Visit nums[2] = |-2| = 2. Go to index 2-1 = 1. nums[1] = 3 (positive), so negate it.

After marking index 1: nums[1] negated

Visit nums[3] = |-7| = 7. Go to index 7-1 = 6. nums[6] = 3 (positive), so negate it.

After marking index 6: nums[6] negated

Visit nums[4] = 8. Go to index 8-1 = 7. nums[7] = 1 (positive), so negate it.

After marking index 7: nums[7] negated

Visit nums[5] = 2. Go to index 2-1 = 1. nums[1] = -3 (already negative). Duplicate found: 2!

Duplicate found! nums[1] was already negative

Visit nums[6] = |-3| = 3. Go to index 3-1 = 2. nums[2] = -2 (already negative). Duplicate found: 3!

Duplicate found! nums[2] was already negative

Visit nums[7] = |-1| = 1. Go to index 1-1 = 0. nums[0] = 4 (positive), negate it. Done. Result: [2, 3].

After processing all eight elements, we found two duplicates: 2 and 3. Each was detected the moment we tried to negate an index that was already negative.

Complexity analysis

Approach	Time	Space
Index negation	O(n)	O(1)

Time: One pass through the array. Each element is visited exactly once. Looking up and negating at the target index is O(1). Total: O(n).

Space: The result list holds at most n/2 elements, but the problem statement says the output does not count as extra space. The only additional storage is a few variables (val, idx, i). That is O(1) auxiliary space.

Edge cases

No duplicates: nums = [1, 2, 3, 4]. Every negation hits a positive cell. The result list stays empty.

All duplicates: nums = [2, 2, 1, 1]. Both 2 and 1 are detected on their second occurrence. Result: [2, 1].

Single element: nums = [1]. Only one element, so no duplicate is possible. Result: [].

Maximum duplicates: nums = [1, 1, 2, 2, 3, 3, ..., n/2, n/2]. Every value appears exactly twice. The algorithm detects each one on the second pass and returns n/2 results.

Already negative after earlier marking: When you reach index 2 and nums[2] has been negated by an earlier step, you take abs(nums[2]) to recover the original value. The absolute value call is what makes the entire scheme work, because it separates "what the original value was" from "whether this index has been visited."

Do not forget the abs() call when reading values. Without it, a negated cell would give you a negative index, causing incorrect behavior or an out-of-bounds error.

The building blocks

Index as hash map

When array values are bounded by the array size, you can use the values themselves as keys into the array. Instead of allocating a separate hash map, you encode presence or absence directly in the array, typically using sign, placement, or value modification. This pattern shows up in:

First Missing Positive (LeetCode 41) uses cyclic sort to place each value at its "home" index, then scans for the first gap.
Find the Duplicate Number (LeetCode 287) treats array values as linked list pointers and uses Floyd's cycle detection to find the duplicate.
Missing Number (LeetCode 268) uses XOR or index summation to identify which value is absent.

In-place marking

The sign-negation trick is a specific form of in-place marking. You modify the array to record state (visited or not visited) without allocating extra memory. The same idea appears whenever you need to track membership across a bounded set of values and the problem forbids extra space.

From understanding to recall

The algorithm is short, but recalling it under pressure requires practice. The core loop is four lines: take the absolute value, compute the index, check the sign, and either mark or record. Write it from scratch. Then do it again in two days. Then again next week.

The pattern to internalize is: "value maps to index, sign encodes visited." Once that principle is second nature, you can reconstruct the solution for this problem and adapt it to related problems without memorizing code line by line.

Find the Duplicate Number - Same family of problems where values map to indices, but this one uses Floyd's cycle detection instead of sign negation.
First Missing Positive - Uses cyclic sort (another index-as-hash technique) to solve the harder variant where you find what is missing rather than what is duplicated.
Missing Number - A simpler version where exactly one value is absent, solvable with XOR or Gauss summation.

Try CodeBricks free