Skip to content
← All posts

Kth Missing Positive Number: Binary Search on Missing Counts

5 min read
leetcodeproblemeasyarraysbinary-search

Kth Missing Positive Number is LeetCode 1539. You are given a sorted array of positive integers arr in strictly increasing order and an integer k. Return the kth positive integer that is missing from the array.

Example: arr = [2, 3, 4, 7, 11], k = 5. The missing positive integers are 1, 5, 6, 8, 9, 10, 12, ... The 5th missing one is 9.

Positive integers 1 through 12123456789101112miss #1in arrin arrin arrmiss #2miss #3in arrmiss #4miss #5miss #6in arrarr = [2, 3, 4, 7, 11], k = 5The 5th missing positive integer is 9.
Green cells are values present in the array. Red cells are missing positive integers. The blue cell is the kth missing value (k=5).

Why this problem matters

This is a great problem for learning binary search on a derived property rather than on the array values themselves. Instead of searching for a specific value in the array, you compute a "missing count" at each index and binary search on that. This technique shows up in many problems where you need to find a boundary defined by a computed quantity rather than a stored value. If you have worked through Koko Eating Bananas, you already know the "binary search on something other than an array element" pattern. Kth Missing Positive Number is a cleaner, simpler version of the same idea.

The key insight

For any index i, arr[i] - (i + 1) tells you how many positive integers are missing before arr[i]. Think about it: if no values were missing, the value at index i would be i + 1 (since the array starts from positive integers). The difference between the actual value and the expected value is exactly the number of values that got skipped.

Since the array is sorted and strictly increasing, this missing count is non-decreasing as you move right. That monotonic property is exactly what makes binary search possible. You can binary search for the first index where the missing count reaches or exceeds k.

The solution

def findKthPositive(arr, k):
    lo, hi = 0, len(arr) - 1

    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] - (mid + 1) < k:
            lo = mid + 1
        else:
            hi = mid - 1

    return lo + k

When the loop ends, lo is the number of array elements that come before the kth missing value. Each of those elements "pushes" the answer up by one position from where it would be if the array were empty. So the answer is simply k + lo. If no array elements come before the kth missing value (e.g., if k = 1 and arr[0] = 5), then lo = 0 and the answer is k itself.

The "binary search on a derived monotonic property" pattern is powerful. Whenever you can compute a non-decreasing quantity from array indices and need to find where that quantity crosses a threshold, binary search applies, even if you are not searching for a value that exists in the array.

Visual walkthrough

Setup: arr = [2, 3, 4, 7, 11], k = 5

For each index i, the missing count is arr[i] - (i + 1). Index 0: 2 - 1 = 1. Index 1: 3 - 2 = 1. Index 2: 4 - 3 = 1. Index 3: 7 - 4 = 3. Index 4: 11 - 5 = 6. We binary search for the first index where the missing count is at least k = 5.

Step 1: lo=0, hi=4, mid=2. arr[2]=4, missing = 4 - 3 = 1. Since 1 < 5, the kth missing is after index 2.

i=0arr[0]=2miss=1i=1arr[1]=3miss=1i=2arr[2]=4miss=1i=3arr[3]=7miss=3i=4arr[4]=11miss=6lomidhi

Missing count at mid is 1, which is less than k=5. Move lo = mid + 1 = 3.

Step 2: lo=3, hi=4, mid=3. arr[3]=7, missing = 7 - 4 = 3. Since 3 < 5, the kth missing is after index 3.

i=0arr[0]=2miss=1i=1arr[1]=3miss=1i=2arr[2]=4miss=1i=3arr[3]=7miss=3i=4arr[4]=11miss=6lomidhi

Missing count at mid is 3, still less than k=5. Move lo = mid + 1 = 4.

Step 3: lo=4, hi=4, mid=4. arr[4]=11, missing = 11 - 5 = 6. Since 6 >= 5, the kth missing is before or at index 4.

i=0arr[0]=2miss=1i=1arr[1]=3miss=1i=2arr[2]=4miss=1i=3arr[3]=7miss=3i=4arr[4]=11miss=6lomidhi

lo == hi, search complete. lo = 4. The answer is k + lo = 5 + 4 = 9.

Result

The binary search converges at lo = 4. The formula k + lo = 5 + 4 = 9 gives us the 5th missing positive integer. Why does this work? At the end, lo equals the number of array elements that come before the answer. Those elements each "shift" the answer up by one, so the kth missing value is k + lo.

Three steps to find the answer on a 5-element array. Binary search makes this O(log n) regardless of the size of the array or the value of k.

Complexity analysis

ApproachTimeSpace
Binary searchO(log n)O(1)

Time: O(log n). The binary search halves the range each iteration. For an array of length n, this takes at most log2(n) steps. The missing count at each index is computed in O(1) with a single subtraction.

Space: O(1). Only a few variables (lo, hi, mid) are used. No auxiliary data structures.

You could also solve this in O(n) by iterating through the array and counting missing values as you go, but the binary search approach is strictly better for large inputs.

The building blocks

1. Missing count formula: arr[i] - (i + 1)

missing_before = arr[mid] - (mid + 1)

This single expression is the heart of the solution. In a "perfect" sequence with no gaps, position i would hold value i + 1. The difference between the actual value and the expected value counts exactly how many values were skipped. You do not need to enumerate the missing values or store them anywhere.

2. Binary search on a derived property

while lo <= hi:
    mid = lo + (hi - lo) // 2
    if arr[mid] - (mid + 1) < k:
        lo = mid + 1
    else:
        hi = mid - 1

This is the standard binary search template, but instead of comparing arr[mid] to a target, you compare a computed quantity (the missing count) to k. When the missing count at mid is less than k, the kth missing value must be to the right, so you move lo up. Otherwise, you move hi down. The result is that lo ends up at the first index where the missing count is at least k.

Edge cases

  • All missing values come before the array starts: arr = [5, 6, 7], k = 3. Every value from 1 to 4 is missing. The 3rd missing value is 3. The binary search never moves lo past 0, so the answer is 0 + 3 = 3.
  • All missing values come after the array ends: arr = [1, 2, 3, 4], k = 2. No values are missing within the array. The binary search pushes lo to len(arr) = 4, giving 4 + 2 = 6.
  • k = 1: the smallest missing positive integer. If arr[0] > 1, the answer is 1. Otherwise the binary search finds the first gap.
  • Single element array: arr = [2], k = 1. Missing count at index 0 is 2 - 1 = 1, which equals k. So hi moves to -1, lo stays at 0, and the answer is 0 + 1 = 1.
  • Large gaps: arr = [1, 1000000], k = 999998. The missing count at index 1 is 1000000 - 2 = 999998. The binary search handles this in two steps regardless of the gap size.

From understanding to recall

The logic here is compact: compute missing counts with a subtraction, binary search for the threshold, return lo + k. You probably see why it works right now. But under interview pressure, the details matter. Is the comparison < k or <= k? Do you return lo + k or lo + k - 1? Do you initialize hi to len(arr) - 1 or len(arr)?

Spaced repetition locks in these details. You write the solution from scratch today, again tomorrow, and again in a few days. After a few rounds, the formula and the boundary conditions are automatic. You spend your interview time thinking about the problem, not second-guessing the template.

Related posts

CodeBricks breaks problems like Kth Missing Positive Number into atomic building blocks and schedules them for spaced repetition. Instead of re-solving the whole problem each time, you drill the missing count formula and the binary search template separately until both are automatic. Then you combine them effortlessly.