Skip to content
← All posts

First Bad Version: Binary Search at Its Simplest

5 min read
leetcodeproblemeasybinary-search

First Bad Version is LeetCode 278. You are a product manager and you have n versions of a product. One version introduced a bug, and every version after it is also bad. You are given an API isBadVersion(version) that returns whether a version is bad. Find the first bad version while minimizing the number of API calls.

This is binary search distilled to its purest form. There is no array to index into, no target value to match. You just have a boolean predicate that flips from false to true at some unknown boundary, and you need to find that boundary.

1good2good3good4bad5bad6bad7bad8badfirst bad
Versions 1-8. Versions 1-3 are good (green), versions 4-8 are bad (orange). We need to find version 4, the first bad one.

Why this problem matters

First Bad Version is the cleanest example of binary search on a boolean predicate. You are not searching for a specific value in a sorted array. You are searching for the point where a condition changes from false to true. Once you internalize this pattern, you can apply it to dozens of harder problems: finding minimum capacity for shipping packages, minimum speed for eating bananas, or the smallest valid answer in any "search on answer" problem.

The template you build here is the same template that solves Koko Eating Bananas, Split Array Largest Sum, and many other medium and hard problems. Get this one down cold and those become much easier.

The approach: bisect the search space

Think of it this way: all good versions are on the left, all bad versions are on the right. There is one boundary point where the status flips from good to bad. Binary search finds that boundary by repeatedly checking the middle of the remaining range.

If the middle version is bad, the first bad version is at mid or somewhere to the left, so you shrink the right boundary to mid. If the middle version is good, the first bad version must be to the right, so you move the left boundary to mid + 1.

def first_bad_version(n):
    left, right = 1, n
    while left < right:
        mid = left + (right - left) // 2
        if is_bad_version(mid):
            right = mid
        else:
            left = mid + 1
    return left

Step-by-step walkthrough

Let's trace through an example with 8 versions where version 4 is the first bad one.

Step 1: left=1, right=8, mid=4. isBadVersion(4) = true. The first bad version is at mid or to the left.

1122334455667788leftmidright

isBadVersion(4) returns true. Set right = mid = 4. The answer could be version 4 itself, so we keep it in the search space.

Step 2: left=1, right=4, mid=2. isBadVersion(2) = false. The first bad version must be to the right.

1122334455667788leftmidright

isBadVersion(2) returns false. Set left = mid + 1 = 3. Versions 1-2 are confirmed good, eliminate them.

Step 3: left=3, right=4, mid=3. isBadVersion(3) = false. Move left past this good version.

1122334455667788leftmidright

isBadVersion(3) returns false. Set left = mid + 1 = 4. Version 3 is good, so the first bad version must be further right.

Step 4: left=4, right=4. Converged. The first bad version is 4.

1122334455667788leftmidright

left == right, search complete. Return version 4.

Notice how each step eliminates roughly half the remaining versions. After just 3 comparisons (plus the final convergence check), we found the answer among 8 versions. For a million versions, you would need at most 20 API calls.

Why left < right instead of left <= right?

This is one of the subtlest points in binary search, and it trips people up constantly.

In the classic binary search for a target value, you use left <= right because you need to check the last remaining element before concluding the target does not exist. The loop exits when left > right, meaning an empty search space.

Here, you use left < right because you are not looking for a miss case. You know the first bad version exists. When left == right, both pointers point to the same version, which must be the answer. There is no need to check it further.

The two templates also differ in how they update pointers. With left < right, when isBadVersion(mid) is true, you set right = mid (not right = mid - 1) because mid itself might be the answer. This keeps the answer inside the [left, right] range at all times. The loop terminates when the range shrinks to a single element.

If you used left <= right with right = mid, you would get an infinite loop when left == right == mid.

Complexity analysis

MetricValue
TimeO(log n)
SpaceO(1)
API callsO(log n)

Each iteration halves the search space, so the number of iterations is at most log2(n). No extra memory is needed beyond the three pointer variables.

The building blocks

This problem is built from one core building block: binary search on a boolean predicate.

The predicate here is isBadVersion(version). It returns false for all versions before the boundary and true for all versions at or after the boundary. Binary search finds the boundary in O(log n) steps.

This same structure appears whenever you have a monotonic condition. If you can phrase a problem as "find the smallest x such that condition(x) is true," you can apply this exact template. The condition might be "can we ship all packages within days days at capacity x?" or "can Koko eat all bananas in h hours at speed x?" The binary search mechanics are identical. Only the predicate changes.

Edge cases to watch for

  • n = 1 (the only version is bad)
  • First version is bad
  • Last version is the first bad one
  • Integer overflow in mid calculation (use left + (right - left) // 2)

The overflow case deserves emphasis. Writing (left + right) // 2 can overflow in languages like Java or C++ when left and right are both large. The formula left + (right - left) // 2 avoids this by never computing a sum larger than right. In Python this is not an issue due to arbitrary precision integers, but building the habit protects you in interviews where you might be coding in Java.

From understanding to recall

You understand the logic. You can trace through the example. But when the interviewer says "LeetCode 278, go," can you write the solution in under a minute without referencing anything? That is what separates understanding from recall.

The differences between this template and the classic binary search template are small but critical: left < right vs left <= right, right = mid vs right = mid - 1. Under pressure, these small details become sources of doubt. Spaced repetition eliminates that doubt by making the correct template automatic.

Related posts