Median of Two Sorted Arrays: Binary Search on Partitions
Median of Two Sorted Arrays is LeetCode problem 4, and it has a reputation. It is the first "Hard" many people encounter, and the O(log(min(m,n))) requirement scares off a lot of candidates who can otherwise handle binary search just fine.
The good news: once you understand the binary search partition technique, this problem clicks. You are not doing anything exotic. You are doing a binary search, but instead of searching for a value in an array, you are searching for the right place to cut two arrays so that their left halves combine into exactly the bottom half of the merged result.
Let's break that down.
The problem
Given two sorted arrays nums1 and nums2 of sizes m and n, return the median of the two arrays combined. The overall runtime complexity must be O(log(m+n)).
Example: nums1 = [1, 3], nums2 = [2]. If you merged them, you would get [1, 2, 3]. The median is 2.
Another example: nums1 = [1, 2], nums2 = [3, 4]. Merged: [1, 2, 3, 4]. The median is (2 + 3) / 2 = 2.5.
The brute force approach is obvious: merge the arrays and grab the middle element. That works in O(m+n) time and O(m+n) space. But the problem demands O(log(m+n)), which means we need binary search.
The key insight: partitions, not merging
Forget about actually merging the arrays. Instead, think about what the median means. The median splits the combined elements into two equal halves. Every element in the left half is less than or equal to every element in the right half.
If we have a total of m + n elements, the left half should contain (m + n + 1) // 2 elements. We do not need to know what those elements are in order. We just need to figure out how many come from each array.
Here is the idea: if we take i elements from nums1 and j elements from nums2, and i + j equals the size of the left half, then we have a valid partition. The question is: which value of i gives us a correct split?
A partition is "correct" when the largest element on the left side is less than or equal to the smallest element on the right side. In other words:
nums1[i-1] <= nums2[j](the last element we took from A is not bigger than the first element we skipped from B)nums2[j-1] <= nums1[i](the last element we took from B is not bigger than the first element we skipped from A)
If both conditions hold, we found the right cut. The median is the maximum of the left side (for odd totals) or the average of the max-left and min-right (for even totals).
The binary search partition concept is the core building block here. Once you internalize "binary search on the cut position," you will recognize it in other problems like finding the kth element of two sorted arrays and splitting arrays with constraints.
Why binary search works here
We are searching for the right value of i (how many elements to take from nums1). Since j is determined by i (because j = half - i), we only have one variable to search over.
If nums1[i-1] > nums2[j], we took too many from nums1. We need to move the cut left, so we decrease i. If nums2[j-1] > nums1[i], we took too few from nums1. We need to move the cut right, so we increase i.
This is a monotonic condition. There is exactly one "crossing point" where the partition becomes valid. Binary search finds it in O(log(min(m,n))) time by always searching on the shorter array.
Visual walkthrough
Let's work through A = [1, 3, 8, 9, 15] and B = [7, 11, 18, 19, 21, 25]. The combined length is 11, so the left half needs (11 + 1) // 2 = 6 elements. We binary search on A (the shorter array, length 5).
Step 1: partA=2, partB=4. A takes [1,3], B takes [7,11,18,19].
maxLeft = max(3, 19) = 19. minRight = min(8, 21) = 8. 19 > 8, so B's left side has values too large. Take more from A.
Step 2: partA=4, partB=2. A takes [1,3,8,9], B takes [7,11].
maxLeft = max(9, 11) = 11. minRight = min(15, 18) = 15. 11 <= 15. Valid partition! Median = max(9, 11) = 11.
Two steps. That is the power of binary search on partitions. Instead of merging 11 elements, we checked 2 partition positions and found the median.
The code
Here is the full Python solution for LeetCode 4, Median of Two Sorted Arrays:
def findMedianSortedArrays(nums1, nums2):
# Always binary search on the shorter array
if len(nums1) > len(nums2):
nums1, nums2 = nums2, nums1
m, n = len(nums1), len(nums2)
half = (m + n + 1) // 2
lo, hi = 0, m
while lo <= hi:
i = lo + (hi - lo) // 2 # partition index for nums1
j = half - i # partition index for nums2
# Edge values (use -inf / inf for out-of-bounds)
left1 = nums1[i - 1] if i > 0 else float('-inf')
right1 = nums1[i] if i < m else float('inf')
left2 = nums2[j - 1] if j > 0 else float('-inf')
right2 = nums2[j] if j < n else float('inf')
if left1 <= right2 and left2 <= right1:
# Valid partition found
if (m + n) % 2 == 1:
return max(left1, left2)
else:
return (max(left1, left2) + min(right1, right2)) / 2
elif left1 > right2:
# Took too many from nums1, move cut left
hi = i - 1
else:
# Took too few from nums1, move cut right
lo = i + 1
Let's walk through each piece:
- Swap to search the shorter array. We always binary search on the shorter array. This guarantees O(log(min(m,n))) and avoids
jgoing out of bounds. half = (m + n + 1) // 2is the number of elements in the left half. The+1handles both odd and even totals correctly.lo, hi = 0, msets the search range.ican range from 0 (take nothing from nums1) tom(take everything from nums1).- Edge values with -inf/inf. When
i = 0, there is nonums1[i-1], so we use negative infinity (it will never be the max). Wheni = m, there is nonums1[i], so we use positive infinity (it will never be the min). Same logic forj. - The partition check. If
left1 <= right2andleft2 <= right1, every element on the left is less than or equal to every element on the right. We found the answer. - Adjusting the search. If
left1 > right2, the last element we took from nums1 is too big, so we take fewer. Ifleft2 > right1, we need more from nums1.
Do not forget the swap at the top. If you binary search on the longer array, j = half - i can go negative when i is large, causing index errors. Always search the shorter one.
Why -inf and inf?
This trips people up. When i = 0, we are taking zero elements from nums1. That means the entire left half comes from nums2. There is no "last element from nums1 on the left side," so we set left1 = -inf. This makes the condition left1 <= right2 automatically true, which is correct because there is nothing from nums1 to violate it.
The same logic applies to right1 = inf when i = m (we took everything from nums1, so there is nothing on nums1's right side to be too small).
Think of it as sentinel values that always satisfy the partition condition for the side they represent.
Complexity analysis
Time: O(log(min(m,n))). We binary search over the shorter array. Each step cuts the search range in half. If the shorter array has k elements, we do at most log(k) iterations.
Space: O(1). A handful of variables. No extra arrays, no recursion.
This is optimal. You cannot find the median of two sorted arrays faster than O(log(min(m,n))) without merging them, and you cannot use less than O(1) extra space.
Common mistakes
1. Searching on the longer array. This causes j to go out of bounds. Always swap so you search the shorter one.
2. Getting the half calculation wrong. Use (m + n + 1) // 2, not (m + n) // 2. The +1 ensures the left half gets the extra element for odd totals, which simplifies the median calculation.
3. Forgetting edge cases with -inf/inf. When one array contributes zero elements to the left half, you need sentinel values or explicit bounds checks. Using -inf and inf is the cleanest approach.
4. Mixing up <= vs < in the partition check. The condition is left1 <= right2 and left2 <= right1. Equal values are fine because duplicates can appear on either side of the partition.
5. Off-by-one on the search range. lo starts at 0 (take nothing from nums1) and hi starts at m (take everything). This is different from standard binary search where hi = m - 1.
The Building Blocks
The median of two sorted arrays solution is built from one core building block:
Binary search on the answer. Instead of searching for a value in an array, you search over the space of possible answers (here, partition positions) and use a condition to decide which half to eliminate. This is the same technique used in:
- Koko Eating Bananas: binary search over possible eating speeds
- Capacity to Ship Packages: binary search over possible ship capacities
- Split Array Largest Sum: binary search over possible maximum subarray sums
- Find Minimum in Rotated Sorted Array: binary search with a modified comparison
The core loop is always the same. You define a search range, pick the midpoint, check a condition, and eliminate half. What changes is what you are searching over and what the condition looks like.
For this problem, the "answer" is the partition index i, and the condition is whether left1 <= right2 and left2 <= right1. Once you see it through that lens, it is just binary search with a fancier validity check.
Problems that use the same brick
Once you have binary search on partitions locked in, try these:
- Koko Eating Bananas: binary search on the answer space, same structure with a different feasibility check
- Find Minimum in Rotated Sorted Array: binary search where the "sorted" property is disrupted by rotation
- Search in Rotated Sorted Array: combining binary search with a rotation check
- Split Array Largest Sum: binary search on the answer with a greedy feasibility check
Why you will forget this (and how to fix it)
This problem has a lot of moving parts. The swap, the half calculation, the sentinel values, the partition check, the median extraction for odd vs even. You might understand it perfectly right now, but in three weeks you will sit down in an interview and hesitate. Was it (m + n + 1) // 2 or (m + n) // 2? Do I use lo <= hi or lo < hi? Which direction do I move when left1 > right2?
Every one of those micro-decisions is trivial in isolation. But under pressure, with six of them stacked on top of each other, doubt creeps in. You write the wrong sentinel value, get a wrong answer on the test case, and spend ten minutes debugging.
Spaced repetition fixes this. You type the solution from scratch today. Then tomorrow. Then in three days. Each repetition strengthens the muscle memory until the code flows out automatically. When you sit down in the interview, your fingers write left1 = nums1[i-1] if i > 0 else float('-inf') without thinking, and you save your brainpower for explaining why the approach works.
Related posts
- Binary Search: More Than Finding a Number - The full guide to binary search patterns, including left-bound/right-bound variants and searching on the answer space
- Binary Search (LeetCode 704): The Foundation - The basic binary search template that this problem builds on
Visual Learner? This problem relies on binary search — master the fundamentals with our Binary Search Explained visual guide.