Find Unique Binary String: Cantor's Diagonal Trick
LeetCode 1980, Find Unique Binary String, asks you to find any binary string of length n that does not appear in a given array of n unique binary strings, each of length n.
For example, given nums = ["01", "10"], the answer could be "11", "00", or any other 2-bit string not already in the array.
The brute-force temptation is to enumerate all 2^n possible binary strings and check which one is missing. That works, but there is a much cleaner O(n) approach hiding in plain sight: Cantor's diagonalization.
The approach: Cantor's diagonalization
The key insight comes from a technique Georg Cantor used in 1891 to prove that the real numbers are uncountable. The idea is surprisingly simple when applied here:
- Look at
nums[0][0](the first character of the first string). Flip it. - Look at
nums[1][1](the second character of the second string). Flip it. - Continue for every
i: takenums[i][i]and flip it. - Concatenate all the flipped bits. The resulting string is your answer.
Why does this always work? The result string differs from nums[0] at position 0, from nums[1] at position 1, from nums[2] at position 2, and so on. It is guaranteed to differ from every string in the input at (at least) one position. So it cannot be equal to any of them.
You do not need to check whether the result is already in the array. The diagonal construction guarantees it is not. That is the beauty of this technique.
The code
def findDifferentBinaryString(nums):
result = []
for i in range(len(nums)):
result.append("0" if nums[i][i] == "1" else "1")
return "".join(result)
Walk through what happens:
- Loop through indices
0ton - 1. - At each index
i, read the diagonal characternums[i][i]. - Flip it: if it is
"1", append"0". If it is"0", append"1". - Join all the flipped characters into a single string and return it.
That is the entire solution. No sets, no recursion, no backtracking needed.
Visual walkthrough
Let's trace through nums = ["01", "10"] step by step, watching how the diagonal bits get flipped to produce a string that cannot appear in the input.
Step 1: Look at nums[0][0]
nums[0] = "01". The character at index 0 is "0". Flip it to "1". Our result starts with "1".
Step 2: Look at nums[1][1]
nums[1] = "10". The character at index 1 is "0". Flip it to "1". Our result is now "11".
Result: "11"
"11" differs from "01" at position 0 and from "10" at position 1. It cannot match any input string.
Each step picks one diagonal cell and flips its value. By the end, the result string differs from every input string at the position along the diagonal. This is what makes the proof airtight.
Complexity analysis
| Metric | Value |
|---|---|
| Time | O(n), one pass through the diagonal |
| Space | O(n), for storing the result string |
You touch exactly n characters (one per input string), and each operation is O(1). No sorting, no hashing, no search.
Building blocks
This problem is built on one elegant mathematical idea and one common string technique.
Cantor's diagonalization
The diagonal argument is a proof technique from set theory. In this problem, it gives you a constructive way to build a missing element. Whenever you have n items and need to find one that is different from all of them, consider whether you can "disagree" with each item at a distinct position.
You will see diagonal-style reasoning in problems that involve:
- Subsets (LeetCode 78), where each element has a binary include/exclude decision.
- Letter Combinations of a Phone Number (LeetCode 17), where you construct strings character by character.
Bit flipping
Flipping a binary character is a micro-operation, but it is the engine that powers the guarantee. The flip ensures that your constructed string cannot match the source string at the chosen position. This same "disagree at one position" idea appears in problems involving Hamming distance and error-correcting codes.
Edge cases
n = 1. The input contains one string, either "0" or "1". The diagonal approach reads nums[0][0] and flips it, returning the other option. This always works because there are two possible 1-bit strings and only one is taken.
All strings are identical except the diagonal. No special handling needed. The diagonal approach does not care about the overall structure of the strings. It only looks at the diagonal cells.
Maximum n = 16. The input contains 16 strings of length 16. There are 65,536 possible strings but only 16 are given. The diagonal trick still produces one valid answer in O(16) time.
Strings that are already sorted or follow a pattern. The algorithm is agnostic to ordering. It never compares strings against each other; it only reads one character per string.
From understanding to recall
The diagonal trick is easy to follow when you see it explained, but it is also easy to forget under pressure. The detail that trips people up is remembering that you read nums[i][i], not nums[i][0] or some other fixed column. The "diagonal" part is the entire insight.
Spaced repetition helps you internalize the pattern so it becomes automatic. You drill the idea of "disagree with string i at position i" a few times at increasing intervals, and it sticks. The next time you see a problem about finding a missing element from a structured set, the diagonal approach will surface on its own.
Related posts
- Subsets - Another problem where binary choices (include or exclude) generate all possibilities
- Letter Combinations of a Phone Number - Building strings character by character through structured enumeration
- Splitting a String Into Descending Consecutive Values - A medium string and backtracking problem exploring string partitions