Check If a String Contains All Binary Codes of Size K: Sliding Window + HashSet
LeetCode 1461, Check If a String Contains All Binary Codes of Size K, gives you a binary string s and an integer k. Your job is to return true if every binary code of length k appears as a substring of s, and false otherwise.
For example, given s = "00110110" and k = 2, you need to check whether all binary codes of length 2 ("00", "01", "10", "11") appear somewhere in the string. They do, so the answer is true.
This problem looks like it could be tricky, but the solution falls right out once you count how many distinct codes exist and realize a sliding window can extract them all.
Check out the problem on LeetCode.
Why this problem matters
This problem sits at the intersection of three patterns: sliding windows, hash sets, and binary enumeration. You need to recognize that "every binary code of length k" is just 2^k distinct strings, and that a sliding window of size k naturally generates every substring of that length.
It is also a great exercise in early termination. Once your set reaches size 2^k, you can stop immediately. There is no need to scan the rest of the string. This kind of optimization comes up constantly in interview problems where the brute force is correct but wasteful.
Finally, this problem opens the door to a bit manipulation optimization. Instead of storing substrings as strings, you can maintain a rolling integer and use bitwise operations to track the window. That shifts the per-step cost from O(k) to O(1), which matters when k is large.
The key insight
There are exactly 2^k binary codes of length k. For k = 2, that is 4 codes: "00", "01", "10", "11". For k = 3, that is 8 codes: "000" through "111".
Slide a window of size k across s, extract the substring at each position, and add it to a set. If the set reaches size 2^k, you know every code is present and you can return true. If you reach the end of the string without hitting that count, return false.
The window produces n - k + 1 substrings total (where n is the length of s). For the answer to be true, you need at least 2^k distinct ones. If n - k + 1 is less than 2^k, you can return false immediately without scanning at all.
The solution
def has_all_codes(s, k):
need = 1 << k
if len(s) - k + 1 < need:
return False
seen = set()
for i in range(len(s) - k + 1):
seen.add(s[i:i + k])
if len(seen) == need:
return True
return False
The function starts by computing need = 2^k using the bit shift 1 << k. Then it does a quick length check: if the string does not have enough positions to even produce 2^k substrings, return false right away.
The main loop slides a window of size k across the string. At each position i, it extracts the substring s[i:i+k] and adds it to the set. Python's set handles deduplication automatically, so duplicates are ignored.
The early exit check if len(seen) == need lets us return true as soon as we have collected all codes. Without this, the solution would still be correct, but it would do unnecessary work on strings where all codes appear early.
For large values of k, string slicing at every step costs O(k) per operation. You can bring that down to O(1) per step by using a rolling hash or bit manipulation. Maintain an integer that represents the current window: shift it left by 1, OR in the new bit, and mask off the high bit. This avoids creating a new string at every position.
Visual walkthrough
Let's trace through s = "00110110" with k = 2. At each step, the window extracts a 2-character substring and adds it to the set. We need 4 distinct codes to return true.
Step 1: Window at index 0. Extract "00". Add to set.
Set has 1 of 4 required codes.
Step 2: Window at index 1. Extract "01". Add to set.
Set has 2 of 4 required codes.
Step 3: Window at index 2. Extract "11". Add to set.
Set has 3 of 4 required codes.
Step 4: Window at index 3. Extract "10". Add to set.
Set has 4 of 4 required codes. All binary codes of length 2 found!
Step 5: Set is full. We can stop early and return true.
Remaining windows would add duplicates. The answer is true.
By step 4, the set contains all four binary codes of length 2. We can return true without examining the remaining positions. The key observation is that each window position gives us exactly one substring, and the set grows until it either reaches 2^k or we run out of positions.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Sliding Window + HashSet | O(n * k) | O(2^k * k) |
| Rolling Hash | O(n) | O(2^k) |
Sliding Window + HashSet: The loop runs n - k + 1 times. At each step, extracting the substring costs O(k) (creating a new string of length k). Adding it to the set also involves hashing the string, which costs O(k). So the total time is O(n * k). The set holds at most 2^k strings, each of length k, so the space is O(2^k * k).
Rolling Hash / Bit Manipulation: By maintaining the window as an integer, each step costs O(1) for the shift, OR, and mask operations. The loop still runs O(n) times, but each iteration is constant work. The set stores integers instead of strings, so space drops to O(2^k). This is the preferred approach when k can be up to 20 or more.
The building blocks
1. Sliding window substring extraction
The core mechanic is sliding a fixed-size window across the string and extracting the substring at each position. This is the simplest form of sliding window: no expansion or shrinking, just a fixed-width frame moving one step at a time.
for i in range(len(s) - k + 1):
window = s[i:i + k]
This produces every contiguous substring of length k. The number of such substrings is n - k + 1. This same extraction pattern appears in problems like Find All Anagrams, Minimum Window Substring (for fixed-length checks), and any problem that asks about substrings of a specific length.
2. Set-based completeness check
Once you know the total number of distinct values you need, a set gives you an efficient way to track progress. Add each new value, and check the set size against the target.
seen = set()
for item in items:
seen.add(item)
if len(seen) == target:
return True
return False
This pattern of "collect into a set and check completeness" shows up in many problems. The set handles deduplication for free, and checking its size is O(1). The early exit avoids unnecessary work once the goal is met.
Edge cases
klarger thanlen(s): No window of sizekfits, so no substrings exist. Returnfalse.k = 1: Only two codes ("0"and"1"). Returntrueif the string contains both characters.2^kexceedsn - k + 1: Not enough positions to produce all codes. Returnfalseimmediately (the length check handles this).- String is all zeros: Only the code
"000...0"appears. Returnfalsefor anyk > 0(except whenk = 0, but the problem guaranteesk >= 1). - String is very long: The early exit ensures you stop as soon as all codes are found, so long strings with early coverage are fast.
From understanding to recall
Reading this solution and understanding the logic is the first step. The harder part is being able to reproduce it under pressure. The key relationships to internalize are: 2^k is the target count, the sliding window generates all substrings of length k, and the set tracks uniqueness.
Spaced repetition helps here because the individual building blocks (fixed-size sliding window, set-based completeness check) are reusable across many problems. Once you can produce them from memory, combining them for this problem becomes a matter of assembly rather than invention. You are not memorizing one solution. You are drilling two patterns that snap together.
Related posts
- Longest Substring Without Repeating Characters - Another sliding window + set problem, but with a variable-size window
- Find All Anagrams in a String - Fixed-size sliding window with frequency counting instead of a set
- Permutation in String - Fixed-size window checking for a specific character distribution
The gap between understanding a solution and producing it on demand is where most interview prep falls short. Spaced repetition closes that gap by revisiting the sliding window extraction and set completeness patterns at increasing intervals, until they are automatic rather than something you need to rederive each time.