Skip to content
← All posts

Check If a String Contains All Binary Codes of Size K: Sliding Window + HashSet

6 min read
leetcodeproblemmediumstringshash-mapbit-manipulation

LeetCode 1461, Check If a String Contains All Binary Codes of Size K, gives you a binary string s and an integer k. Your job is to return true if every binary code of length k appears as a substring of s, and false otherwise.

For example, given s = "00110110" and k = 2, you need to check whether all binary codes of length 2 ("00", "01", "10", "11") appear somewhere in the string. They do, so the answer is true.

This problem looks like it could be tricky, but the solution falls right out once you count how many distinct codes exist and realize a sliding window can extract them all.

Check out the problem on LeetCode.

0123456700110110window (k=2)"11"seen set000110114 / 4 codesAll 4 codes found. Return true.
Slide a window of size k=2 across "00110110". Each position extracts a binary substring, which gets added to a set. If the set reaches 2^k = 4 entries, every binary code of length k exists in the string.

Why this problem matters

This problem sits at the intersection of three patterns: sliding windows, hash sets, and binary enumeration. You need to recognize that "every binary code of length k" is just 2^k distinct strings, and that a sliding window of size k naturally generates every substring of that length.

It is also a great exercise in early termination. Once your set reaches size 2^k, you can stop immediately. There is no need to scan the rest of the string. This kind of optimization comes up constantly in interview problems where the brute force is correct but wasteful.

Finally, this problem opens the door to a bit manipulation optimization. Instead of storing substrings as strings, you can maintain a rolling integer and use bitwise operations to track the window. That shifts the per-step cost from O(k) to O(1), which matters when k is large.

The key insight

There are exactly 2^k binary codes of length k. For k = 2, that is 4 codes: "00", "01", "10", "11". For k = 3, that is 8 codes: "000" through "111".

Slide a window of size k across s, extract the substring at each position, and add it to a set. If the set reaches size 2^k, you know every code is present and you can return true. If you reach the end of the string without hitting that count, return false.

The window produces n - k + 1 substrings total (where n is the length of s). For the answer to be true, you need at least 2^k distinct ones. If n - k + 1 is less than 2^k, you can return false immediately without scanning at all.

The solution

def has_all_codes(s, k):
    need = 1 << k
    if len(s) - k + 1 < need:
        return False

    seen = set()
    for i in range(len(s) - k + 1):
        seen.add(s[i:i + k])
        if len(seen) == need:
            return True

    return False

The function starts by computing need = 2^k using the bit shift 1 << k. Then it does a quick length check: if the string does not have enough positions to even produce 2^k substrings, return false right away.

The main loop slides a window of size k across the string. At each position i, it extracts the substring s[i:i+k] and adds it to the set. Python's set handles deduplication automatically, so duplicates are ignored.

The early exit check if len(seen) == need lets us return true as soon as we have collected all codes. Without this, the solution would still be correct, but it would do unnecessary work on strings where all codes appear early.

For large values of k, string slicing at every step costs O(k) per operation. You can bring that down to O(1) per step by using a rolling hash or bit manipulation. Maintain an integer that represents the current window: shift it left by 1, OR in the new bit, and mask off the high bit. This avoids creating a new string at every position.

Visual walkthrough

Let's trace through s = "00110110" with k = 2. At each step, the window extracts a 2-character substring and adds it to the set. We need 4 distinct codes to return true.

Step 1: Window at index 0. Extract "00". Add to set.

00110110"00"seen (1/4)00

Set has 1 of 4 required codes.

Step 2: Window at index 1. Extract "01". Add to set.

00110110"01"seen (2/4)0001

Set has 2 of 4 required codes.

Step 3: Window at index 2. Extract "11". Add to set.

00110110"11"seen (3/4)000111

Set has 3 of 4 required codes.

Step 4: Window at index 3. Extract "10". Add to set.

00110110"10"seen (4/4)00011110

Set has 4 of 4 required codes. All binary codes of length 2 found!

Step 5: Set is full. We can stop early and return true.

00110110"01"seen (4/4)00011110

Remaining windows would add duplicates. The answer is true.

By step 4, the set contains all four binary codes of length 2. We can return true without examining the remaining positions. The key observation is that each window position gives us exactly one substring, and the set grows until it either reaches 2^k or we run out of positions.

Complexity analysis

ApproachTimeSpace
Sliding Window + HashSetO(n * k)O(2^k * k)
Rolling HashO(n)O(2^k)

Sliding Window + HashSet: The loop runs n - k + 1 times. At each step, extracting the substring costs O(k) (creating a new string of length k). Adding it to the set also involves hashing the string, which costs O(k). So the total time is O(n * k). The set holds at most 2^k strings, each of length k, so the space is O(2^k * k).

Rolling Hash / Bit Manipulation: By maintaining the window as an integer, each step costs O(1) for the shift, OR, and mask operations. The loop still runs O(n) times, but each iteration is constant work. The set stores integers instead of strings, so space drops to O(2^k). This is the preferred approach when k can be up to 20 or more.

The building blocks

1. Sliding window substring extraction

The core mechanic is sliding a fixed-size window across the string and extracting the substring at each position. This is the simplest form of sliding window: no expansion or shrinking, just a fixed-width frame moving one step at a time.

for i in range(len(s) - k + 1):
    window = s[i:i + k]

This produces every contiguous substring of length k. The number of such substrings is n - k + 1. This same extraction pattern appears in problems like Find All Anagrams, Minimum Window Substring (for fixed-length checks), and any problem that asks about substrings of a specific length.

2. Set-based completeness check

Once you know the total number of distinct values you need, a set gives you an efficient way to track progress. Add each new value, and check the set size against the target.

seen = set()
for item in items:
    seen.add(item)
    if len(seen) == target:
        return True
return False

This pattern of "collect into a set and check completeness" shows up in many problems. The set handles deduplication for free, and checking its size is O(1). The early exit avoids unnecessary work once the goal is met.

Edge cases

  • k larger than len(s): No window of size k fits, so no substrings exist. Return false.
  • k = 1: Only two codes ("0" and "1"). Return true if the string contains both characters.
  • 2^k exceeds n - k + 1: Not enough positions to produce all codes. Return false immediately (the length check handles this).
  • String is all zeros: Only the code "000...0" appears. Return false for any k > 0 (except when k = 0, but the problem guarantees k >= 1).
  • String is very long: The early exit ensures you stop as soon as all codes are found, so long strings with early coverage are fast.

From understanding to recall

Reading this solution and understanding the logic is the first step. The harder part is being able to reproduce it under pressure. The key relationships to internalize are: 2^k is the target count, the sliding window generates all substrings of length k, and the set tracks uniqueness.

Spaced repetition helps here because the individual building blocks (fixed-size sliding window, set-based completeness check) are reusable across many problems. Once you can produce them from memory, combining them for this problem becomes a matter of assembly rather than invention. You are not memorizing one solution. You are drilling two patterns that snap together.

Related posts

The gap between understanding a solution and producing it on demand is where most interview prep falls short. Spaced repetition closes that gap by revisiting the sliding window extraction and set completeness patterns at increasing intervals, until they are automatic rather than something you need to rederive each time.