Skip to content
← All posts

Minimum Window Substring: Advanced Sliding Window

9 min read
leetcodeproblemhardsliding-window

LeetCode's Minimum Window Substring (LeetCode 76) is one of the hardest sliding window problems you will encounter. It combines the expand-and-shrink window mechanics you already know with frequency counting and a clever validity trick that keeps everything running in O(n).

Given two strings s and t, find the minimum window substring of s that contains every character of t (including duplicates). If no such window exists, return an empty string.

For example, given s = "ADOBECODEBANC" and t = "ABC", the answer is "BANC".

If you have worked through the sliding window pattern and Longest Substring Without Repeating Characters, you already have the structural foundation. This problem layers on top of those fundamentals with one big addition: instead of tracking a set, you are tracking character frequencies and asking "does my window contain at least the required count of every character in t?"

A0D1O2B3E4C5O6D7E8B9A10N11C12leftrightA:1/1 B:1/1 C:1/1
Window "ADOBEC" contains all of A, B, C. Now we shrink from the left to find a shorter valid window.

Why this is a sliding window hard problem

The brute force approach checks every possible substring of s and verifies whether it contains all characters of t. That means two nested loops to pick the substring plus a frequency scan to check validity. For a string of length n, that is O(n^2) substrings, each taking O(n) to verify. Total: O(n^3).

def min_window_brute(s, t):
    from collections import Counter
    t_count = Counter(t)
    best = ""

    for i in range(len(s)):
        for j in range(i + len(t), len(s) + 1):
            window = Counter(s[i:j])
            if all(window[c] >= t_count[c] for c in t_count):
                if best == "" or j - i < len(best):
                    best = s[i:j]
    return best

This works but is obviously way too slow. The sliding window brings it down to O(n) by doing two things: (1) never backtracking the pointers, and (2) updating the frequency counts incrementally instead of rebuilding them from scratch.

The tricky part is checking window validity efficiently. Naively, you would loop through every character in t on each step and check if your window has enough. That inner loop makes things slower than they need to be. The "formed" counter trick eliminates it entirely.

The "formed" counter trick

Here is the core idea. You maintain two things:

  • window_counts: a dict mapping each character in the current window to its count.
  • formed: an integer that tracks how many unique characters from t have reached their required frequency in the window.

When formed equals the number of unique characters in t, the window is valid. You do not need to scan through every character to check. You just compare two integers.

How does formed stay accurate? Every time you add a character to the window, check if it is a character from t and if its window count just hit the required count. If so, increment formed. When you remove a character from the window and its count drops below the required count, decrement formed.

This is the key insight that makes the minimum window substring solvable in O(n). Each character addition or removal does O(1) work to maintain formed, and the validity check is a single comparison.

The formed counter is a general technique. Anytime you need to check if multiple conditions are all satisfied, track the count of satisfied conditions instead of rechecking each one. This same idea appears in Permutation in String and Find All Anagrams in a String.

Walking through it step by step

Let's trace through s = "ADOBECODEBANC" with t = "ABC". We need A:1, B:1, C:1. The formed counter starts at 0 and needs to reach 3 (three unique characters in t).

Step 1: Expand right to index 0 ('A'). A matches a needed char.

ADOBECODEBANCleftrightA:1/1 B:0/1 C:0/1formed = 1 / 3

Have A (1/1). Still missing B and C.

Step 2: Expand to index 3 ('B'). Found B.

ADOBECODEBANCleftrightA:1/1 B:1/1 C:0/1formed = 2 / 3

Have A and B. Still need C.

Step 3: Expand to index 5 ('C'). All characters found! Window = 'ADOBEC', length 6.

ADOBECODEBANCleftrightA:1/1 B:1/1 C:1/1formed = 3 / 3

formed == required! Record best = 6. Now shrink.

Step 4: Shrink left. Remove 'A' at index 0. A count drops to 0. Window invalid.

ADOBECODEBANCleftrightA:0/1 B:1/1 C:1/1formed = 2 / 3

Lost A. Stop shrinking, resume expanding.

Step 5: Expand to index 10 ('A'). All three found again! Window = 'DOBECODEBA', length 10.

ADOBECODEBANCleftrightA:1/1 B:2/1 C:1/1formed = 3 / 3

Valid but longer. best is still 6. Shrink.

Step 6: Keep shrinking. D, O not needed. Remove B at index 3: B count 2 to 1, still valid.

ADOBECODEBANCleftrightA:1/1 B:1/1 C:1/1formed = 3 / 3

Still valid at length 7. Keep shrinking.

Step 7: Remove 'E' at 4, then 'C' at 5. C drops to 0. Invalid at left = 6.

ADOBECODEBANCleftrightA:1/1 B:1/1 C:0/1formed = 2 / 3

Best before losing C was 'ECODEBA' = 7. Still best = 6.

Step 8: Expand to index 12 ('C'). Valid again! Window = 'ODEBANC', length 7.

ADOBECODEBANCleftrightA:1/1 B:1/1 C:1/1formed = 3 / 3

Valid! length = 7. best still 6. Shrink again.

Step 9: Shrink past O, D, E. Remove B at 9. Still valid! Window = 'BANC', length 4.

ADOBECODEBANCleftrightA:1/1 B:1/1 C:1/1formed = 3 / 3

best = 4! 'BANC' is shorter than 'ADOBEC'.

Step 10: Shrink once more. Remove B at 9. B drops to 0. Done.

ADOBECODEBANCleftrightA:1/1 B:0/1 C:1/1formed = 2 / 3

Invalid. Right is at end of string. Final answer: 'BANC', length 4.

The answer is "BANC" with length 4. Notice the expand-then-shrink rhythm: we expand right until the window is valid, then shrink left as much as possible while recording every valid window. This is the "shrinking window" variant of sliding window, the same one used for Minimum Size Subarray Sum.

The code

Here is the complete solution in Python:

def min_window(s, t):
    from collections import Counter

    if not s or not t or len(s) < len(t):
        return ""

    t_count = Counter(t)
    required = len(t_count)  # unique chars in t

    window_counts = {}
    formed = 0  # how many unique chars have met their required count

    best = (float("inf"), 0, 0)  # (length, left, right)
    left = 0

    for right in range(len(s)):
        # Expand: add s[right] to window
        ch = s[right]
        window_counts[ch] = window_counts.get(ch, 0) + 1

        # If this char is in t and we just hit the required count, bump formed
        if ch in t_count and window_counts[ch] == t_count[ch]:
            formed += 1

        # Shrink: while the window is valid, try to make it smaller
        while formed == required:
            # Record this valid window if it is the best so far
            if right - left + 1 < best[0]:
                best = (right - left + 1, left, right)

            # Remove s[left] from window
            leaving = s[left]
            window_counts[leaving] -= 1

            # If this char is in t and we just dropped below required, decrement formed
            if leaving in t_count and window_counts[leaving] < t_count[leaving]:
                formed -= 1

            left += 1

    return "" if best[0] == float("inf") else s[best[1]:best[2] + 1]

Let's walk through the important parts:

  1. t_count is a Counter that tells us how many of each character we need. For t = "ABC", that is {A: 1, B: 1, C: 1}.
  2. required is the number of unique characters in t. We need formed to match this number.
  3. window_counts tracks character frequencies in the current window, updated incrementally.
  4. When we add s[right] and its count reaches t_count[ch], we have enough of that character, so formed goes up.
  5. The while formed == required loop is the shrinking phase. We keep shrinking from the left while the window is valid, recording the best length each time.
  6. When removing s[left] causes its count to drop below what is needed, formed goes down and the window becomes invalid. The while loop exits and we go back to expanding.

The while loop inside the for loop looks like O(n^2) but is not. The left pointer only moves forward and moves at most n times total. So both the expanding and shrinking do O(n) work combined. Total: O(n).

Why == and not >= for the formed check

A subtle detail: when we check window_counts[ch] == t_count[ch], we use ==, not >=. Why? Because we only want to increment formed the moment we reach the required count, not every time we exceed it. If we used >=, we would increment formed multiple times for the same character whenever duplicates keep arriving.

Similarly, when shrinking, we check window_counts[leaving] < t_count[leaving]. We decrement formed only when the count drops below the requirement, not every time it decreases. This keeps formed accurate as a count of satisfied characters.

Complexity analysis

Time: O(n + m) where n is the length of s and m is the length of t. Building t_count takes O(m). The sliding window does at most 2n work (each character is added once by right and removed at most once by left). Each addition and removal does O(1) work.

Space: O(n + m). t_count holds at most m entries. window_counts holds at most n entries (though in practice it is bounded by the alphabet size). The output string takes at most O(n) space.

ApproachTimeSpace
Brute force (all substrings)O(n^2 * m)O(n + m)
Sliding window + formed counterO(n + m)O(n + m)

The sliding window solution is optimal. You cannot do better than O(n) because you need to look at every character in s at least once.

Handling duplicate characters in t

One thing that trips people up: t can have duplicate characters. For example, t = "AABC" means you need at least two A's, one B, and one C. The Counter handles this naturally since t_count will be {A: 2, B: 1, C: 1} and required will be 3 (three unique characters). The formed counter increments for A only when window_counts[A] reaches 2, not 1.

This is exactly why we use a frequency counter instead of a simple set. A set would tell you that A is present but not whether you have enough of them.

Common mistakes

1. Checking validity by scanning t_count on every step. The whole point of the formed counter is to avoid this O(m) check. If you loop through t_count to verify the window on each iteration, your solution becomes O(n * m) instead of O(n + m). Use formed == required instead.

2. Incrementing formed on every character match. You should only bump formed when a character's window count exactly equals its required count. If you increment it every time you add a matching character, formed will be wrong and your shrinking logic will break.

3. Forgetting to handle the case where t is longer than s. If len(t) > len(s), no valid window exists. Return "" immediately.

4. Not decrementing formed when shrinking. When left moves past a required character and its count drops below the target, you must decrement formed. Missing this means your window will keep shrinking past the point of validity.

5. Using the wrong window variant. This is a minimum window problem, so you shrink while valid. If you shrink while invalid (the maximum window variant), you will get wrong answers. This is a surprisingly common mistake if you have been solving "longest substring" problems right before this one.

The building blocks

This problem decomposes into three reusable pieces that show up again and again in sliding window hard problems:

1. Sliding window mechanics

The two-pointer skeleton where right always advances and left advances only during the shrinking phase. Both pointers move monotonically forward, guaranteeing O(n) total work. This is the exact same structure used in Minimum Size Subarray Sum and Longest Substring Without Repeating Characters.

left = 0
for right in range(len(s)):
    # expand: process s[right]
    while window_is_valid():
        # record best
        # shrink: process s[left], left += 1

2. Window validity condition

The logic that decides whether the current window satisfies the problem's constraint. Here, the condition is "the window contains at least the required frequency of every character in t." The formed == required check encapsulates this. In other problems, the condition might be "sum >= target" or "no duplicates." The sliding window skeleton stays the same; the validity condition plugs in.

3. Frequency counter

The data structure that tracks character frequencies inside the window, updated incrementally on each expand and shrink. This is the same building block used in Group Anagrams, Valid Anagram, and Permutation in String. The pattern is always the same: increment when expanding, decrement when shrinking, and use the counts to drive your validity check.

# Expand
window_counts[ch] = window_counts.get(ch, 0) + 1

# Shrink
window_counts[leaving] -= 1

These three blocks snap together to solve Minimum Window Substring, and they recombine in different configurations for other hard problems. If you can write each one from memory, you can assemble the full solution for any sliding window variant without memorizing each problem individually.

Edge cases

Before submitting, verify your solution handles:

  • t longer than s: return "". No valid window can exist.
  • s equals t: the entire string is the minimum window.
  • All characters the same: s = "aaaa", t = "aa". The minimum window is "aa" (length 2).
  • No valid window: s = "abc", t = "d". Return "".
  • Single character: s = "a", t = "a". Return "a".

Related posts

Understanding this solution is step one. Reproducing it from scratch under time pressure in a week is the actual challenge. The formed counter trick, the == vs >= detail, the shrinking-while-valid structure: these are the kinds of things you forget unless you actively practice them. Spaced repetition locks them in by having you write the code at increasing intervals until the pattern is automatic.