Skip to content
← All posts

Number of Unique Good Subsequences: DP Without Leading Zeros

7 min read
leetcodeproblemhardstringsdynamic-programming

You are given a binary string binary. A subsequence of binary is "good" if it is non-empty, has no leading zeros (except for the subsequence "0" itself), and is unique. Return the number of unique good subsequences modulo 10^9 + 7.

This is LeetCode 1987: Number of Unique Good Subsequences. For example, binary = "101" returns 5. The five unique good subsequences are "0", "1", "10", "11", and "101". Notice that "01" is not included because it has a leading zero.

If you have solved Distinct Subsequences or Decode Ways, you already know how to count subsequences with DP. The twist here is the leading-zero constraint: you cannot just count all subsequences. You need a way to guarantee every counted subsequence starts with "1" (or is exactly "0").

init101ends1ends0has_zerototal00no010no111yes331yes5Updated this stepFinal answer
DP trace for binary="101". Track subsequences ending in 1, ending in 0, and whether "0" itself exists. Answer: 3 + 1 + 1 = 5.

Why this problem matters

Counting subsequences is a classic DP skill, but most problems let you count freely without worrying about what the subsequence looks like. This problem adds a structural constraint: no leading zeros. That single rule forces you to rethink how you build up the count.

The trick is to split your DP into two variables: how many unique subsequences end in "0", and how many end in "1". By tracking endings instead of beginnings, you can extend subsequences safely. Any subsequence that currently ends in "1" must have started with "1" (since it was built from prior valid subsequences). When you see a "0" in the input, you extend existing valid subsequences by appending "0", which preserves the no-leading-zero property. The only special case is the subsequence "0" by itself, which you handle with a boolean flag.

This pattern of splitting DP state by the last character appears in many counting problems on binary strings and sequences with restricted prefixes.

The approach

Define two variables:

  • ends1: number of unique good subsequences that end with "1"
  • ends0: number of unique good subsequences that end with "0" (excluding the standalone "0")
  • has_zero: a flag for whether "0" appears in binary (so the standalone subsequence "0" counts)

Process each character in binary left to right:

  • If the character is "1": every existing subsequence (whether it ends in "0" or "1") can be extended by appending "1". Plus, you can start a brand new subsequence "1". So ends1 = ends0 + ends1 + 1.
  • If the character is "0": every existing subsequence can be extended by appending "0". But you cannot start a new subsequence with "0" (that would be a leading zero). Instead, you note that the standalone "0" exists by setting has_zero = true. So ends0 = ends0 + ends1.

The answer is ends0 + ends1 + has_zero, all taken modulo 10^9 + 7.

The key insight: when you see a "0", do not add 1 for a new subsequence starting with "0". That would create a leading zero. Instead, track "0" as a special case with a boolean flag. All other subsequences ending in "0" were built by extending valid subsequences that already started with "1".

Why uniqueness is handled automatically

You might wonder: how does this DP avoid counting duplicate subsequences? The answer is that the recurrence replaces ends0 and ends1 rather than adding to a running total. When you process a "1", the new ends1 is the full set of unique subsequences that end in "1" given all characters seen so far. It does not add to the old ends1. It overwrites it. This is the same technique used in problems like "number of distinct subsequences of a string," where the key to avoiding duplicates is to redefine the count at each step rather than accumulate.

Python solution

def numberOfUniqueGoodSubsequences(binary):
    MOD = 10**9 + 7
    ends0 = 0
    ends1 = 0
    has_zero = 0

    for ch in binary:
        if ch == "1":
            ends1 = (ends0 + ends1 + 1) % MOD
        else:
            ends0 = (ends0 + ends1) % MOD
            has_zero = 1

    return (ends0 + ends1 + has_zero) % MOD

Time: O(n). One pass through the string with constant work per character.

Space: O(1). Only three variables, regardless of input size.

Step-by-step walkthrough

Let's trace through binary = "101" character by character.

Initialize: ends0 = 0, ends1 = 0, has_zero = false

binary:101ends1:0ends0:0has_zero:0total:0 + 0 + 0 = 0{}

Before processing any character, there are no subsequences yet.

Step 1: Process char "1"

binary:101ends1:1ends0:0has_zero:0total:1 + 0 + 0 = 1{"1"}

Char is "1": ends1 = ends0 + ends1 + 1 = 0 + 0 + 1 = 1. We can start a new subsequence "1". ends0 stays 0.

Step 2: Process char "0"

binary:101ends1:1ends0:1has_zero:1total:1 + 1 + 1 = 3{"1", "10", "0"}

Char is "0": ends0 = ends0 + ends1 = 0 + 1 = 1. We extend "1" to "10". We do NOT start a new subsequence "0" in ends0 (leading zero), but we set has_zero = true to count it separately.

Step 3: Process char "1"

binary:101ends1:3ends0:1has_zero:1total:3 + 1 + 1 = 5{"1", "10", "0", "11", "101"}

Char is "1": ends1 = ends0 + ends1 + 1 = 1 + 1 + 1 = 3. Existing subsequences "1" and "10" extend to "11" and "101", plus a new "1" (but "1" already exists, so uniqueness is handled by the DP). Final answer: 3 + 1 + 1 = 5.

The final answer is ends1 + ends0 + has_zero = 3 + 1 + 1 = 5.

Walking through the transitions

A few transitions deserve closer attention because they reveal how the recurrence prevents leading zeros and duplicates.

Step 1 (char "1"), ends1 goes from 0 to 1. This is the simplest case. There are no prior subsequences. The + 1 in the formula ends1 = ends0 + ends1 + 1 starts the very first subsequence: "1". The + 1 only appears when the character is "1", never when it is "0". That is what prevents leading zeros.

Step 2 (char "0"), ends0 goes from 0 to 1. The formula ends0 = ends0 + ends1 takes every subsequence ending in "1" (just "1") and extends it with "0" to get "10". There is no + 1 here. You do not create a standalone "0" through ends0. Instead, has_zero flips to 1. This separation is the heart of the algorithm.

Step 3 (char "1"), ends1 goes from 1 to 3. Now ends0 = 1 and ends1 = 1. The formula gives ends1 = 1 + 1 + 1 = 3. Those three subsequences ending in "1" are: the original "1" extended from prior state, "10" extended to "101", and a brand new "1". But wait, "1" already exists. The overwrite ensures we count "1" only once. The three unique subsequences ending in "1" are "1", "11", and "101".

Complexity analysis

ApproachTimeSpace
Brute force (enumerate all 2^n subsequences)O(2^n)O(2^n)
DP with ends0 and ends1O(n)O(1)

Where n = length of binary. The DP approach is optimal in both time and space.

Building blocks

This problem is built on the subsequence counting DP pattern with state splitting by last character.

  • State: two counters, ends0 and ends1, plus a flag has_zero
  • Transition (char "1"): ends1 = ends0 + ends1 + 1 (extend all existing + start new)
  • Transition (char "0"): ends0 = ends0 + ends1 (extend all existing, no new start)
  • Base case: ends0 = 0, ends1 = 0, has_zero = 0
  • Answer: ends0 + ends1 + has_zero

The overwrite pattern (replacing ends1 instead of adding to it) is what handles uniqueness. This same technique appears in problems that count distinct subsequences of a single string.

Edge cases to watch for

  • All zeros (e.g., "000"): ends1 stays 0 the entire time. ends0 also stays 0 because there are no subsequences to extend. has_zero becomes 1. The answer is 1 (just the subsequence "0").
  • All ones (e.g., "111"): has_zero stays 0. ends0 stays 0. After processing 3 ones, ends1 grows from 0 to 1 to 2 to 3. The answer is 3: the unique good subsequences are "1", "11", and "111". Even though there are multiple ways to pick positions, only the distinct values matter.
  • Single character "0": The answer is 1. Only the subsequence "0" is valid.
  • Single character "1": The answer is 1. Only the subsequence "1" is valid.
  • String with no zeros: has_zero stays 0, so you never add the standalone "0". The answer is just ends1.
  • Very long strings: since we take modulo 10^9 + 7 at every step, overflow is never an issue in Python (or in languages with modular arithmetic).

From understanding to recall

You now understand why the DP splits into ends0 and ends1, why the + 1 only appears for "1" characters, and why "0" is tracked as a special case. But understanding the logic once is different from reproducing it under pressure.

In an interview, the details that trip people up are: forgetting the has_zero flag, accidentally adding + 1 when the character is "0", or adding to the previous value instead of overwriting. These are small details, but any one of them breaks the solution.

The state-splitting pattern (tracking counts by what the subsequence ends with) is one of roughly 60 reusable building blocks in the CodeBricks system. You practice each block individually with spaced repetition until writing the solution from scratch feels automatic. This problem pairs naturally with distinct subsequences and decode ways. Drilling all three builds the reflex for subsequence-counting DP so you never confuse the transitions.

Related posts

  • Distinct Subsequences - 2D DP for counting subsequences of one string that match another. Same counting-by-extension logic, but with a 2D table instead of two variables.
  • Decode Ways - Linear DP on a string with conditional transitions. Similar one-pass structure where validity constraints shape the recurrence.
  • Longest Palindromic Subsequence - Another subsequence DP problem, this time maximizing length rather than counting. Good contrast for understanding how the objective changes the recurrence.