Skip to content
← All posts

Count and Say: Run-Length Encoding Sequence

5 min read
leetcodeproblemmediumstrings

LeetCode 38, Count and Say, asks you to generate the nth term of the count-and-say sequence. The sequence starts with "1", and each subsequent term is produced by reading the previous term aloud.

Here is how it works. You look at consecutive groups of the same digit and describe each group as "count" followed by "digit":

  • "1" is read as "one 1", which produces "11"
  • "11" is read as "two 1s", which produces "21"
  • "21" is read as "one 2, one 1", which produces "1211"
  • "1211" is read as "one 1, one 2, two 1s", which produces "111221"

Given an integer n, return the nth term of this sequence.

n=11n=211 read prev: "one 1"n=321 read prev: "two 1s"n=41211 read prev: "one 2, one 1"n=5111221 read prev: "one 1, one 2, two 1s"
The first five terms of the count-and-say sequence. Each term is produced by reading aloud the groups of consecutive digits in the previous term.

The key observation is that each term is just the run-length encoding of the previous term. If you know how to do run-length encoding on a string, you can solve this problem by applying it repeatedly.

The approach: iterate and encode

The plan is simple:

  1. Start with the base case: "1".
  2. Repeat n - 1 times: take the current string and produce the next term by run-length encoding it.
  3. To run-length encode a string, scan left to right. Track the current digit and how many consecutive copies of it you see. When the digit changes (or you reach the end), append the count and the digit to the result.

That is the entire algorithm. The outer loop controls how many terms to generate. The inner loop handles the encoding of one term into the next.

Run-length encoding is a technique where you replace consecutive repeated characters with a count followed by the character. For example, "aaabbc" becomes "3a2b1c". In this problem, the format is slightly different (count before digit, digits only), but the scanning logic is the same.

The code

def countAndSay(n):
    result = "1"
    for _ in range(n - 1):
        nxt = ""
        i = 0
        while i < len(result):
            digit = result[i]
            count = 0
            while i < len(result) and result[i] == digit:
                count += 1
                i += 1
            nxt += str(count) + digit
        result = nxt
    return result

Walk through what happens:

  1. Initialize result to "1", the first term of the sequence.
  2. Loop n - 1 times. Each iteration transforms result into the next term.
  3. Inside the loop, start a pointer i at position 0 and build nxt as an empty string.
  4. For each group of consecutive identical digits, count how many there are using the inner while loop. The pointer i advances past the entire group.
  5. Append str(count) + digit to nxt. This is the "say" part: "3 copies of '1'" becomes "31".
  6. After scanning the entire string, replace result with nxt.
  7. After all iterations, return result.

Visual walkthrough

Here is a step-by-step trace showing how each term is produced from the previous one by identifying groups and encoding them.

Step 1: n=1 to n=2

prev11x'1'next11

The string "1" has one group: a single '1'. Reading it aloud: "one 1" = "11".

Step 2: n=2 to n=3

prev112x'1'next21

The string "11" has one group: two '1's. Reading it aloud: "two 1s" = "21".

Step 3: n=3 to n=4

prev211x'2'1x'1'next1211

The string "21" has two groups: one '2', then one '1'. Reading it aloud: "one 2, one 1" = "1211".

Step 4: n=4 to n=5

prev12111x'1'1x'2'2x'1'next111221

The string "1211" has three groups: one '1', one '2', two '1's. Reading it aloud: "one 1, one 2, two 1s" = "111221".

Notice how the groups in the previous term map directly to pairs (count, digit) in the next term. Each group of consecutive identical digits becomes exactly two characters in the output.

Complexity analysis

MetricValue
TimeO(n * m), where m is the length of the longest term generated
SpaceO(m), for storing the current and next term

You iterate n - 1 times. Each iteration scans the current term once and builds the next term. The length of the terms grows roughly exponentially (each term is about 30% longer than the previous one on average), so m grows with n. For the constraints typically given (n up to 30), the terms remain manageable.

Building blocks

This problem is built from two reusable techniques that appear in many string problems.

Run-length encoding

The core of this problem is run-length encoding: scanning a string for consecutive groups of identical characters and recording the count and character. This same scanning pattern appears whenever you need to compress, describe, or transform a string based on repeated characters.

Other problems that use this pattern:

  • String Compression (LeetCode 443) asks you to compress a string in place using run-length encoding.
  • Decode String (LeetCode 394) is the reverse direction, expanding encoded patterns.

Iterative string building

The outer loop repeatedly transforms a string by building a new one from scratch. This pattern of "read the current state, produce the next state" is common in simulation and sequence-generation problems. You maintain one string as the current state and construct the next string character by character.

Other problems that use this pattern:

  • Zigzag Conversion (LeetCode 6) iteratively distributes characters into row buckets.
  • Look-and-Say variants in combinatorics use the same iterative generation approach.

Edge cases

n = 1. The first term is always "1". The loop runs zero times and the base case is returned directly.

Large n. The terms grow in length as n increases. At n = 30, the term is over 4,000 characters long. The algorithm handles this correctly because it only ever holds two strings in memory (the current term and the next term). There are no stack or recursion depth concerns.

Single-digit groups. When every digit in a term is different from its neighbors (like "1211"), each group has a count of 1. The encoding still works correctly, producing pairs like "11", "12", "11".

Multi-digit counts. For this specific sequence starting from "1", no group ever exceeds a count of 3 (this is a known property proved by John Conway). So the count is always a single digit, and each group always produces exactly two characters in the output.

From understanding to recall

You can read through this solution and follow every step. But the pieces that slip away over time are the specific ones: initializing the outer loop to n - 1 iterations (not n), using a while loop with a pointer instead of a for loop so you can advance past groups, and remembering to convert the count to a string before concatenating.

Spaced repetition locks these details in. You drill the run-length encoding scan as one brick and the iterative string transformation as another. Each rep takes a minute or two. After a few rounds at increasing intervals, the inner while loop and the str(count) + digit pattern are automatic. When you see a sequence generation problem in an interview, you are not puzzling over the encoding logic. You are snapping together bricks you already own.

Related posts

  • Longest Common Prefix - Another string problem where character-by-character scanning across positions is the core technique
  • Zigzag Conversion - A medium string problem that uses iterative string building with a simulation loop