Skip to content
← All posts

String Compression: Two Pointer In-Place Encoding

5 min read
leetcodeproblemmediumstringstwo-pointers

LeetCode String Compression (problem 443) asks you to compress an array of characters in place using a read-write two-pointer approach. Given a character array like ["a","a","b","b","c","c","c"], you group consecutive repeating characters, replace each group with the character followed by the group length (if the length is greater than 1), and return the new length of the array. The compressed result must be stored in the original array, not in a new one.

Inputa0a1b2b3c4c5c6Compresseda021b223c435
Input array of 7 characters compresses to 6 characters in-place. The function returns 6.

Why this problem matters

String Compression is a clean example of the two-pointer technique applied to in-place transformation. Unlike problems that simply filter or remove elements, this one requires you to both read groups and write variable-length output back into the same array. That makes the pointer coordination slightly trickier. The same pattern shows up in run-length encoding, data serialization, and any scenario where you need to compact a sequence without allocating extra space.

The key insight

You need two pointers: read to scan through the input and count consecutive groups, and write to place the compressed characters back into the array. The critical observation is that the compressed output is always the same length or shorter than the input. That means write never overtakes read, so you can safely overwrite the array as you go without losing data you have not yet processed.

For each group of consecutive identical characters, you write the character once. If the count is greater than 1, you also write each digit of the count. A count like 12 takes two write positions ("1" and "2"), not one.

The solution

def compress(chars):
    write = 0
    read = 0

    while read < len(chars):
        char = chars[read]
        count = 0

        while read < len(chars) and chars[read] == char:
            read += 1
            count += 1

        chars[write] = char
        write += 1

        if count > 1:
            for digit in str(count):
                chars[write] = digit
                write += 1

    return write

write = 0 and read = 0. Both pointers start at the beginning of the array. read will advance through every character, while write only advances when you place a character or digit into the output.

char = chars[read]. Before counting a group, you capture the current character. The inner while loop then advances read as long as the character matches, incrementing count each time.

chars[write] = char. Once the group is fully counted, you write the character at the write position and advance write by one.

if count > 1. If the group had more than one character, you convert the count to a string and write each digit individually. A single character (count of 1) gets no number after it, which matches the problem specification.

return write. After all groups are processed, write holds the length of the compressed array. Everything in chars[0] through chars[write - 1] is the valid compressed result.

Visual walkthrough

Trace through chars = ["a","a","b","b","c","c","c"]. Watch how read jumps ahead to count each group while write places the character and its count into the front of the array.

Start: read scans the first group of 'a's.

aabbcccwriteread

Both pointers start at index 0. read will count how many 'a's appear consecutively.

Step 1: Found 2 'a's. Write 'a' then '2'.

a2bbcccwriteread

read counted 2 copies of 'a'. write placed 'a' at index 0 and '2' at index 1. write is now at 2, read is at 2.

Step 2: read scans the group of 'b's. Found 2.

a2b2cccwriteread

read counted 2 copies of 'b'. write placed 'b' at index 2 and '2' at index 3. write is now at 4, read is at 4.

Step 3: read scans the group of 'c's. Found 3. Write 'c' then '3'.

a2b2c3cwrite

read counted 3 copies of 'c'. write placed 'c' at index 4 and '3' at index 5. write is now at 6. read has passed the end.

Notice how write stays behind or even with read at every step. By the end, the first 6 positions hold ["a","2","b","2","c","3"] and the function returns 6.

Complexity analysis

MetricValue
TimeO(n), single pass through the array
SpaceO(1), no extra data structures

The read pointer visits each element exactly once. The write pointer advances at most n times total. Converting the count to digits is bounded by O(log n) per group in the worst case, but summed across all groups it is still O(n). No auxiliary arrays or hash maps are used.

Edge cases

  • All unique characters. Every group has count 1, so no digits are written. The array stays the same and you return the original length.
  • Single character. The array has one element. One group of count 1. You write the character and return 1.
  • All identical characters. One group with a large count. You write the character followed by each digit of the count. For example, ["a"] * 12 becomes ["a","1","2"] with a return value of 3.
  • Counts with multiple digits. If a group has 100 or more repeats, str(count) produces multiple digits. The loop handles this naturally by writing each digit one at a time.
  • Already compressed. If no character repeats, the array is untouched and you return len(chars).

The building blocks

This problem decomposes into two reusable bricks.

Read-write two pointers (in-place transformation)

The read pointer scans the input. The write pointer marks where the next output element should go. This is the same skeleton used in Remove Element, Remove Duplicates from Sorted Array, and Move Zeroes. The difference here is that write sometimes advances by more than one position per group, because you may need to write both a character and its count digits.

Run-length encoding

Counting consecutive identical elements and replacing them with a (character, count) pair is a classic encoding technique. In this problem you flatten the pair into the array itself, character first, then the digits of the count. This same idea appears in data compression, serialization formats, and problems like Count and Say (LeetCode 38).

From understanding to recall

The two-pointer skeleton is familiar if you have solved Remove Element or Remove Duplicates. What makes String Compression worth practicing separately is the added complexity of writing multi-digit counts back into the array. That small twist changes the pointer arithmetic enough that you need to practice it on its own.

Write the solution from scratch on a blank screen. Trace through a case with single characters, a case with double characters, and a case where the count exceeds 9. Once you can produce the code without hesitation, space out your reviews. The goal is to make the group-count-write loop as automatic as the simple read-write filter.

Related posts

  • Two Sum - The classic hash map complement pattern and a good warm-up for index-based thinking.
  • Remove Duplicates from Sorted Array - Uses the same read-write pointer skeleton to compact an array in place.
  • Remove Element - The simplest form of the read-write pointer, filtering values without extra space.