Expressive Words: Two-Pointer Group Matching
LeetCode 809, Expressive Words, gives you a string s and an array of query words. Your job is to count how many of those words are "stretchy," meaning you could transform the word into s by extending groups of consecutive identical characters so that each group has 3 or more characters.
For example, given s = "heeellooo" and words = ["hello", "hi", "helo"], only "hello" can be stretched to match s. You can extend the "e" group from 1 to 3 and the "o" group from 1 to 3. The answer is 1.
Why this problem matters
This problem tests your ability to decompose strings into groups and compare them using two pointers. The same "run-length group comparison" technique shows up in problems about string compression, pattern matching, and encoding. Once you learn to think in terms of character groups rather than individual characters, a whole class of string problems becomes easier.
The approach: two-pointer group comparison
The key insight is that both s and each word can be broken into groups of consecutive identical characters. For "heeellooo", the groups are [('h',1), ('e',3), ('l',2), ('o',3)]. For "hello", the groups are [('h',1), ('e',1), ('l',2), ('o',1)].
A word is stretchy if:
- Both strings produce the same sequence of characters (ignoring counts).
- For each pair of corresponding groups, one of these is true:
- The counts are exactly equal (no stretching needed).
- The count in
sis greater than or equal to 3 AND the count insis greater than or equal to the count in the word (stretching is valid because the group insis "long enough").
The second rule captures the problem's constraint: you can only stretch a group if the result has 3 or more characters. You cannot stretch a group of 1 into a group of 2, because 2 is not at least 3.
Using two pointers, you walk through s and the word simultaneously, consuming one group at a time from each. If at any point the characters differ, or the counts violate the rules above, the word is not stretchy.
The trick to this problem is the group size rule. A group in s can match a group in the word only if the counts are equal, or if the s group has 3 or more characters and is at least as large as the word group. A group of size 2 in s with a different size in the word is always invalid.
The code
def expressiveWords(s, words):
def get_groups(w):
groups = []
i = 0
while i < len(w):
ch = w[i]
count = 0
while i < len(w) and w[i] == ch:
count += 1
i += 1
groups.append((ch, count))
return groups
def is_stretchy(word):
sg = get_groups(s)
wg = get_groups(word)
if len(sg) != len(wg):
return False
for (sc, sn), (wc, wn) in zip(sg, wg):
if sc != wc:
return False
if sn == wn:
continue
if sn >= 3 and sn >= wn:
continue
return False
return True
return sum(1 for w in words if is_stretchy(w))
Walk through what happens:
get_groupsscans a string left to right with a pointer, collecting consecutive runs of the same character into(char, count)pairs.is_stretchybuilds the group lists for bothsand the candidate word. If they have different numbers of groups, the word cannot match.- For each pair of groups, it checks: are the characters the same? Are the counts compatible? If the counts differ, the s group must have at least 3 characters and be at least as large as the word group.
- The main function counts how many words pass the
is_stretchycheck.
Visual walkthrough
Here is a step-by-step trace showing how each word from the example is checked against s = "heeellooo". For each word, we compare the character groups side by side.
Word 1: "hello"
Result: stretchy
All groups match. We can stretch "e" (1 to 3) and "o" (1 to 3). This word is stretchy.
Word 2: "hi"
Result: not stretchy
After the first group, s has 'e' but the word has 'i'. Different characters means no match. Not stretchy.
Word 3: "helo"
Result: not stretchy
The 'l' group in s has size 2, but the word has size 1. Since 2 is not equal to 1 and 2 < 3, we cannot stretch. Not stretchy.
Final Answer
Only "hello" is stretchy. Return 1.
Notice how "hello" passes because every group either has equal counts or the s group is large enough (at least 3) to justify the stretch. "hi" fails immediately on a character mismatch. "helo" fails on the "l" group because 2 is not equal to 1 and 2 is less than 3.
Complexity analysis
| Metric | Value |
|---|---|
| Time | O(n * m), where n is the number of words and m is the max length of s or a word |
| Space | O(m), for storing the group lists |
For each word, you scan through both s and the word once to build groups, then compare the groups in one pass. The total work is proportional to the number of words times the length of the strings.
Building blocks
This problem is built from two reusable techniques that appear in many string problems.
Run-length grouping
The core of this problem is decomposing a string into runs of consecutive identical characters. This is the same grouping logic used in run-length encoding. You scan left to right with a pointer, tracking the current character and counting how many consecutive copies you see. When the character changes, you emit a group and start the next one.
Other problems that use this pattern:
- Count and Say (LeetCode 38) uses the same run-length scanning to describe one term from the previous.
- String Compression (LeetCode 443) compresses a string in place using run-length encoding.
Two-pointer comparison
After grouping, you compare the two group lists element by element. This is a classic two-pointer walk where both pointers advance in lockstep. The same pattern appears whenever you need to compare two sequences position by position, such as merging sorted arrays or comparing version numbers.
Edge cases
Word longer than s. If a word has more characters than s, it will produce more groups or larger groups. The group count check or the size comparison will catch this.
Identical strings. If a word is exactly equal to s, every group has equal counts. The function correctly returns True since equal counts always satisfy the matching rule.
Single-character groups in s. If a group in s has size 1 or 2, the word group must have the exact same size. You cannot "stretch" a group to size 2 because 2 is not at least 3. This is the subtle rule that trips up "helo" in the example.
Empty words array. If words is empty, the sum is 0. No iteration occurs.
Groups with count exactly 3. If the s group has 3 characters and the word group has 1, that is valid. You stretched 1 to 3, and 3 is at least 3. If the s group has 3 and the word group has 4, that is invalid because 3 is less than 4.
From understanding to recall
You can follow the group comparison logic now, but the details that slip away are the specific matching rules: when is a stretch valid? What is the threshold? Is it "greater than 3" or "at least 3"? And do you compare the group counts of s against the word, or the other way around?
Spaced repetition locks these details in. You drill the run-length grouping as one brick and the comparison rules as another. Each rep takes a minute or two. After a few rounds at increasing intervals, the sn >= 3 and sn >= wn condition is automatic. When you see a string grouping problem in an interview, you are not second-guessing the threshold. You are assembling bricks you already own.
Related posts
- Count and Say - Another string problem built on run-length encoding, where consecutive groups of identical characters drive the entire algorithm
- Compare Version Numbers - A two-pointer comparison problem where you walk through two sequences in lockstep and compare segment by segment
- Longest Substring Without Repeating Characters - A sliding window string problem that also requires careful pointer management and character tracking