Skip to content
← All posts

Groups of Special-Equivalent Strings: Canonical Key Grouping

4 min read
leetcodeproblemmediumarraysstringshash-map

Groups of Special-Equivalent Strings is LeetCode 893, and it is a satisfying twist on the classic Group Anagrams problem. Instead of grouping strings that have the exact same characters, you are grouping strings that can be transformed into each other by swapping characters at even indices with each other and odd indices with each other. The trick is figuring out what "equivalent" really means here, and then reducing the problem to a familiar hash map pattern.

The problem

You are given an array of strings words. Two strings are special-equivalent if you can make them equal by performing any number of moves, where each move consists of swapping two characters at even indices, or swapping two characters at odd indices.

Return the number of groups of special-equivalent strings.

Example: words = ["abcd", "cdab", "cbad", "xyzz", "zzxy", "zzyx"]

Output: 2

input:abcdcdabcbadxyzzzzxyzzyxgroup by signatureeven: ac, odd: bdabcdcdabcbadeven: xz, odd: yzxyzzzzxyzzyx
Strings are grouped by their special-equivalent signature: sorted even-index characters + sorted odd-index characters. Two groups emerge from six strings.

The key observation is that swapping among even indices can rearrange the even-indexed characters into any order. Same for the odd-indexed characters. So two strings are special-equivalent if and only if they have the same multiset of even-index characters and the same multiset of odd-index characters.

The approach

Once you realize that special-equivalence boils down to "same characters at even positions and same characters at odd positions," the algorithm is clean:

  1. For each string, extract the characters at even indices and the characters at odd indices
  2. Sort each group independently to create a canonical form
  3. Concatenate the two sorted groups to form a key
  4. Use a set to count the number of distinct keys

That is it. The number of distinct keys is the number of groups.

def num_special_equiv_groups(words: list[str]) -> int:
    keys = set()
    for word in words:
        even = sorted(word[i] for i in range(0, len(word), 2))
        odd = sorted(word[i] for i in range(1, len(word), 2))
        keys.add(tuple(even + odd))
    return len(keys)

Step-by-step walkthrough

Let's trace through the example ["abcd", "cdab", "cbad", "xyzz", "zzxy", "zzyx"] to see exactly how the algorithm builds each key and groups the strings.

Step 1: Split "abcd" into even-index and odd-index characters.

string:a0b1c2d3evenodd

Even indices (0, 2) give us characters that can only swap with each other. Same for odd indices (1, 3).

Step 2: Collect the even-index and odd-index characters separately.

even (0, 2):acodd (1, 3):bd

From "abcd": even-index chars = [a, c] and odd-index chars = [b, d].

Step 3: Sort each group to create a canonical signature.

sorted even:ac"ac"sorted odd:bd"bd"

Sorting normalizes the order so that equivalent strings produce the same key.

Step 4: Combine sorted groups into the key: "ac" + "bd" = "acbd".

key:acbd"acbd"

This key uniquely identifies the special-equivalence class for "abcd".

Step 5: Verify "cdab" produces the same key.

string:c0d1a2b3even [c,a]:sort"ac"odd [d,b]:sort"bd"key:"acbd" ← same key!

"cdab" has even chars [c, a] which sort to "ac", and odd chars [d, b] which sort to "bd". Same key: "acbd".

Step 6: Group all strings by their key in a hash map.

groups map:"acbd"[abcd, cdab, cbad]"xzyz"[xyzz, zzxy, zzyx]

After processing all 6 strings, the map has 2 entries. The number of distinct keys is our answer: 2 groups.

After processing all six strings, we have two distinct keys in the set. That means there are two groups of special-equivalent strings.

Complexity analysis

MetricValue
TimeO(n * k log k)
SpaceO(n * k)

Time: We iterate through n strings. For each string of length k, we extract even and odd characters in O(k) and sort each half in O(k/2 log k/2), which simplifies to O(k log k). Total: O(n * k log k).

Space: We store up to n keys in the set, and each key has length k. So the set takes O(n * k) space.

The building blocks

Canonical form

The core technique here is canonicalization, the same idea behind Group Anagrams. You take multiple representations of the same equivalence class and reduce them to a single standard form. In Group Anagrams, you sort the entire string. Here, you sort the even-index and odd-index characters separately, because that is the equivalence relation the problem defines.

The general template is: figure out what transformations are allowed, determine which invariants those transformations preserve, and use those invariants as your key.

Group-by-key with a set

This problem is a variation of the group-by-key pattern. Instead of actually collecting the groups (like in Group Anagrams where you need to return the grouped strings), you only need to count how many distinct groups exist. That means you can use a set instead of a dictionary. Add each canonical key to the set, and the set's size is your answer.

This simplification comes up often. When the problem asks "how many groups?" rather than "what are the groups?", a set is all you need.

Edge cases

All strings are equivalent. If every string produces the same key, the answer is 1. For example, ["abc", "bac"] where swapping even-index characters a and b (wait, index 0 and 2 are even) turns one into the other. The algorithm handles this naturally.

Every string is unique. If no two strings share a key, the answer equals the length of the input array. No special handling needed.

Single-character strings. A string like "a" has even-index characters ["a"] and no odd-index characters. The key is ("a",). Works correctly.

Empty strings. An empty string has no even or odd characters, so the key is an empty tuple (). Multiple empty strings all map to the same key, giving one group. The algorithm handles this without any special case.

From understanding to recall

This problem feels easy once you see the trick: sort even-index and odd-index characters separately, then group by that signature. But the real building block is recognizing when a problem is asking you to find equivalence classes, and knowing that the approach is always the same: define a canonical key that captures the invariant, then group or count by that key. Whether it is anagram grouping, shifted string grouping, or special-equivalence grouping, the template does not change. Only the key function does.

Spaced repetition helps you internalize that template so deeply that when you encounter a new equivalence-class problem, you skip the "how do I approach this?" phase entirely and jump straight to "what is the right key function?"

Related posts

  • Group Anagrams - The classic canonical key grouping problem, using sorted characters as the key
  • Valid Anagram - The single-pair version of character equivalence checking
  • Hash Map Patterns - The grouping pattern used here is one of the three core hash map techniques