Skip to content
← All posts

Maximum Length of Concatenated String: Backtracking with Bitmasks

7 min read
leetcodeproblemmediumarraysstringsbacktracking

Given an array of strings arr, find the maximum possible length of a string formed by concatenating a subsequence of arr where every character in the concatenation is unique.

This is LeetCode 1239: Maximum Length of a Concatenated String with Unique Characters, and it is a clean example of how bitmask tricks can supercharge a backtracking solution. The problem looks like it needs complex string manipulation, but once you represent each string's character set as an integer bitmask, overlap detection becomes a single AND operation and merging becomes a single OR.

arr = ["un", "iq", "ue"]"un"unmask: u,n set"iq"iqmask: i,q set"ue"uemask: u,e setConcatenate "un" + "iq" + "ue"?"un" AND "iq" = 0No overlap, merge"uniq" AND "ue" != 0"u" overlaps, skip "ue"Best with "ue" skipped:uniq"uniq" = length 4Or skip "un", take "iq"+"ue":ique"ique" = length 4Maximum length = 4string 1string 2string 3
For arr = ["un", "iq", "ue"], we try all subsets of strings. Bitmask AND detects character overlap instantly. The best unique concatenation has length 4.

Why this problem matters

This problem sits at the intersection of two powerful ideas: backtracking through subsets and using bitmasks as compact set representations. You already know from Subsets that enumerating all subsets of n elements means exploring a binary decision tree. This problem adds a constraint (characters must be unique across the concatenation) and asks for the maximum length rather than all subsets.

The bitmask angle is what makes it interesting. Instead of maintaining a set of used characters and checking membership with loops, you store the entire character set in a single 26-bit integer. Checking whether two strings share a character is just mask_a & mask_b != 0. Merging two character sets is mask_a | mask_b. Counting characters is bin(mask).count('1'). These O(1) operations replace O(k) set operations, and the technique shows up in many other problems involving small alphabets.

Once you have seen how bitmasks simplify set operations in a backtracking context, you will recognize the same pattern in constraint satisfaction problems, puzzle solvers, and competitive programming challenges.

The key insight

Each string in the array uses some subset of the 26 lowercase letters. You can encode that subset as a 26-bit integer: bit i is 1 if the character chr(ord('a') + i) appears in the string. Two strings can be concatenated without repeating characters if and only if their bitmasks have no overlapping bits, meaning their bitwise AND is zero.

The algorithm works like this:

  1. Precompute the bitmask for each string. If a string contains duplicate characters (like "aa"), set its mask to 0 so it gets skipped automatically.
  2. Backtrack through the strings with an include-or-skip decision at each index, exactly like generating subsets.
  3. At each step, if the current string's mask has no overlap with the accumulated mask (AND is zero), you can include it: OR the masks together and add the string's length.
  4. Track the maximum total length seen across all branches.

The solution

def maxLength(arr: list[str]) -> int:
    masks = []
    for s in arr:
        mask = 0
        for ch in s:
            bit = 1 << (ord(ch) - ord('a'))
            if mask & bit:
                mask = 0
                break
            mask |= bit
        if mask:
            masks.append((mask, len(s)))

    def backtrack(idx, current_mask, current_len):
        best = current_len
        for i in range(idx, len(masks)):
            m, l = masks[i]
            if current_mask & m == 0:
                best = max(best, backtrack(i + 1, current_mask | m, current_len + l))
        return best

    return backtrack(0, 0, 0)

The first loop converts each string into a (mask, length) pair. For each character, it computes the bit position and checks whether that bit is already set. If it is, the string has a duplicate character and we discard it by setting the mask to 0. Otherwise we set the bit. Only strings with nonzero masks make it into the masks list.

The backtrack function follows the standard subset enumeration pattern. Starting from index idx, it tries each remaining string. If the string's mask has no overlap with current_mask (the AND check), it includes the string by ORing the masks and adding the length, then recurses from i + 1. The function returns the best length found across all branches.

Notice that we do not need to "unchoose" anything. Instead of mutating state, we pass the merged mask and updated length as arguments to the recursive call. When the call returns, we are automatically back to the previous state. This is a common simplification in backtracking problems where the state is small enough to pass by value.

If a string has duplicate characters within itself (like "aa" or "abba"), it can never contribute to a valid concatenation. Filtering these out during the bitmask precomputation step saves unnecessary branches in the backtracking tree.

Visual walkthrough

Let's trace through the algorithm with arr = ["un", "iq", "ue"]. Watch how the bitmask AND check instantly detects character overlap and how backtracking explores every valid subset.

Step 1: Convert each string to a bitmask

"un"1u1n0i0q0ebits for u and n set"iq"0u0n1i1q0ebits for i and q set"ue"1u0n0i0q1ebits for u and e set

Each bit represents a letter (a=bit 0, b=bit 1, ..., z=bit 25). If a string has duplicate characters, its mask is 0 (discard it). Here we show just the relevant 5 bits.

Step 2: Start backtracking at index 0 with empty mask (all zeros)

current mask:0u0n0i0q0elength so far: 0Try including "un" at index 0

current_mask = 00000 (no characters used yet), current_length = 0. We decide: include "un" or skip it?

Step 3: Include "un". Check: current_mask AND mask("un") == 0? Yes, no overlap.

00000 AND 11000 = 00000No overlap, include itnew mask:1u1n0i0q0elength: 2Next: try including "iq" at index 1

Merge masks with OR: 00000 | 11000 = 11000. Length = 0 + 2 = 2. Now recurse to index 1.

Step 4: Include "iq". Check: 11000 AND 00110 == 0? Yes, no overlap.

11000 AND 00110 = 00000No overlap, include itnew mask:1u1n1i1q0elength: 4Next: try including "ue" at index 2

Merge: 11000 | 00110 = 11110. Length = 2 + 2 = 4. Now recurse to index 2.

Step 5: Try "ue". Check: 11110 AND 10001 != 0. Overlap on "u"!

11110 AND 10001 = 10000Overlap on "u", skipmask unchanged:1u1n1i1q0ebest so far: 4("un" + "iq" = "uniq")

The "u" bit is set in both masks. Skip "ue". No more strings to try. Record length = 4. Backtrack.

Step 6: Backtrack to mask=11000 ("un"). Skip "iq", try "ue" at index 2.

11000 AND 10001 = 10000Overlap on "u", skipmask stays:1u1n0i0q0ethis branch: length = 2best overall: 4

Check: 11000 AND 10001 != 0. Overlap on "u" again. Skip "ue". With just "un", length = 2.

Step 7: Backtrack to empty mask. Skip "un" entirely. Try "iq" at index 1.

00000 AND 00110 = 00000No overlap, includenew mask:0u0n1i1q0elength: 2Next: try including "ue" at index 2

mask = 00000. Check: 00000 AND 00110 = 0. No overlap. Include "iq". Mask becomes 00110, length = 2.

Step 8: Include "ue". Check: 00110 AND 10001 == 0? Yes, no overlap.

00110 AND 10001 = 00000No overlap, includenew mask:1u0n1i1q1elength: 4("iq" + "ue" = "ique")

Merge: 00110 | 10001 = 10111. Length = 2 + 2 = 4. Matches the best. Both paths give length 4.

Step 9: All branches explored. Maximum length = 4.

Path 1: "un" + "iq" = "uniq" (length 4)Path 2: "iq" + "ue" = "ique" (length 4)Answer: 4

Two subsets tie at length 4: {"un", "iq"} and {"iq", "ue"}. The answer is 4.

The walkthrough shows how two different subsets, {"un", "iq"} and {"iq", "ue"}, both achieve the maximum length of 4. The bitmask AND catches the overlap on "u" between "un" and "ue" in constant time, pruning that branch immediately.

Complexity analysis

ApproachTimeSpace
Backtracking with bitmasksO(2^n)O(n)

Time is O(2^n) in the worst case, where n is the number of valid strings (those without internal duplicates). Each string is either included or skipped, giving 2^n possible subsets to explore. In practice, the overlap check prunes many branches early, but the worst case occurs when all strings use completely disjoint character sets.

Space is O(n) for the recursion stack. The maximum recursion depth is n (one level per string). The masks list also uses O(n) space. We do not store all subsets, just track the running maximum.

The building blocks

1. Bitmask as a character set

The core trick is encoding a set of characters as an integer. Each of the 26 lowercase letters gets one bit:

mask = 0
for ch in s:
    mask |= 1 << (ord(ch) - ord('a'))

Checking overlap between two sets is a single AND. Merging two sets is a single OR. This replaces Python set operations with bitwise operations that run in O(1).

2. Subset enumeration with pruning

The backtracking structure is identical to Subsets, but with a constraint check before including each element:

for i in range(idx, len(items)):
    if can_include(items[i], current_state):
        best = max(best, backtrack(i + 1, merge(current_state, items[i])))

The can_include check (bitmask AND) prunes branches that would violate the uniqueness constraint. Without it, you would enumerate all 2^n subsets and check each one, which is the same time complexity but slower in practice.

Edge cases

  • All strings have duplicate characters: arr = ["aa", "bb"]. Every mask is 0, so the masks list is empty. The answer is 0.
  • Single string with all unique characters: arr = ["abcdefghijklmnop"]. Only one string, and it has 16 unique characters. The answer is 16.
  • All strings share a common character: arr = ["ab", "ac", "ad"]. Each pair overlaps on "a", so you can only pick one string. The answer is 2.
  • Empty strings in the array: arr = ["", "a", "b"]. An empty string has mask 0. After filtering, only "a" and "b" remain. The answer is 2.
  • Maximum possible answer: The best you can do is 26, since there are only 26 lowercase letters. If the input contains strings that together cover all 26 letters without overlap, the answer is 26.

From understanding to recall

You understand how bitmasks turn set operations into single instructions, and how the backtracking tree prunes branches where character sets overlap. But can you write the bitmask precomputation and the recursive backtracking from scratch in under three minutes?

That is what interviews demand. The bitmask idea is elegant on paper, but under pressure you might forget to handle strings with internal duplicates, get the bit shift wrong, or confuse AND (overlap check) with OR (merge). These are not conceptual gaps. They are recall problems.

Spaced repetition fixes recall. You practice writing the bitmask conversion loop and the backtracking function from memory at increasing intervals. After a few rounds, the pattern is automatic. You see "unique characters" and "concatenation" and the bitmask approach flows out without hesitation.

The bitmask-as-set pattern and the subset enumeration template are two of roughly 60 reusable building blocks that cover hundreds of LeetCode problems. Learning them individually and drilling them with spaced repetition is far more effective than grinding random problems and hoping the patterns stick.

Related posts

  • Subsets - The foundational backtracking problem using include-or-skip decisions. This problem adds a bitmask constraint on top of the same structure.
  • Letter Tile Possibilities - Another backtracking problem that explores subsets of characters. Good practice for reasoning about character sets and counting.
  • Word Search - Grid backtracking with visited-cell tracking. Compare its approach to this problem's bitmask approach for managing "used" state.

CodeBricks breaks this problem into its bitmask-as-set and subset-enumeration building blocks, then drills each one independently with spaced repetition. You type the bitmask conversion and the backtracking skeleton from scratch until they are automatic. When a problem asks you to combine strings or sets with uniqueness constraints, you do not think about it. You just write it.