Skip to content
← All posts

Longest Common Prefix: Column-by-Column Comparison

5 min read
leetcodeproblemeasystrings

LeetCode 14, Longest Common Prefix, asks you to find the longest string that is a prefix of every string in an array. If the strings share nothing in common at the start, return an empty string.

For example, given ["flower", "flow", "flight"], the answer is "fl". All three strings start with "fl", but at position 2 the characters diverge ('o', 'o', 'i').

012345flowerflowerflowflowflightflightprefix = "fl"
Three strings stacked vertically. Green columns share the same character across all strings. The red column is where they diverge.

The approach: vertical scanning

The cleanest way to think about this problem is column by column. Line the strings up vertically and scan from left to right. At each column (character position), check whether every string has the same character. The moment any string has a different character, or a string runs out of characters entirely, you stop. Everything to the left of that column is the common prefix.

This is called vertical scanning because you are reading down each column rather than across each row.

Here is why this works: the common prefix can only be as long as the shortest string. And the first point of disagreement across any column kills the prefix right there. By scanning left to right and stopping at the first mismatch, you never do unnecessary work.

You can use any string as the reference for column iteration. The first string, strs[0], is the natural choice. If the prefix turns out to be shorter than strs[0], you will hit a mismatch or an out-of-bounds condition before finishing.

The code

def longest_common_prefix(strs):
    if not strs:
        return ""
    for i in range(len(strs[0])):
        char = strs[0][i]
        for s in strs[1:]:
            if i >= len(s) or s[i] != char:
                return strs[0][:i]
    return strs[0]

Walk through what happens:

  1. If the input array is empty, return "".
  2. Loop through each index i of the first string.
  3. For each index, grab the character strs[0][i] and compare it against s[i] for every other string.
  4. If any string is too short (i >= len(s)) or has a different character (s[i] != char), return everything up to but not including position i.
  5. If you make it through the entire first string without a mismatch, the first string itself is the common prefix.

Visual walkthrough

Let's trace through ["flower", "flow", "flight"] one column at a time.

Step 1: Check column 0

012345flowerflowerflowflowflightflight

All three characters are 'f'. They match. Add 'f' to the prefix.

Step 2: Check column 1

012345flowerflowerflowflowflightflight

All three characters are 'l'. They match. Add 'l' to the prefix.

Step 3: Check column 2

012345flowerflowerflowflowflightflight

'o', 'o', 'i'. Not all the same. Stop immediately.

Step 4: Result

We stopped at column 2 because the characters diverged. Everything before that column is the common prefix: "fl"

The algorithm checked exactly 3 columns before finding the mismatch. It never looked at positions 3, 4, or 5 of any string. That early termination is the key to the approach's efficiency.

Alternative approaches

Vertical scanning is not the only way to solve this problem. Here are three other approaches you might see.

Horizontal scanning

Compare the first two strings to find their common prefix. Then compare that result with the third string to narrow it down further. Repeat for every string in the array.

This works fine, but it processes strings one at a time rather than character positions one at a time. If the first two strings are very long but share a short prefix with the third, you do extra comparison work up front that gets thrown away.

Sort first, then compare endpoints

Sort the array of strings lexicographically. After sorting, the first and last strings are the most different. The common prefix of just those two strings is the common prefix of the entire array.

This is clever, but sorting costs O(S log n) where S is the total number of characters and n is the number of strings. For most inputs, that is more expensive than a single linear scan.

Binary search on prefix length

Binary search on the length of the prefix. For a given candidate length, check if all strings share a prefix of that length. If they do, try a longer length. If not, try shorter.

This runs in O(S log m) where m is the length of the shortest string. It is overkill for this problem but useful to know as a technique.

Why vertical scanning wins

Vertical scanning is the simplest and most intuitive approach. It maps directly to how you would solve the problem by hand: line up the strings, read down each column, and stop when you see a difference. The code is short, there is nothing to sort or binary search over, and the time complexity is optimal.

Complexity analysis

MetricValue
TimeO(S), where S is the sum of all characters across all strings
SpaceO(1), no extra data structures

In the worst case (all strings are identical), you visit every character exactly once. In the best case (first characters differ), you visit only n characters where n is the number of strings.

Building blocks

This problem uses two fundamental techniques that appear across many string and array problems.

Character-by-character comparison

The core of vertical scanning is comparing characters at the same index across multiple strings. This same idea shows up in problems like checking if two strings are anagrams, finding mismatches between strings, and comparing sorted string representations.

Early termination

The algorithm stops the moment it finds a mismatch or a string that is too short. You do not finish scanning all columns and then figure out the answer. You return as soon as you have enough information. This pattern is critical in search problems, validation checks, and any problem where you can short-circuit a loop.

Edge cases

Single string. If the array contains exactly one string, that string is the entire common prefix. The inner loop never executes, and you return strs[0].

Empty string in the array. If any string is "", the answer is "" immediately. On the first iteration (i = 0), the check i >= len(s) triggers for the empty string and returns strs[0][:0], which is "".

All identical strings. If every string is the same, the loop runs through all characters of strs[0] without finding a mismatch. The function returns strs[0], which is correct.

No common prefix at all. If the first characters already differ (like ["dog", "racecar", "car"]), the function returns "" on the very first iteration. The check at i = 0 finds that 'd' != 'r' and returns strs[0][:0].

From understanding to recall

You can read this solution and follow every step. But reading is not the same as being able to reproduce it from scratch two weeks from now. The details that slip away are the small ones: iterating over the first string as your reference, the i >= len(s) boundary check, returning strs[0][:i] to slice up to but not including the mismatch position.

Spaced repetition fixes this. Instead of re-reading the full solution, you drill the individual building blocks (column comparison, early return on mismatch) at increasing intervals. Each review takes a minute or two, and after a few reps the pattern is locked in long-term memory. When you sit down for an interview and see a prefix or string comparison problem, the pieces are already there.

Related posts