Skip to content
← All posts

Longest Happy Prefix: KMP Failure Function

6 min read
leetcodeproblemhardstrings

LeetCode 1392, Longest Happy Prefix, gives you a string s and asks you to return the longest happy prefix. A happy prefix is a non-empty prefix of s that is also a suffix of s, excluding the entire string itself. If no such prefix exists, return an empty string.

Examples:

  • s = "level" returns "l" (the prefix "l" equals the suffix "l")
  • s = "ababab" returns "abab" (the prefix "abab" equals the suffix "abab")
  • s = "a" returns "" (no proper prefix can also be a suffix)
012345abababprefix "abab"suffix "abab"
In "ababab", the prefix "abab" (indices 0-3) equals the suffix "abab" (indices 2-5). The overlapping characters (indices 2-3) appear in both. The longest happy prefix is "abab".

Why this problem matters

This problem is a direct application of the KMP (Knuth-Morris-Pratt) preprocessing step. The KMP algorithm is one of the most important string matching algorithms, and its power comes from a preprocessing phase that computes a "failure function" (also called the LPS array, for longest proper prefix which is also suffix). That preprocessing step is exactly what this problem asks you to compute.

Understanding the failure function unlocks a family of string problems. Once you can build the LPS array, you can solve pattern matching in O(n + m), detect repeated patterns, find palindromic prefixes, and more. This problem isolates the preprocessing step so you can focus on understanding it deeply before applying it to harder problems.

The key insight

The KMP failure function computes, for each position i in the string, the length of the longest proper prefix of s[0..i] that is also a suffix of s[0..i]. The value at the last index of the LPS array is exactly the length of the longest happy prefix.

You build this array with two pointers. One pointer i scans through the string from left to right. The other pointer j tracks how much of the prefix currently matches. When characters match, both advance. When they do not match, you fall back by setting j = lps[j - 1], reusing work you have already done. This fallback is what makes the algorithm linear: you never rescan characters.

The solution

def longest_prefix(s: str) -> str:
    n = len(s)
    lps = [0] * n
    j = 0
    for i in range(1, n):
        while j > 0 and s[i] != s[j]:
            j = lps[j - 1]
        if s[i] == s[j]:
            j += 1
        lps[i] = j
    return s[:lps[-1]]

The lps array is initialized to all zeros. We start i at 1 because lps[0] is always 0 (a single character has no proper prefix that is also a suffix). For each position i, we try to extend the current match by comparing s[i] with s[j]. If they match, we increment j and record lps[i] = j. If they do not match, we fall back: j = lps[j - 1]. This moves j to the next best prefix that could still match, without rescanning any characters. We keep falling back until either we find a match or j reaches 0.

After processing the entire string, lps[n - 1] holds the length of the longest proper prefix that is also a suffix. We return the first lps[n - 1] characters of s.

The failure function is the heart of KMP. Once you have built the LPS array for a pattern, you can use it to search for that pattern in any text in O(n + m) time. The same fallback logic that avoids rescanning during preprocessing also avoids rescanning during the search phase.

Visual walkthrough

Building the failure function (lps array) for s = "ababab"

Step 1: Initialize

012345ababab0?????

Top row: string · Bottom row: lps array

lps[0] is always 0. Start comparing s[1] with s[0].

Step 2: Compare s[1]='b' with s[0]='a'

012345abababi=1j=000????

Top row: string · Bottom row: lps array

'b' does not equal 'a', and j is already 0. Set lps[1] = 0. Move i to 2.

Step 3: Compare s[2]='a' with s[0]='a'

012345abababi=2j=0000???

Top row: string · Bottom row: lps array

'a' equals 'a'. Set lps[2] = 1. Increment both i and j.

Step 4: Compare s[3]='b' with s[1]='b'

012345abababi=3j=10010??

Top row: string · Bottom row: lps array

'b' equals 'b'. Set lps[3] = 2. Increment both i and j.

Step 5: Compare s[4]='a' with s[2]='a'

012345abababi=4j=200120?

Top row: string · Bottom row: lps array

'a' equals 'a'. Set lps[4] = 3. Increment both i and j.

Step 6: Compare s[5]='b' with s[3]='b'

012345abababi=5j=3001230

Top row: string · Bottom row: lps array

'b' equals 'b'. Set lps[5] = 4. The final value lps[5] = 4 is the answer length.

Result: lps[5] = 4

012345ababab001234

Top row: string · Bottom row: lps array

The longest happy prefix has length 4: s[0:4] = "abab".

Notice how j tracks the length of the current matching prefix. When a mismatch occurs, instead of starting over, we jump j back to lps[j - 1]. This reuses the fact that we already know a shorter prefix matches. In the example above, every comparison is a match, so j grows steadily from 0 to 4. The final value lps[5] = 4 tells us the answer is s[0:4] = "abab".

Complexity analysis

ApproachTimeSpace
KMP failure functionO(n)O(n)

Time is O(n). The outer loop runs n - 1 times. The inner while loop may run multiple times on a single iteration, but across the entire algorithm, j can only be incremented n - 1 times total (once per outer iteration). Since each decrement in the while loop reduces j, and j never goes below 0, the total number of decrements across all iterations is at most n - 1. This amortized analysis gives O(n) total.

Space is O(n) for the LPS array. No other data structures are needed.

The building blocks

1. KMP failure function construction

The core loop that builds the LPS array by comparing characters and falling back on mismatch:

lps = [0] * n
j = 0
for i in range(1, n):
    while j > 0 and s[i] != s[j]:
        j = lps[j - 1]
    if s[i] == s[j]:
        j += 1
    lps[i] = j

This pattern appears in every KMP-based solution. The while loop with j = lps[j - 1] is the key piece. It says: "if the current character does not extend the match, fall back to the next longest prefix that could work." You will use this exact loop for pattern matching, shortest palindrome computation, and repeated substring detection.

2. Prefix-suffix extraction from the LPS array

Once the LPS array is built, extracting the answer is a single lookup:

return s[:lps[-1]]

The last entry in the LPS array gives the length of the longest proper prefix that is also a suffix. This single value is all you need. The same extraction applies in problems like Shortest Palindrome, where lps[-1] tells you how much of the string is already a palindromic prefix.

Edge cases

  • Single character: s = "a". The LPS array is [0]. No proper prefix exists. Return "".
  • No happy prefix: s = "abcde". No prefix matches any suffix. Every LPS value is 0. Return "".
  • Entire string repeats: s = "aaa". The LPS array is [0, 1, 2]. The longest happy prefix is "aa", which is both a prefix and suffix.
  • Two-character string: s = "aa" returns "a". s = "ab" returns "".
  • Long repeating pattern: s = "abcabcabc". The LPS array ends with 6, so the answer is "abcabc".

From understanding to recall

The KMP failure function is one of those algorithms that makes perfect sense when you read through it, but trips you up when you try to write it from scratch. The details that matter are precise: initializing j = 0, starting i at 1, the while loop condition j > 0 and s[i] != s[j], and the fallback j = lps[j - 1]. Get any one of those wrong and the algorithm breaks silently.

Spaced repetition turns understanding into recall. You write the LPS construction loop from memory at increasing intervals. After a few reps, the two-pointer dance and the fallback logic become automatic. When an interview problem requires KMP preprocessing, you do not need to rederive the algorithm. You just write it.

Related posts

CodeBricks breaks the KMP failure function into its construction and extraction building blocks, then drills them independently with spaced repetition. You type the LPS loop from scratch until the fallback logic is automatic. When a string problem requires prefix-suffix analysis, you do not think about it. You just write it.