Longest Happy Prefix: KMP Failure Function
LeetCode 1392, Longest Happy Prefix, gives you a string s and asks you to return the longest happy prefix. A happy prefix is a non-empty prefix of s that is also a suffix of s, excluding the entire string itself. If no such prefix exists, return an empty string.
Examples:
s = "level"returns"l"(the prefix"l"equals the suffix"l")s = "ababab"returns"abab"(the prefix"abab"equals the suffix"abab")s = "a"returns""(no proper prefix can also be a suffix)
Why this problem matters
This problem is a direct application of the KMP (Knuth-Morris-Pratt) preprocessing step. The KMP algorithm is one of the most important string matching algorithms, and its power comes from a preprocessing phase that computes a "failure function" (also called the LPS array, for longest proper prefix which is also suffix). That preprocessing step is exactly what this problem asks you to compute.
Understanding the failure function unlocks a family of string problems. Once you can build the LPS array, you can solve pattern matching in O(n + m), detect repeated patterns, find palindromic prefixes, and more. This problem isolates the preprocessing step so you can focus on understanding it deeply before applying it to harder problems.
The key insight
The KMP failure function computes, for each position i in the string, the length of the longest proper prefix of s[0..i] that is also a suffix of s[0..i]. The value at the last index of the LPS array is exactly the length of the longest happy prefix.
You build this array with two pointers. One pointer i scans through the string from left to right. The other pointer j tracks how much of the prefix currently matches. When characters match, both advance. When they do not match, you fall back by setting j = lps[j - 1], reusing work you have already done. This fallback is what makes the algorithm linear: you never rescan characters.
The solution
def longest_prefix(s: str) -> str:
n = len(s)
lps = [0] * n
j = 0
for i in range(1, n):
while j > 0 and s[i] != s[j]:
j = lps[j - 1]
if s[i] == s[j]:
j += 1
lps[i] = j
return s[:lps[-1]]
The lps array is initialized to all zeros. We start i at 1 because lps[0] is always 0 (a single character has no proper prefix that is also a suffix). For each position i, we try to extend the current match by comparing s[i] with s[j]. If they match, we increment j and record lps[i] = j. If they do not match, we fall back: j = lps[j - 1]. This moves j to the next best prefix that could still match, without rescanning any characters. We keep falling back until either we find a match or j reaches 0.
After processing the entire string, lps[n - 1] holds the length of the longest proper prefix that is also a suffix. We return the first lps[n - 1] characters of s.
The failure function is the heart of KMP. Once you have built the LPS array for a pattern, you can use it to search for that pattern in any text in O(n + m) time. The same fallback logic that avoids rescanning during preprocessing also avoids rescanning during the search phase.
Visual walkthrough
Building the failure function (lps array) for s = "ababab"
Step 1: Initialize
Top row: string · Bottom row: lps array
lps[0] is always 0. Start comparing s[1] with s[0].
Step 2: Compare s[1]='b' with s[0]='a'
Top row: string · Bottom row: lps array
'b' does not equal 'a', and j is already 0. Set lps[1] = 0. Move i to 2.
Step 3: Compare s[2]='a' with s[0]='a'
Top row: string · Bottom row: lps array
'a' equals 'a'. Set lps[2] = 1. Increment both i and j.
Step 4: Compare s[3]='b' with s[1]='b'
Top row: string · Bottom row: lps array
'b' equals 'b'. Set lps[3] = 2. Increment both i and j.
Step 5: Compare s[4]='a' with s[2]='a'
Top row: string · Bottom row: lps array
'a' equals 'a'. Set lps[4] = 3. Increment both i and j.
Step 6: Compare s[5]='b' with s[3]='b'
Top row: string · Bottom row: lps array
'b' equals 'b'. Set lps[5] = 4. The final value lps[5] = 4 is the answer length.
Result: lps[5] = 4
Top row: string · Bottom row: lps array
The longest happy prefix has length 4: s[0:4] = "abab".
Notice how j tracks the length of the current matching prefix. When a mismatch occurs, instead of starting over, we jump j back to lps[j - 1]. This reuses the fact that we already know a shorter prefix matches. In the example above, every comparison is a match, so j grows steadily from 0 to 4. The final value lps[5] = 4 tells us the answer is s[0:4] = "abab".
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| KMP failure function | O(n) | O(n) |
Time is O(n). The outer loop runs n - 1 times. The inner while loop may run multiple times on a single iteration, but across the entire algorithm, j can only be incremented n - 1 times total (once per outer iteration). Since each decrement in the while loop reduces j, and j never goes below 0, the total number of decrements across all iterations is at most n - 1. This amortized analysis gives O(n) total.
Space is O(n) for the LPS array. No other data structures are needed.
The building blocks
1. KMP failure function construction
The core loop that builds the LPS array by comparing characters and falling back on mismatch:
lps = [0] * n
j = 0
for i in range(1, n):
while j > 0 and s[i] != s[j]:
j = lps[j - 1]
if s[i] == s[j]:
j += 1
lps[i] = j
This pattern appears in every KMP-based solution. The while loop with j = lps[j - 1] is the key piece. It says: "if the current character does not extend the match, fall back to the next longest prefix that could work." You will use this exact loop for pattern matching, shortest palindrome computation, and repeated substring detection.
2. Prefix-suffix extraction from the LPS array
Once the LPS array is built, extracting the answer is a single lookup:
return s[:lps[-1]]
The last entry in the LPS array gives the length of the longest proper prefix that is also a suffix. This single value is all you need. The same extraction applies in problems like Shortest Palindrome, where lps[-1] tells you how much of the string is already a palindromic prefix.
Edge cases
- Single character:
s = "a". The LPS array is[0]. No proper prefix exists. Return"". - No happy prefix:
s = "abcde". No prefix matches any suffix. Every LPS value is 0. Return"". - Entire string repeats:
s = "aaa". The LPS array is[0, 1, 2]. The longest happy prefix is"aa", which is both a prefix and suffix. - Two-character string:
s = "aa"returns"a".s = "ab"returns"". - Long repeating pattern:
s = "abcabcabc". The LPS array ends with 6, so the answer is"abcabc".
From understanding to recall
The KMP failure function is one of those algorithms that makes perfect sense when you read through it, but trips you up when you try to write it from scratch. The details that matter are precise: initializing j = 0, starting i at 1, the while loop condition j > 0 and s[i] != s[j], and the fallback j = lps[j - 1]. Get any one of those wrong and the algorithm breaks silently.
Spaced repetition turns understanding into recall. You write the LPS construction loop from memory at increasing intervals. After a few reps, the two-pointer dance and the fallback logic become automatic. When an interview problem requires KMP preprocessing, you do not need to rederive the algorithm. You just write it.
Related posts
- Find the Index of the First Occurrence in a String - The classic string matching problem that KMP was designed for
- Shortest Palindrome - Uses the KMP failure function to find the longest palindromic prefix
- Repeated Substring Pattern - Another problem where the failure function reveals string structure
CodeBricks breaks the KMP failure function into its construction and extraction building blocks, then drills them independently with spaced repetition. You type the LPS loop from scratch until the fallback logic is automatic. When a string problem requires prefix-suffix analysis, you do not think about it. You just write it.