KMP Algorithm

https://www.geeksforgeeks.org/searching-for-patterns-set-2-kmp-algorithm/

For string pattern matching, brutal force way needs to match the start character at every possible index i. The worst case complexity of this Naive algorithm is O(m(n-m+1))

*m is the length of the pattern

KMP (Knuth Morris Pratt) Pattern Searching

It is O(n) space and O(n) worst case time complexity. The basic idea behind KMP’s algorithm is: whenever we detect a mismatch (after some matches), we already know some of the characters in the text of next window. We take advantage of this information to avoid matching the characters that we know will anyway match.

In the next step, we compare next window of txt with pat.
txt = "AAAAABAAABA" 
pat =  "AAAA" [Pattern shifted one position]
This is where KMP does optimization over Naive. In this 
second window, we only compare fourth A of pattern
with fourth character of current window of text to decide 
whether current window matches or not. Since we know 
first three characters will anyway match, we skipped 
matching first three characters.

To do this, we need the preprocessing to find out:

lps[i] = the longest proper prefix of pat[0..i] 
              which is also a suffix of pat[0..i].

Searching Algorithm:

Preprocessing Algorithm:

We need the lookup of lps[len-1] whenever there is a mismatch because of several scenarios:

AAACAAAA

ACDBACDE ACDBACDB

pat[] = "AAACAAAA"

len = 3, i  = 7.
Since pat[len] and pat[i] do not match and len > 0,
set len = lps[len-1] = lps[2] = 2

len = 2, i  = 7.
Since pat[len] and pat[i] match, do len++, 
store it in lps[i] and do i++.
len = 3, lps[7] = 3, i = 8
public int[] getKMPLookupTable(String s) {
    int[] lookup = new int[s.length()];

    lookup[0] = 0;
    int i=1, len = 0;
    while(i<s.length()) {
        if(s.charAt(len)==s.charAt(i)) {
            len++; lookup[i] = len;
            i++;
        } else if(len!=0) {
            len = lookup[len-1];
        } else {
            i++;
        }
    }

    return lookup;
}

Applications

Find Palindrome

We can use KMP proper prefix-suffix lookup to find the longest palindrome starting at index 0 of a string.

214. Shortest Palindrome

c a t a c b # b c a t a c

0 0 0 0 1 0 0 0 1 2 3 4 5

results matching ""

    No results matching ""