kmp c++

The KMP algorithm is a string searching algorithm that efficiently matches patterns in a given text using the concept of a "failure function." Here's a step-by-step explanation of how the KMP algorithm works in C++:

  1. Preprocessing: In this step, the algorithm constructs the failure function, also known as the "lps" (longest proper prefix which is also a suffix) array. The lps array stores the length of the longest proper prefix of a pattern that is also a suffix. This is done using a loop that iteratively compares characters of the pattern and updates the lps array accordingly.

  2. Pattern Matching: Once the lps array is constructed, the algorithm performs the pattern matching. It compares each character of the text with the corresponding character of the pattern. If a mismatch occurs, the algorithm utilizes the lps array to determine the next character to compare in the pattern, rather than starting from the beginning of the pattern. This avoids unnecessary comparisons.

  3. Implementation: To implement the KMP algorithm in C++, you would typically define a function, let's say "kmpSearch," that takes the text and pattern as input. Within this function, you would perform the preprocessing step to construct the lps array. Then, using a loop, you would iterate through the text and pattern, comparing characters and updating the index accordingly.

  4. Output: The function would return the index of the first occurrence of the pattern in the text. If no match is found, it would return a sentinel value such as -1.

Here is a sample code snippet in C++ that demonstrates the implementation of the KMP algorithm:

#include <iostream>
#include <vector>
using namespace std;

void computeLPS(string pattern, vector<int>& lps) {
    int length = 0;
    lps[0] = 0;
    int i = 1;

    while (i < pattern.length()) {
        if (pattern[i] == pattern[length]) {
            length++;
            lps[i] = length;
            i++;
        }
        else {
            if (length != 0) {
                length = lps[length - 1];
            }
            else {
                lps[i] = 0;
                i++;
            }
        }
    }
}

int kmpSearch(string text, string pattern) {
    int n = text.length();
    int m = pattern.length();

    vector<int> lps(m, 0);
    computeLPS(pattern, lps);

    int i = 0;
    int j = 0;

    while (i < n) {
        if (text[i] == pattern[j]) {
            i++;
            j++;
        }

        if (j == m) {
            return i - j;
        }

        else if (i < n && text[i] != pattern[j]) {
            if (j != 0) {
                j = lps[j - 1];
            }
            else {
                i++;
            }
        }
    }

    return -1;
}

int main() {
    string text = "ABABDABACDABABCABAB";
    string pattern = "ABABCABAB";

    int index = kmpSearch(text, pattern);

    if (index != -1) {
        cout << "Pattern found at index " << index << endl;
    }
    else {
        cout << "Pattern not found!" << endl;
    }

    return 0;
}

This code snippet demonstrates the KMP algorithm in C++, where it searches for the pattern "ABABCABAB" in the text "ABABDABACDABABCABAB" and outputs the index where the pattern is found.