Horspool (String Matching)

Description of Horspool

Assumation

 text string: the string where we want to locate the pattern string.

n: the length of the text string.

m: the length of the pattern string.

Working function

  • If the last character of the pattern string is matched to the i_{th} character of text string
    • If every character is matched to the ({i-m+1})_{th} ~ i_{th} character of text stringreturn i-m+1
    • Else
      • If one of the1_{th} ~ (m-1)_{th} character of pattern string is matched to the i_{th} character of text string: i=i+k (the k is the distance between the nearest matched character of pattern string and the i_{th}character of text string)
      • Else: i=i+m 
  • Else
    •  If one of the1_{th} ~ (m-1)_{th} character of pattern string is matched to the i_{th} character of text string: i=i+k (the k is the distance between the nearest matched character of pattern string and the i_{th}character of text string)

    • Else: i=i+m

For more convience to get the value of k, we can define a table, which is a map in c++.

For the pattern string BARBER, the content of table is as follow.

Character A B E R Other character
k 4 2 1 3 6

Implementation Code

// Horspool
#include <bits/stdc++.h>
using namespace std;
int Horspool(string text,string pattern){
    int text_len=text.size(),pattern_len=pattern.size();
    // Moving table: record the moving distance corresponding the letter with index in the pattern
    map<int,int>table;
    for(int i=0;i<256;i++) table[i]=pattern_len;
    // Calculate the moving distance and fill the table in
    for(int i=pattern_len-2;i>=0;i--){
        if(table[pattern[i]]==pattern_len){
            table[pattern[i]]=pattern_len-i-1;
        }
    }
    // Matching pattern string and text string
    int tp=pattern_len-1; // needle pointing to text string
    bool ismatch;
    while(tp<text_len){
        ismatch=true;
        for(int i=0;i<pattern_len;i++){
            if(pattern[i]!=text[tp-pattern_len+i+1]) ismatch=false;
        }
        if(ismatch)
            return tp-pattern_len+1;
        else
            tp+=table[text[tp]];
    }
    return -1;
}
int main(){
    // Text string
    string text="_BHALLOBHALLOHELLO";
    // Pattern string
    string pattern="OHELLO";
    cout<<Horspool(text,pattern)<<endl;
    system("pause");
    return 0;
}

猜你喜欢

转载自blog.csdn.net/qq_55126913/article/details/130504634
今日推荐