AC automaton: study notes

I learned the AC automaton (Aho-Corasick automaton) yesterday and have a few impressions, so I am writing them down to have something to refer to when I run into the same problem later.
The AC automaton is suited to multi-pattern string matching. Its main feature is that a single pass over the text (the parent string) yields all the answers. It is an offline algorithm, since every pattern has to be inserted before matching starts. The core of the algorithm is a transformation of the trie, so I think of the AC automaton as a data structure: it adds edges to the trie based on the prefix/suffix relationships between the pattern strings, so that matching different patterns does not require traversing the text again. It shares the trie's weakness: with large data it can easily exceed the memory limit.
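To put a rough number on the memory concern, here is a small sketch that assumes the table layout used in the complete code later in this post (one int per node per letter). The helper name tableBytes is my own and only serves the illustration.

```cpp
#include <cstddef>

// Bytes used by a static transition table with one int per (node, letter) pair.
// tableBytes is a hypothetical helper for illustration, not part of the original code.
constexpr std::size_t tableBytes(std::size_t nodes, std::size_t alphabet) {
    return nodes * alphabet * sizeof(int);
}

// With 1000010 nodes and 26 letters, and assuming a 4-byte int,
// the table alone takes 104,001,040 bytes (roughly 99 MiB).
static_assert(tableBytes(2, 26) == 52 * sizeof(int), "two nodes hold 52 ints");
```

The cost depends on the declared bound, not on how many nodes the patterns really create, which is why a generous bound can exceed a memory limit even on small inputs.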
The above is my understanding of the idea behind the algorithm; below is my understanding of the code.

void insert(string a)
{
	int u=0;
	for(int i=0;i<a.size();i++)
	{
		int v=a[i]-'a';
		if(!tree[u][v])
		tree[u][v]=++cnt;
		u=tree[u][v];
	}
	book[u]++;
}

This is plain trie insertion, so there is not much to say; the only extra is book[u], which counts how many patterns end at node u.
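A small sketch of how shared prefixes reuse nodes, using the same array layout as insert() above. The bound MAXN, the names trie, endCount, nodeCount, insertWord and findNode, and the added checks are my own assumptions for the demo.

```cpp
#include <cassert>
#include <string>

const int MAXN = 64;        // hypothetical small bound, enough for the demo
int trie[MAXN][26];         // globals are zero-initialized; 0 means "no edge"
int endCount[MAXN];         // how many inserted words end at each node
int nodeCount = 0;          // node 0 is the root

// Same logic as insert() above, with bounds and alphabet checks added.
void insertWord(const std::string& w) {
    int u = 0;
    for (char ch : w) {
        assert(ch >= 'a' && ch <= 'z');   // only lowercase letters are supported
        int v = ch - 'a';
        if (!trie[u][v]) {
            assert(nodeCount + 1 < MAXN); // the static table must not overflow
            trie[u][v] = ++nodeCount;
        }
        u = trie[u][v];
    }
    ++endCount[u];
}

// Follows existing edges only; returns 0 (the root) if the path is missing.
int findNode(const std::string& w) {
    int u = 0;
    for (char ch : w) {
        u = trie[u][ch - 'a'];
        if (!u) return 0;
    }
    return u;
}
```

Inserting "he", "her" and "his" creates five nodes besides the root (h, he, her, hi, his), because the common prefix "h" is stored once. Inserting "he" a second time creates no node and only raises its end counter.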

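void getfail()
{
	queue<int> que;
	// depth-1 nodes keep fail = 0 (the root)
	for(int i=0;i<26;i++)
		if(tree[0][i])
			que.push(tree[0][i]);
	// BFS, so fail[now] is already final when now is processed
	while(!que.empty())
	{
		int now=que.front();
		que.pop();
		for(int i=0;i<26;i++)
		{
			if(tree[now][i])
			{
				// real child: its fail pointer is where fail[now] goes on the same letter
				fail[tree[now][i]]=tree[fail[now]][i];
				que.push(tree[now][i]);
			}
			else
				// missing child: borrow the transition from fail[now]
				tree[now][i]=tree[fail[now]][i];
		}
	}
}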

This adds the extra edges on top of the trie, level by level. Because the arrays are initialized to 0, every edge that does not exist points to the root node. Personally, I think the essence of this operation is

tree[now][i]=tree[fail[now]][i];

If the fail array stores virtual edges, then this operation creates real edges, paving the way for state transitions. This step also turns the trie into a "trie graph", which may contain cycles.
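To see both kinds of edges on a concrete case, here is a small sketch with the patterns "he", "she", "his" and "hers" (my own choice of example). It uses a vector-based struct instead of the fixed arrays above; the names Automaton, newNode, build and walk are hypothetical, and build() follows the same steps as getfail().

```cpp
#include <array>
#include <cassert>
#include <queue>
#include <string>
#include <vector>

struct Automaton {
    std::vector<std::array<int, 26>> next; // trie edges, completed by build()
    std::vector<int> fail;                 // fail pointers (the virtual edges)

    Automaton() { newNode(); }             // node 0 is the root

    int newNode() {
        next.push_back(std::array<int, 26>{}); // all 26 entries start at 0
        fail.push_back(0);
        return static_cast<int>(next.size()) - 1;
    }

    // Inserts a lowercase pattern and returns the node where it ends.
    int insert(const std::string& s) {
        int u = 0;
        for (char ch : s) {
            assert(ch >= 'a' && ch <= 'z');
            int v = ch - 'a';
            if (!next[u][v]) {
                int id = newNode();        // may reallocate, so index afterwards
                next[u][v] = id;
            }
            u = next[u][v];
        }
        return u;
    }

    // Same BFS as getfail(): set fail pointers and fill in missing edges.
    void build() {
        std::queue<int> que;
        for (int c = 0; c < 26; ++c)
            if (next[0][c]) que.push(next[0][c]);
        while (!que.empty()) {
            int u = que.front();
            que.pop();
            for (int c = 0; c < 26; ++c) {
                int v = next[u][c];
                if (v) {
                    fail[v] = next[fail[u]][c];
                    que.push(v);
                } else {
                    next[u][c] = next[fail[u]][c];
                }
            }
        }
    }

    // State reached from the root after reading s.
    int walk(const std::string& s) const {
        int u = 0;
        for (char ch : s) u = next[u][ch - 'a'];
        return u;
    }
};
```

After build(), the fail pointer of the "she" node is the "he" node, since "he" is the longest proper suffix of "she" that is also a prefix of some pattern. The completed table also sends the "she" node to the "her" node on the letter r, an edge the plain trie never had, and any letter with no match at the root leads back to the root.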

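int getans(string a)
{
	int ans=0;
	int now=0;
	for(int i=0;i<a.size();i++)
	{
		now=tree[now][a[i]-'a']; // one table lookup per character of the text
		// walk the fail chain to collect every pattern ending here;
		// book[j]=-1 marks a node as counted, so each chain is walked only once
		for(int j=now;j&&book[j]!=-1;j=fail[j])
		{
			ans+=book[j];
			book[j]=-1;
		}
	}
	return ans;
}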

The query operation: now is a pointer to the current position in the automaton. This function makes it clear that the tree array is used for state transitions, while the fail array is used only to collect answers and does not affect the current state.
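As I read it, getans returns the number of inserted patterns (duplicates counted separately) that occur at least once in the text, because every node is counted once and then marked with -1. A brute-force reference like the sketch below can be used to check that on small inputs; the name countPresentPatterns is hypothetical, and non-empty patterns are assumed.

```cpp
#include <string>
#include <vector>

// Reference answer for small inputs: how many of the given patterns
// occur at least once in text. Duplicate patterns are counted separately,
// matching book[u]++ in insert().
int countPresentPatterns(const std::vector<std::string>& patterns,
                         const std::string& text) {
    int ans = 0;
    for (const std::string& p : patterns) {
        if (p.empty()) continue;  // getans never counts the root, so skip empty patterns
        if (text.find(p) != std::string::npos) ++ans;
    }
    return ans;
}
```

For the patterns "he", "she", "his", "hers" and the text "ushers", three patterns are present ("he", "she", "hers"). This check rescans the text once per pattern, which is exactly the repeated traversal the automaton avoids.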
Here is the complete code. Note that AC automaton problems usually have very tight time limits, so it is best not to use string and cin/cout; even with ios::sync_with_stdio(false) added, the code may still time out (annoying).

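#include <iostream>
#include <cstring>
#include <queue>
#include <string>
#define MAX 1000010
using namespace std;
int tree[MAX][26]; // trie; becomes the full transition table after getfail()
int book[MAX];     // marker array: number of patterns ending at each node
int fail[MAX];     // added (virtual) edges: the fail pointers
int cnt;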
void init()
{
	cnt=0;
	memset(fail,0,sizeof(fail));
	memset(book,0,sizeof(book));
	memset(tree,0,sizeof(tree));
}

void insert(string a)
{
	int u=0;
	for(int i=0;i<a.size();i++)
	{
		int v=a[i]-'a';
		if(!tree[u][v])
		tree[u][v]=++cnt;
		u=tree[u][v];
	}
	book[u]++;
}

void getfail()
{
	queue<int> que;
	for(int i=0;i<26;i++)
	if(tree[0][i])
	que.push(tree[0][i]);
	while(!que.empty())
	{
		int now=que.front();
		que.pop();
		for(int i=0;i<26;i++)
		{
			if(tree[now][i])
			{
				fail[tree[now][i]]=tree[fail[now]][i];
				que.push(tree[now][i]);
			}
			else
			tree[now][i]=tree[fail[now]][i];
		}
	}
}

int getans(string a)
{
	int ans=0;
	int now=0;
	for(int i=0;i<a.size();i++)
	{
		now=tree[now][a[i]-'a'];
		for(int j=now;j&&book[j]!=-1;j=fail[j])
		{
			ans+=book[j];
			book[j]=-1;
		}
	}
	return ans;
}


int main()
{
	init();
	string a;
	// read patterns until the sentinel "end" (or until input runs out)
	while(cin>>a&&a!="end")
		insert(a);
	cin>>a; // the text to search
	fail[0]=0;
	getfail();
	cout<<getans(a);
	return 0;
}

Making good use of a data structure is a very important skill, and it requires understanding how the structure is built and what each of its members does.
There is still a long way to go. Keep it up!

Origin blog.csdn.net/Zhang_sir00/article/details/100089755