POJ - 1743 Musical Theme (Suffix Automaton)

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/Cymbals/article/details/82785614

A musical melody is represented as a sequence of N (1 <= N <= 20000) notes that are integers in the range 1..88, each representing a key on the piano. It is unfortunate but true that this representation of melodies ignores the notion of musical timing; but, this programming task is about notes and not timings.
Many composers structure their music around a repeating "theme", which, being a subsequence of an entire melody, is a sequence of integers in our representation. A subsequence of a melody is a theme if it:

  1. is at least five notes long
  2. appears (potentially transposed – see below) again somewhere else in the piece of music
  3. is disjoint from (i.e., non-overlapping with) at least one of its other appearance(s)

Transposed means that a constant positive or negative value is added to every note value in the theme subsequence.
Given a melody, compute the length (number of notes) of the longest theme.
One second time limit for this problem’s solutions!
Input
The input contains several test cases. The first line of each test case contains the integer N. The following N integers represent the sequence of notes.
The last test case is followed by one zero.
Output
For each test case, the output file should contain a single line with a single integer that represents the length of the longest theme. If there are no themes, output 0.

Sample Input
30
25 27 30 34 39 45 52 60 69 79 69 60 52 45 39 34 30 26 22 18
82 78 74 70 66 67 64 60 65 80
0
Sample Output
5

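I have been studying the SAM (suffix automaton) for a while now (a month?). I went from being completely lost, to a shallow understanding, to being able to draw the SAM of a string almost without thinking. It looks as if I understand it fairly well, yet every time I meet a real problem it is as hard as ever and leaves me feeling that I never understood it at all.

After I first understood the SAM diagrams, I happily published a post titled "I finally learned the suffix automaton". Now I want to cover my face every time I see it. If someone claims to have mastered the SAM, they may well be an expert, but more likely they are a greenhorn like me who has only just started.

Enough digression; let me write up this problem properly.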

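The problem first defines a new notion of equality for sequences: if adding one constant to every term of sequence A yields sequence B, then A = B. For example, 1, 2, 3, 4 = 11, 12, 13, 14.

Sorry, this is already where I got stuck. I had to ask 权哥 (Brother Quan) before I learned that there is such a thing as a difference array.

Preprocess the original sequence by replacing each term with the difference between the next term and itself, s[i] = s[i + 1] - s[i] + 88, which leaves n - 1 values. After that, two segments can be compared for equality directly.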

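Why does this work?

Let A = 1, 2, 3, 4. A sequence equal to it has the form B = 1 + d, 2 + d, 3 + d, 4 + d.

When we take differences, for example (2 + d) - (1 + d), the d cancels, so A and B have identical difference sequences. The + 88 removes negative values: notes lie in 1..88, so raw differences lie in -87..87 and the shifted values lie in 1..175, which lets them be used as array indices.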
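The cancellation can be checked directly with a small sketch. The helper name `shiftedDiffs` is made up for this illustration and does not appear in the solution further down; it assumes every note lies in 1..88, as the problem statement promises.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: turn a melody into its shifted difference sequence.
// Assumes notes lie in 1..88, so every result lies in 1..175.
std::vector<int> shiftedDiffs(const std::vector<int>& notes) {
    for (int v : notes) {
        assert(v >= 1 && v <= 88);  // precondition taken from the problem statement
    }
    std::vector<int> d;
    for (std::size_t i = 0; i + 1 < notes.size(); ++i) {
        d.push_back(notes[i + 1] - notes[i] + 88);
    }
    return d;
}
```

Calling it on 1, 2, 3, 4 and on 11, 12, 13, 14 should give the same three values, which is the reason transposed themes become plain equal substrings after preprocessing.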

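With the sequence replaced by its difference sequence, we can happily build the automaton.

This is in fact the classic SAM approach to the "longest repeated non-overlapping substring" problem.

"Longest" is easy: just keep track of the maximum. What really needs solving is "non-overlapping" and "repeated" (the problem requires at least two occurrences; the exact count does not matter, we only need to know whether a repeat exists).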

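As we know, each SAM state represents a class of substrings that share the same right set (the set of end positions, also called endpos). The right set already encodes repetition: the number of occurrences of a substring equals the size of the right set it belongs to.

So the "repeated" requirement can certainly be handled through the right set.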
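To make the right set concrete, here is a brute-force sketch that lists the end positions of a pattern by direct scanning. It is only an illustration of the definition, not how the automaton computes anything; `endPositions` is a hypothetical name and positions are 1-based to match the step values used later.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical brute-force helper: 1-based end positions of every
// occurrence of `pat` in `text`, i.e. the right set of `pat`.
std::vector<int> endPositions(const std::vector<int>& text,
                              const std::vector<int>& pat) {
    assert(!pat.empty());  // the empty pattern is not considered here
    std::vector<int> ends;
    for (std::size_t start = 0; start + pat.size() <= text.size(); ++start) {
        bool match = true;
        for (std::size_t k = 0; k < pat.size() && match; ++k) {
            match = (text[start + k] == pat[k]);
        }
        if (match) {
            ends.push_back(static_cast<int>(start + pat.size()));
        }
    }
    return ends;
}
```

For the text 89, 89, 89, 89 and the pattern 89, 89 this should return the end positions 2, 3, 4: three occurrences, so a right set of size three.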

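"Non-overlapping" is the other issue, and it is not easy: it is hard to see directly what in a SAM corresponds to overlap between strings. There is something, though: the step array, which grows by 1 every time a new state is appended (it goes by several names; I will call it step here). For the state created when the i-th character is appended, step equals i, so it is exactly an end position.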

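The idea is as follows. A well-known property of the suffix automaton is that the right set of a state is the union of the right sets of its children in the parent tree, plus the state's own end position if it was created for a prefix rather than as a clone (if this is new to you, see the SAM tutorial on hihocoder). Building the SAM once gives only limited information, but we can propagate bottom-up to recover more about each right set.

Here it is enough to compute, for each state, the minimum and the maximum step in its right set, that is, the smallest and the largest end position.

Take the end position of the substring's last occurrence in the string (the farthest one; nearer ones need not be considered) and subtract the end position of its first occurrence. If the result is greater than the substring's length, the two occurrences do not overlap. The comparison is strict because a substring of L differences spans L + 1 notes, so the two end positions must be at least L + 1 apart.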
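The criterion above can be written as a tiny sketch. Both function names are invented for this illustration; `minEnd` and `maxEnd` stand for the smallest and largest end positions in a state's right set, and `maxLen` for the state's step value. The strict inequality is encoded as the `- 1`.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: the longest difference-substring length L usable
// for one state, under the condition maxEnd - minEnd > L and L <= maxLen.
int longestNonOverlapping(int minEnd, int maxEnd, int maxLen) {
    assert(minEnd <= maxEnd && maxLen >= 0);
    return std::max(0, std::min(maxLen, maxEnd - minEnd - 1));
}

// Hypothetical helper: convert the best difference length into the answer.
// L differences correspond to L + 1 notes, and a theme needs at least 5 notes.
int themeLength(int bestDiffLen) {
    return bestDiffLen >= 4 ? bestDiffLen + 1 : 0;
}
```

In the sample, the four differences of the theme 34 30 26 22 18 end at positions 19 and 24 of the difference sequence, so `longestNonOverlapping(19, 24, 4)` gives 4 and `themeLength(4)` gives the expected 5.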

Accepted (AC) code:

#include<algorithm>
#include<cstring>
#include<cstdio>
using namespace std;

const int maxn = 20005;
int n, s[maxn];

struct Sam {
	int next[maxn << 1][176]; // shifted differences range over 1..175
	int link[maxn << 1], step[maxn << 1];
	int a[maxn << 1], b[maxn << 1];
	int maxx[maxn << 1], minn[maxn << 1];
	int sz, last;

	void init() {
		memset(next, 0, sizeof(next));
		memset(a, 0, sizeof(a));
		memset(b, 0, sizeof(b));
		memset(maxx, 0, sizeof(maxx));
		memset(minn, 127, sizeof(minn)); // every int becomes 0x7f7f7f7f, a large sentinel
		sz = last = 1;
	}

	void add(int c) {
		int p = last;
		int np = ++sz;
		last = np;

		step[np] = step[p] + 1;
		maxx[np] = minn[np] = step[np];
		while(!next[p][c] && p) {
			next[p][c] = np;
			p = link[p];
		}

		if(p == 0) {
			link[np] = 1;
		} else {
			int q = next[p][c];
			if(step[p] + 1 == step[q]) {
				link[np] = q;
			} else {
				int clone = ++sz;
				memcpy(next[clone], next[q], sizeof(next[q]));
				step[clone] = step[p] + 1;
				link[clone] = link[q];
				link[q] = link[np] = clone;
				while(next[p][c] == q && p) {
					next[p][c] = clone;
					p = link[p];
				}
			}
		}
	}

	void build() {
		init();
		for(int i = 1; i <= n; i++) {
			add(s[i]);
		}
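		// Counting sort of the states by step: b[1..sz] lists them from shortest to longest.
		for(int i = 1; i <= sz; i++) {
			a[step[i]]++;
		}
		for(int i = 1; i <= n; i++) {
			a[i] += a[i - 1];
		}
		for(int i = 1; i <= sz; i++) {
			b[a[step[i]]--] = i;
		}
		// Visit states from longest to shortest, pushing min/max end positions up to the link parent.
		for(int i = sz; i > 1; i--) {
			int e = b[i];
			maxx[link[e]] = max(maxx[link[e]], maxx[e]);
			minn[link[e]] = min(minn[link[e]], minn[e]);
		}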
	}

	void solve() {
		build();
		int ans = 0;
		for(int i = 2; i <= sz; i++) {
			// L differences span L + 1 notes, so the end positions must differ by more than L.
			ans = max(ans, min(maxx[i] - minn[i] - 1, step[i]));
		}
		printf("%d\n", ans < 4 ? 0 : ans + 1);
	}

} sam;

int main() {
	while(~scanf("%d", &n) && n) {
		for(int i = 1; i <= n; i++) {
			scanf("%d", &s[i]);
		}
		for (int i = 1; i < n; i ++) {
			s[i] = s[i + 1] - s[i] + 88;
		}
		n--;
		sam.solve();
	}
	return 0;
}
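For cross-checking the automaton on small inputs, a quadratic brute force is handy. The following is a sketch that is not part of the accepted solution: `longestThemeBruteForce` is a hypothetical name, and it uses a plain longest-common-suffix DP over the difference sequence instead of a SAM. It is far too slow for N = 20000 under the one-second limit, so it is only meant for testing.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical O(n^2) reference: longest theme length for a small melody.
int longestThemeBruteForce(const std::vector<int>& notes) {
    for (int v : notes) {
        assert(v >= 1 && v <= 88);  // precondition from the problem statement
    }
    int n = static_cast<int>(notes.size());
    if (n < 2) return 0;
    int m = n - 1;  // number of differences
    std::vector<int> d(m);
    for (int i = 0; i < m; ++i) d[i] = notes[i + 1] - notes[i];

    // prev[j]: longest common suffix of d[1..i-1] and d[1..j-1] (1-based ends).
    // cur[j]:  the same for ends i and j.
    std::vector<int> prev(m + 1, 0), cur(m + 1, 0);
    int best = 0;
    for (int i = 1; i <= m; ++i) {
        for (int j = i + 1; j <= m; ++j) {
            cur[j] = (d[i - 1] == d[j - 1]) ? prev[j - 1] + 1 : 0;
            // Cap by j - i - 1 so the two note ranges stay disjoint.
            best = std::max(best, std::min(cur[j], j - i - 1));
        }
        std::swap(prev, cur);
    }
    return best >= 4 ? best + 1 : 0;
}
```

A useful edge case is the melody 1, 2, ..., 9: the two candidate themes 1..5 and 5..9 share a note, so the answer should be 0, while 1, 2, ..., 10 should give 5.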

