String classification - hash

Link: https://www.nowcoder.com/acm/contest/141/E
Source: 牛客网 (Nowcoder)

Problem description

Eddy likes to play with strings, which are sequences of characters. One day, Eddy has played with a string S for a long time and wonders how to make it more enjoyable. Eddy comes up with the following procedure:

1. For each i in [0,|S|-1], let S_i be the substring of S starting from the i-th character to the end, followed by the substring of the first i characters of S. String indices start from 0.
2. Group up all the S_i. S_i and S_j will be in the same group if and only if S_i = S_j.
3. For each group, let L_j be the list of indices i, in non-decreasing order, of the S_i in this group.
4. Sort all the L_j in lexicographical order.
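
The four steps above can be sketched directly as a brute-force reference. This is only an illustration for small inputs, not the intended solution: it builds every rotation explicitly, so it needs O(|S|^2) time and memory and is far too slow for |S| up to 10^6. The function name `groupRotationsBrute` is made up for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical reference helper: follows the procedure literally.
// Suitable only for short strings (quadratic time and memory).
std::vector<std::vector<int>> groupRotationsBrute(const std::string& s) {
    assert(!s.empty());  // the problem guarantees |S| >= 1
    const int n = static_cast<int>(s.size());

    // Steps 1 and 2: build each rotation S_i and group indices by rotation.
    std::map<std::string, std::vector<int>> byRotation;
    for (int i = 0; i < n; ++i) {
        std::string rotated = s.substr(i) + s.substr(0, i);
        byRotation[rotated].push_back(i);  // i increases, so each list stays sorted
    }

    // Step 3: collect the index lists L_j.
    std::vector<std::vector<int>> lists;
    for (const auto& entry : byRotation) lists.push_back(entry.second);

    // Step 4: std::vector compares lexicographically, so a plain sort suffices.
    std::sort(lists.begin(), lists.end());
    return lists;
}
```

For "abab" this yields the lists {0, 2} and {1, 3}, matching the first sample.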

Eddy can't find any efficient way to compute the final result. As one of his best friends, you come to help him compute the answer!

Input description:

Input contains only one line consisting of a string S.

1 ≤ |S| ≤ 10^6

S only contains lowercase English letters.

Output description:

First, output one line containing an integer K indicating the number of lists.
For each following K lines, output each list in lexicographical order.
For each list, output its length followed by the indexes in it separated by a single space.
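
The output format described above can be sketched as a small formatting helper. The name `formatLists` is hypothetical, and the sketch assumes the lists are already grouped and sorted.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: renders already-sorted index lists in the required format.
// First line: number of lists K. Then one line per list: length, then the indices.
std::string formatLists(const std::vector<std::vector<int>>& lists) {
    std::ostringstream out;
    out << lists.size() << '\n';
    for (const auto& list : lists) {
        assert(!list.empty());  // every group contains at least one index
        out << list.size();
        for (int index : list) out << ' ' << index;
        out << '\n';
    }
    return out.str();
}
```

For very large inputs, building the text in a buffer like this and writing it once tends to be faster than many small writes.
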
Example 1

Input

abab

Output

2
2 0 2
2 1 3
Example 2

Input

deadbeef

Output

8
1 0
1 1
1 2
1 3
1 4
1 5
1 6
1 7

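Problem summary: given a string, consider the rotation starting at each position, determine how many distinct rotations there are, group the start positions by rotation, and output them group by group.
Approach:
  Double the string by appending a copy of itself and precompute the prefix hash values in one pass, so that the hash of any substring can be obtained in O(1). Each rotation is a substring of length |S| in the doubled string, so it is enough to group the start positions with equal hash values and output the groups.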
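
A minimal sketch of this idea, written independently of the full solution: it uses 0-based indexing and the made-up name `groupRotationsByHash`, and it reuses the base 19873 with arithmetic modulo 2^64 (unsigned overflow). Equal hashes are assumed to mean equal rotations; collisions are unlikely but not impossible, and hashes modulo 2^64 can be defeated by adversarial inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: groups rotation start indices by polynomial hash.
std::vector<std::vector<int>> groupRotationsByHash(const std::string& s) {
    assert(!s.empty());
    const std::uint64_t base = 19873;  // same base as the solution code
    const std::size_t n = s.size();
    const std::string t = s + s;       // doubled string

    // prefix[i] = hash of t[0..i), power[i] = base^i, both modulo 2^64.
    std::vector<std::uint64_t> prefix(2 * n + 1, 0), power(n + 1, 1);
    for (std::size_t i = 0; i < 2 * n; ++i)
        prefix[i + 1] = prefix[i] * base + static_cast<std::uint64_t>(t[i] - 'a');
    for (std::size_t i = 1; i <= n; ++i)
        power[i] = power[i - 1] * base;

    // Rotation i is t[i..i+n); its hash comes from two prefix values in O(1).
    std::vector<std::pair<std::uint64_t, int>> keyed(n);
    for (std::size_t i = 0; i < n; ++i)
        keyed[i] = {prefix[i + n] - prefix[i] * power[n], static_cast<int>(i)};
    std::sort(keyed.begin(), keyed.end());  // by hash, then by index

    // Runs of equal hashes form the groups; indices inside a run are ascending.
    std::vector<std::vector<int>> groups;
    for (std::size_t i = 0; i < n; ++i) {
        if (i == 0 || keyed[i].first != keyed[i - 1].first) groups.emplace_back();
        groups.back().push_back(keyed[i].second);
    }

    // The groups are disjoint, so lexicographic order is decided by the first index.
    std::sort(groups.begin(), groups.end());
    return groups;
}
```

Sorting the (hash, index) pairs costs O(|S| log |S|) overall, which is comfortable for |S| up to 10^6.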
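Code example:
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <utility>
#include <vector>
using namespace std;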
#define ll unsigned long long
const ll maxn = 1e6+5;
typedef pair<ll, ll>pa;

char s[maxn*2];
ll len, len2;
ll p = 19873;
ll hash_[maxn*2];

ll pp[maxn];
// pp[i] = p^i modulo 2^64 (unsigned overflow wraps around)
void init(){
    pp[0] = 1;
    for(ll i = 1; i <= len; i++){
        pp[i] = pp[i-1]*p;
    }
}

// hash_[i] = polynomial hash of the prefix s[1..i] of the doubled string
void gethash(){
    for(ll i = 1; i <= len2; i++){
        hash_[i] = hash_[i-1]*p+(s[i]-'a');
    }
}
vector<ll>ve[maxn];
pa pre[maxn];
pa arr[maxn];

int main() {
    //freopen("in.txt", "r", stdin);
    //freopen("out.txt", "w", stdout);
    
    scanf("%s", s+1);
    len = strlen(s+1);
    init();
    for(ll i = 1; i <= len; i++) s[i+len] = s[i];
    len2 = len*2;
    
    gethash();
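    ll k = 1;
    // s is 1-indexed: the window s[i-len+1..i] is the rotation starting at 0-based index i-len
    for(ll i = len; i < len2; i++){
        ll num = hash_[i]-hash_[i-len]*pp[len];    // window hash in O(1), modulo 2^64
        pre[k++] = make_pair(num, i-len);
    }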
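    sort(pre+1, pre+1+len);    // equal hashes become adjacent, indices ascending within a run
    k = 1;
    for(ll i = 1; i <= len; i++){
        ve[k].push_back(pre[i].second);
        arr[k] = make_pair(ve[k][0], k);    // key each group by its smallest index
        // extend the group while the next hash matches, staying within pre[1..len]
        while(i < len && pre[i+1].first == pre[i].first){
            ve[k].push_back(pre[i+1].second);
            i++;
        }
        k++;
    }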
    sort(arr+1, arr+k);
    printf("%llu\n", k-1);
    
    for(ll i = 1; i < k; i++){
        ll x = arr[i].second;
        ll cnt = ve[x].size();    // convert size_t so it matches the %llu specifier
        printf("%llu ", cnt);
        for(ll j = 0; j < cnt; j++)
            printf("%llu%c", ve[x][j], j==cnt-1?'\n':' ');
    }
    return 0;
}



Reposted from www.cnblogs.com/ccut-ry/p/9381123.html