A Brief Look at Union-Find (HDU1232 畅通工程 / POJ1611 The Suspects / POJ1988 Cube Stacking)

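Union-find (并查集) is known in English as the Disjoint Set structure, that is, a collection of sets that do not overlap.

N objects numbered 1...N are partitioned into disjoint sets, and in each set one element is chosen to represent the set it belongs to.

Two operations are common:

  • Merge two sets
  • Find which set an element belongs to

Hence the Chinese name 并查集, literally "merge-find set".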

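For example, take an array whose first row is the index i and whose second row is set(i).

Elements i with the same representative form one set. Each set is stored as a rooted tree: set[i] is the parent of i, and the root, which satisfies set[r] == r, represents the whole set: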

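// set[i] is the parent of i; the size here is only illustrative
int set[1010];

// find the root (representative) of the set containing x
int find2(int x)
{
   int r = x;
   while (set[r] != r)
      r = set[r];
   return r;
}
// link a and b; both must be roots (results of find2),
// and the smaller index becomes the new root
void merge2(int a, int b)
{
    if (a<b)
       set[b] = a;
    else
       set[a] = b;
}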
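
To see the two routines above in action, here is a minimal C++ sketch. The array name parent (used instead of set to avoid any clash with std::set), the capacity MAXN and the helpers init and unite are assumptions made for illustration; find2 and merge2 follow the routines above, with merge2 only ever called on roots.

```cpp
const int MAXN = 16;      // illustrative capacity, not from the article
int parent[MAXN];         // parent[i] == i means i is the root of its tree

// Put every element 1..n into its own one-element set.
void init(int n)
{
    for (int i = 1; i <= n; i++)
        parent[i] = i;
}

// Walk up the parent links until the root is reached.
int find2(int x)
{
    int r = x;
    while (parent[r] != r)
        r = parent[r];
    return r;
}

// Link two roots; the smaller index becomes the new root.
void merge2(int a, int b)
{
    if (a < b)
        parent[b] = a;
    else
        parent[a] = b;
}

// Merge the sets containing x and y (x and y need not be roots).
void unite(int x, int y)
{
    int rx = find2(x), ry = find2(y);
    if (rx != ry)
        merge2(rx, ry);
}
```

After init(5), unite(1, 3) and unite(3, 5), the calls find2(1), find2(3) and find2(5) all return 1, while 2 and 4 stay in sets of their own.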

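Let's look at a problem of this kind:

HDU1232 畅通工程 (Unimpeded Traffic Project)

A province surveyed its town traffic and obtained a table of existing roads, listing the towns that each road directly connects. The goal of the provincial government's "Unimpeded Traffic Project" is to make every pair of towns in the province reachable from each other (a direct road is not required; being reachable indirectly through other roads is enough). What is the minimum number of roads that still need to be built?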

Input

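The input contains several test cases. The first line of each test case gives two positive integers: the number of towns N ( < 1000 ) and the number of roads M. Each of the following M lines describes one road as a pair of positive integers, the numbers of the two towns that the road directly connects. For simplicity, towns are numbered from 1 to N.
Note: there may be several roads between the same two towns, that is,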
3 3
1 2
1 2
2 1
is also valid input.
The input ends when N is 0; that case is not processed.

Output

For each test case, output on one line the minimum number of roads that still need to be built.

Sample Input

4 2
1 3
4 3
3 3
1 2
1 3
2 3
5 2
1 2
3 5
999 0
0

Sample Output

1
0
2
998

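This problem illustrates what union-find is for. Merge the towns joined by each road; every group of connected towns can then be treated as a single town. The answer is the number of groups left at the end minus 1, because joining k separate groups takes at least k-1 new roads. How do we count the groups? Define an array road[], where road[i] is the parent of town i; the number of indices with road[i]==i (the roots) is the number of groups.

The code is as follows: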

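#include<iostream>
#define N (1000+10)
using namespace std;
int n,m;
int road[N];	// road[i] is the parent of town i; a root has road[i]==i
// every town starts as its own set
void init()
{
	for(int i=1;i<=n;i++)
		road[i]=i;
}
// follow the parent links up to the root of x's set
int find(int x)
{
	int r=x;
	while(road[r]!=r)
		r=road[r];
	return r;
}
// merge the sets containing x and y
void U(int x,int y)
{
	int fx=find(x),fy=find(y);
	if(fx!=fy)
		road[fx]=fy;
}
int main()
{
	while(cin>>n&&n)
	{
		init();
		cin>>m;
		while(m--)
		{
			int x,y;
			cin>>x>>y;
			U(x,y);
		}
		// count the roots (one per connected group); starting at -1 subtracts the 1
		int sum=-1;
		for(int i=1;i<=n;i++)
			if(road[i]==i)
				sum++;
		cout<<sum<<endl;
	}
	return 0;
}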
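
The counting step can also be folded into the merge itself. The sketch below is an assumption, not the author's code: the names roadsNeeded and findRoot and the vector-of-pairs input are made up for illustration. It starts with n components and decrements the count on every successful merge, so the answer is the final count minus 1.

```cpp
#include <utility>
#include <vector>

// Follow the parent links up to the root of x's set.
static int findRoot(const std::vector<int>& parent, int x)
{
    while (parent[x] != x)
        x = parent[x];
    return x;
}

// Minimum number of extra roads so that towns 1..n are all connected.
int roadsNeeded(int n, const std::vector<std::pair<int, int> >& roads)
{
    std::vector<int> parent(n + 1);
    for (int i = 1; i <= n; i++)
        parent[i] = i;

    int components = n;               // every town starts on its own
    for (const std::pair<int, int>& road : roads) {
        int fx = findRoot(parent, road.first);
        int fy = findRoot(parent, road.second);
        if (fx != fy) {               // this road joins two different components
            parent[fx] = fy;
            components--;
        }
    }
    return components - 1;            // k components need k - 1 new roads
}
```

On the four sample cases this returns 1, 0, 2 and 998, matching the sample output. Duplicate roads such as the repeated "1 2" are harmless because the second merge finds both towns already under the same root.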

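The method above has a drawback: if the rooted tree has only one leaf, that is, it degenerates into a chain, then walking from the leaf to the root takes as many iterations as there are nodes, which wastes time. Union-find therefore has an improved version, path compression: every node visited by find is made to point directly at the root, which turns the chain into a tree consisting of one root with all other nodes as its leaves.

Step 1: find the root.

Step 2: update every node on the search path so that it points directly at the root.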

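int find(int x)
{
    int r=x;
    while(set[r]!=r)    // step 1: walk up to the root
        r=set[r];
    int i=x;
    while(i!=r)         // step 2: point every node on the path at the root
    {
        int j=set[i];
        set[i]=r;
        i=j;
    }
    return r;
}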

The same thing can be done recursively:

int find(int x)
{
	if(x==set[x]) return x;
	return set[x]=find(set[x]);
}
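
The effect of path compression is easiest to see on the worst case described above, a chain. In the following sketch the array name parent, the capacity MAXN and the helper makeChain are assumptions for illustration; find is the recursive version just shown, applied to parent instead of set.

```cpp
const int MAXN = 16;   // illustrative capacity, not from the article
int parent[MAXN];

// Build the chain 1 <- 2 <- 3 <- ... <- n, where 1 is the root.
void makeChain(int n)
{
    parent[1] = 1;
    for (int i = 2; i <= n; i++)
        parent[i] = i - 1;
}

// Recursive find with path compression.
int find(int x)
{
    if (x == parent[x]) return x;
    return parent[x] = find(parent[x]);
}
```

After makeChain(5), a single call find(5) returns 1 and leaves parent[2] through parent[5] all equal to 1, so a later lookup on any of these nodes takes a single step.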

Example problem:

POJ1611 The Suspects

Severe acute respiratory syndrome (SARS), an atypical pneumonia of unknown aetiology, was recognized as a global threat in mid-March 2003. To minimize transmission to others, the best strategy is to separate the suspects from others.
In the Not-Spreading-Your-Sickness University (NSYSU), there are many student groups. Students in the same group intercommunicate with each other frequently, and a student may join several groups. To prevent the possible transmissions of SARS, the NSYSU collects the member lists of all student groups, and makes the following rule in their standard operation procedure (SOP).
Once a member in a group is a suspect, all members in the group are suspects.
However, they find that it is not easy to identify all the suspects when a student is recognized as a suspect. Your job is to write a program which finds all the suspects.

Input

The input file contains several cases. Each test case begins with two integers n and m in a line, where n is the number of students, and m is the number of groups. You may assume that 0 < n <= 30000 and 0 <= m <= 500. Every student is numbered by a unique integer between 0 and n−1, and initially student 0 is recognized as a suspect in all the cases. This line is followed by m member lists of the groups, one line per group. Each line begins with an integer k by itself representing the number of members in the group. Following the number of members, there are k integers representing the students in this group. All the integers in a line are separated by at least one space.
A case with n = 0 and m = 0 indicates the end of the input, and need not be processed.

Output

For each case, output the number of suspects in one line.

Sample Input

100 4
2 1 2
5 10 13 11 12 14
2 0 1
2 99 2
200 2
1 5
5 1 2 3 4 5
1 0
0 0

Sample Output

4
1
1

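Problem summary: the first line gives N and M. There are N people, each with a unique number. M lines follow, one per group: the first number K is the size of the group, followed by the numbers of its K members. Person 0 is a SARS suspect, and everyone who shares a group with a suspect, directly or through a chain of groups, is also a suspect. Count the suspects. When N is 1 there is only one person, number 0, who is the only suspect.

Solution: use union-find to merge the members of each group, and keep an array that records how many people each set contains, indexed by the root of the set.

The code is as follows: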

#include<iostream>
#define N 33000
using namespace std;
int num[N],per[N];
int a[N];
int n;
void init(){
    for(int i = 0; i < n ; ++i){
        per[i] = i;
        num[i] = 1;
    }
    return ;
}
int find(int x)
{
	if(x==per[x]) return x;
	return per[x]=find(per[x]);
}
void U(int x,int y)
{
	int fx=find(x),fy=find(y);
	if(fx!=fy)
	{
		per[fx]=fy;
		num[fy]+=num[fx];
	}
}
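int main()
{
	int m;
	// stop at end of input or at the terminating "0 0" line
	while(cin>>n>>m && (n!=0||m!=0))
	{
		init();
		while(m--)
		{
			int k;
			cin>>k;
			int i;
			for(i=0;i<k;i++)
				cin>>a[i];
			// chain consecutive members so the whole group ends up in one set
			for(i=0;i<k-1;i++)
				U(a[i],a[i+1]);
		}
		// size of the set containing student 0; n==1 needs no special case,
		// and skipping the group lines for it would corrupt the next case
		int t=find(0);
		cout<<num[t]<<endl;
	}
	return 0;
}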
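
For testing outside the judge it can help to wrap the same idea in a function. The sketch below is one possible arrangement, not the author's code: the names countSuspects, parent and members and the vector-based input are assumptions, and find is written iteratively so that long chains cannot cause deep recursion.

```cpp
#include <cstddef>
#include <vector>

// Number of suspects among students 0..n-1, given the member list of each group.
int countSuspects(int n, const std::vector<std::vector<int> >& groups)
{
    std::vector<int> parent(n), members(n, 1);
    for (int i = 0; i < n; i++)
        parent[i] = i;

    // Iterative find with path compression.
    auto find = [&parent](int x) {
        int r = x;
        while (parent[r] != r)
            r = parent[r];
        while (parent[x] != r) {      // second pass: point the path at the root
            int next = parent[x];
            parent[x] = r;
            x = next;
        }
        return r;
    };

    for (const std::vector<int>& g : groups) {
        // Merge every member with the first member of its group.
        for (std::size_t i = 1; i < g.size(); i++) {
            int fx = find(g[0]), fy = find(g[i]);
            if (fx != fy) {
                parent[fx] = fy;
                members[fy] += members[fx];   // the new root carries the combined size
            }
        }
    }
    return members[find(0)];
}
```

On the three sample cases this returns 4, 1 and 1. The size is only meaningful at a root, which is why the lookup goes through find(0) rather than reading members[0] directly.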

The two problems above are simple tests of union-find; the next one is considerably harder:

POJ1988 Cube Stacking

Farmer John and Betsy are playing a game with N (1 <= N <= 30,000) identical cubes labeled 1 through N. They start with N stacks, each containing a single cube. Farmer John asks Betsy to perform P (1 <= P <= 100,000) operations. There are two types of operations: moves and counts.
* In a move operation, Farmer John asks Bessie to move the stack containing cube X on top of the stack containing cube Y.
* In a count operation, Farmer John asks Bessie to count the number of cubes on the stack with cube X that are under the cube X and report that value.

Write a program that can verify the results of the game.

Input

* Line 1: A single integer, P

* Lines 2..P+1: Each of these lines describes a legal operation. Line 2 describes the first operation, etc. Each line begins with a 'M' for a move operation or a 'C' for a count operation. For move operations, the line also contains two integers: X and Y. For count operations, the line also contains a single integer: X.

Note that the value for N does not appear in the input file. No move operation will request to move a stack onto itself.

Output

Print the output from each of the count operations in the same order as the input file.

Sample Input

6
M 1 6
C 1
M 2 4
M 2 6
C 3
C 4

Sample Output

1
0
2

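Problem summary: stacking cubes. The first line gives P, followed by P lines of operations. M 1 6 moves the stack containing cube 1 on top of the stack containing cube 6; C 1 asks how many cubes lie under cube 1.

Solution: if you understood the two problems above, the idea for this one should come naturally, so the code is given directly. The root of each set is the top cube of the stack. dist[x] is the distance from node x to the root, that is, the number of cubes above x, and rank[root] is the number of cubes in the whole stack. The number of cubes under x is therefore rank[root]-dist[x]-1. Read the code to see how find keeps dist up to date during path compression.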

#include <iostream>
#include <cstdio>
#include <cstring>
#include <string>
#include <algorithm>
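using std::cin; // not "using namespace std": the global array rank would be ambiguous with std::rank in C++11 and later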
const int maxn = 30000+10;
int fa[maxn];
int rank[maxn];
int dist[maxn];
void init(){
    for(int i = 0; i < maxn; i++){
        fa[i] = i;
        rank[i] = 1;
        dist[i] = 0;
    }
}
int find(int x){
    if(x != fa[x]){
        int t = fa[x];
        fa[x] = find(fa[x]);
        dist[x] += dist[t];
    }
    return fa[x];
}
 
 
int main(){
    int n;
    while(~scanf("%d",&n)){
        init();
        char op;
        while(n--){
            cin >> op;
            int a,b;
            if(op=='M'){
                scanf("%d%d",&a,&b);
                int faA = find(a);
                int faB = find(b);
                if(faA != faB){
                    fa[faB] = faA;
                    dist[faB] = rank[faA];
                    rank[faA] += rank[faB];
                }
            }else{
                scanf("%d",&a);
                int x = find(a);
                printf("%d\n",rank[x]-dist[a]-1);
            }
        }
    }
    return 0;
}
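
To make the roles of dist and rank concrete, here is the same weighted union-find as a small struct that can be exercised without the judge's input format. It is a sketch under assumptions: the struct name CubeStacks and the methods move and countBelow are invented for illustration, and rank is renamed cnt so that it cannot clash with std::rank. The top cube of each stack is the root, dist[x] is the number of cubes above x, and cnt[root] is the height of the stack.

```cpp
#include <vector>

struct CubeStacks {
    std::vector<int> fa, cnt, dist;

    explicit CubeStacks(int n) : fa(n + 1), cnt(n + 1, 1), dist(n + 1, 0)
    {
        for (int i = 0; i <= n; i++)
            fa[i] = i;
    }

    // Returns the top cube of x's stack and refreshes dist[x] on the way back.
    int find(int x)
    {
        if (x != fa[x]) {
            int t = fa[x];
            fa[x] = find(t);
            dist[x] += dist[t];   // t now hangs off the root, so dist[t] is final
        }
        return fa[x];
    }

    // Move the stack containing x on top of the stack containing y.
    void move(int x, int y)
    {
        int rx = find(x), ry = find(y);
        if (rx == ry) return;
        fa[ry] = rx;              // the top of x's stack becomes the overall top
        dist[ry] = cnt[rx];       // every cube of x's stack now sits above ry
        cnt[rx] += cnt[ry];
    }

    // Number of cubes below x in its stack.
    int countBelow(int x)
    {
        int r = find(x);          // also brings dist[x] up to date
        return cnt[r] - dist[x] - 1;
    }
};
```

Replaying the sample on CubeStacks s(6): after s.move(1, 6), s.countBelow(1) is 1; after s.move(2, 4) and s.move(2, 6) the stack reads 2, 4, 1, 6 from top to bottom, so s.countBelow(3) is 0 and s.countBelow(4) is 2.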

Reposted from blog.csdn.net/qq_42391248/article/details/81087417