C++: Building a Min-Heap and Properties of Huffman Trees: 05-Tree 9 Huffman Codes (30 points)

05-Tree 9 Huffman Codes (30 points)

In 1953, David A. Huffman published his paper “A Method for the Construction of Minimum-Redundancy Codes”, and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string “aaaxuaxz”, we can observe that the frequencies of the characters ‘a’, ‘x’, ‘u’ and ‘z’ are 4, 2, 1 and 1, respectively. We may either encode the symbols as {‘a’=0, ‘x’=10, ‘u’=110, ‘z’=111}, or in another way as {‘a’=1, ‘x’=01, ‘u’=001, ‘z’=000}, both compress the string into 14 bits. Another set of code can be given as {‘a’=0, ‘x’=11, ‘u’=100, ‘z’=101}, but {‘a’=0, ‘x’=01, ‘u’=011, ‘z’=001} is NOT correct since “aaaxuaxz” and “aazuaxax” can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.

Input Specification:
Each input file contains one test case. For each case, the first line gives an integer N (2≤N≤63), then followed by a line that contains all the N distinct characters and their frequencies in the following format:

c[1] f[1] c[2] f[2] ... c[N] f[N]

where c[i] is a character chosen from {‘0’ - ‘9’, ‘a’ - ‘z’, ‘A’ - ‘Z’, ‘_’}, and f[i] is the frequency of c[i] and is an integer no more than 1000. The next line gives a positive integer M (≤1000), then followed by M student submissions. Each student submission consists of N lines, each in the format:

c[i] code[i]

where c[i] is the i-th character and code[i] is a non-empty string of no more than 63 '0's and '1's.

Output Specification:
For each test case, print in each line either “Yes” if the student’s submission is correct, or “No” if not.

Note: The optimal solution is not necessarily generated by Huffman algorithm. Any prefix code with code length being optimal is considered correct.

Sample Input:

7
A 1 B 1 C 1 D 3 E 3 F 6 G 6
4
A 00000
B 00001
C 0001
D 001
E 01
F 10
G 11
A 01010
B 01011
C 0100
D 011
E 10
F 11
G 00
A 000
B 001
C 010
D 011
E 100
F 101
G 110
A 00000
B 00001
C 0001
D 001
E 00
F 10
G 11

Sample Output:

Yes
Yes
No
No

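Problem
The first line of input gives the number of characters, N.
The second line gives the characters and their weights (frequencies).
The third line gives the number of submissions to judge, M.
Each submission then follows as a group of N lines.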

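Key points
1. Build a min-heap and insert the entries from the second line one at a time; also keep the weights in a separate array.
2. Apply the Huffman construction: repeatedly remove the two smallest elements from the heap, merge them, and insert the result back. After n-1 merges, the single element left in the heap is the root of the Huffman tree.
3. Use this Huffman tree to compute the minimum total weight (the weighted path length, WPL).
These steps turn the given elements into a Huffman tree.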
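The merging procedure above can also be sketched with the standard library's `std::priority_queue` instead of a hand-written heap. This is only an illustrative alternative, not the solution used in this post; the name `optimalCost` is hypothetical. It relies on the fact that the weighted path length equals the sum of the weights produced by all merges, so the tree itself does not have to be built.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper (not part of the original solution): returns the
// minimum total encoded length for the given character frequencies.
int optimalCost(const std::vector<int>& freq) {
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<int, std::vector<int>, std::greater<int>>
        heap(freq.begin(), freq.end());
    int cost = 0;
    while (heap.size() > 1) {
        int a = heap.top(); heap.pop();   // smallest weight
        int b = heap.top(); heap.pop();   // second smallest weight
        cost += a + b;    // a merge adds one bit to every leaf below it
        heap.push(a + b); // the merged node goes back into the heap
    }
    return cost;
}
```

For the string "aaaxuaxz" from the problem statement (frequencies 4, 2, 1, 1), `optimalCost({4, 2, 1, 1})` gives 14, matching the 14 bits mentioned there; for the sample input it gives 53.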
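Next, decide whether each submitted code is an optimal prefix code (it does not have to be the code produced by the Huffman algorithm). Two conditions must hold:
1. The total weight equals the minimum.
The code length of a character times its weight is that character's cost;
the sum over all characters is the total weight, which is compared with the minimum computed above.
2. No code is a prefix of another.
Build a binary tree (a trie) and mark the node where each code ends;
if a later code passes through a marked node, an earlier code is a prefix of it;
if a later code ends at a node that is not a leaf, it is a prefix of an earlier code.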
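The two checks can be sketched compactly as follows. Note that this sketch swaps the trie for a different technique: it sorts the codes and compares neighbours, which works because after lexicographic sorting a code that is a prefix of any other code is also a prefix of its immediate successor. The names `isPrefixFree`, `totalCost` and `isCorrect` are hypothetical, and the codes are assumed to be listed in the same order as the frequencies.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if no code is a prefix of (or equal to) another.
bool isPrefixFree(std::vector<std::string> codes) {
    std::sort(codes.begin(), codes.end());
    for (std::size_t i = 0; i + 1 < codes.size(); ++i) {
        const std::string& a = codes[i];
        const std::string& b = codes[i + 1];
        // Does b start with a?
        if (b.compare(0, a.size(), a) == 0) return false;
    }
    return true;
}

// Hypothetical helper: sum of code length times frequency.
int totalCost(const std::vector<int>& freq,
              const std::vector<std::string>& codes) {
    int cost = 0;
    for (std::size_t i = 0; i < codes.size(); ++i)
        cost += freq[i] * static_cast<int>(codes[i].size());
    return cost;
}

// A submission is accepted if it has the optimal cost and is prefix-free.
bool isCorrect(const std::vector<int>& freq,
               const std::vector<std::string>& codes, int optimal) {
    return totalCost(freq, codes) == optimal && isPrefixFree(codes);
}
```

With the sample input, whose minimum total weight is 53, this accepts the first two submissions, rejects the third (its total weight is 63) and rejects the fourth (E's code 00 is a prefix of A's code 00000).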


#include <iostream>
#include <string>   // std::string is used by Judge and main
using namespace std;
#define MaxNum 64   // N <= 63, plus one slot for the sentinel at index 0

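struct TreeNode{     // Huffman tree node
	int Weight = 0;    // initialized to 0
	TreeNode *Left = nullptr;
	TreeNode *Right = nullptr;   // both children start as null pointers
};

struct HeapNode{     // min-heap stored in an array
	TreeNode Data[MaxNum];      // heap elements are tree nodes; Data[0] is the sentinel
	int Size = 0;        // number of elements currently in the heap
};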

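HeapNode * CreateHeap(){    // create an empty heap and return a pointer to it
	HeapNode *H = new HeapNode;
	H->Data[0].Weight = -1;    // sentinel: smaller than any frequency (assumed non-negative)
	return H;       // the heap only needs to be created once
}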

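TreeNode *DeleteMin(HeapNode *H){ // remove the top (minimum) element and return a pointer to a copy of it
	int Parent = 0, Child = 0;
	TreeNode temp;
	TreeNode *MinItem = new TreeNode;       // heap-allocated copy of the minimum
	*MinItem = H->Data[1];   // save the top of the heap
	temp = H->Data[(H->Size)--];   // take the last element and shrink the heap by one
	for(Parent=1; Parent*2 <= H->Size; Parent=Child)
	{
		Child = Parent*2;
		// Size is the boundary: Child == Size means there is no right child
		if(Child!=H->Size && (H->Data[Child].Weight>H->Data[Child+1].Weight))
			Child++;        // pick the smaller child
		if(temp.Weight<=H->Data[Child].Weight)  break;
		else H->Data[Parent]=H->Data[Child];        // move the child up so temp sinks
	}
	// the loop has ended: Parent is the correct position for temp
	H->Data[Parent] = temp;
	return MinItem;       // pointer to the removed minimum
}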

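// Insert
void Insert(HeapNode *H, TreeNode *item){     // insert a copy of *item into the heap
	int i = ++(H->Size);    // i starts at the new last position
	// sift up: while item is lighter than the parent, move the parent down
	// (the sentinel in Data[0] stops the loop at the root)
	for (; item->Weight < H->Data[i/2].Weight; i/=2)
	{
		H->Data[i]=H->Data[i/2];
	}
	H->Data[i] = *item;       // store a copy of the node that item points to
}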

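// The array A stores the weights in input order.
// Each submission lists the characters in the same order,
// so A[j] is the weight for the j-th line of a submission,
// which makes the total weight easy to compute.
HeapNode * ReadData(int N, HeapNode *H, int A[])
{                     // read N (character, weight) pairs: save each weight in A and insert it into the heap
	char s;
	int value;
	for(int i=0;i<N;++i)
	{
		cin>>s;
		cin>>value;
		A[i]=value;
		TreeNode T;            // temporary leaf; Insert stores a copy, so no allocation is needed
		T.Weight = value;
		Insert(H,&T);
	}
	return H;        // return the heap
}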

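TreeNode *Huffman(HeapNode *H){    // turn the heap into a Huffman tree
	TreeNode *T = nullptr;
	int num = H->Size;      // number of elements in the heap
	// perform num-1 merges
	for(int i=0;i<num-1;i++)
	{
		T = new TreeNode;
		T->Left = DeleteMin(H);
		T->Right = DeleteMin(H);
		T->Weight = T->Left->Weight + T->Right->Weight;
		Insert(H,T);    // Insert stores a copy of *T, children included
		delete T;       // so the temporary node can be released
	}
	T = DeleteMin(H);    // the one remaining element is the root
	return T;      // pointer to the root of the Huffman tree
}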

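// Weighted path length of tree H. n is the depth and starts at 0:
// at a leaf, return Weight * depth; at an internal node, return the sum over both subtrees.
int WPL(TreeNode *H, int n)
{
	if(!H->Right) return H->Weight * n;     // a Huffman node has either two children or none
	else return WPL(H->Left,n+1) + WPL(H->Right,n+1);
}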

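struct JNode{
	// Trie node used for the prefix check. flag == 1 means an earlier code
	// already ends here, so any later code that passes through this node
	// has that earlier code as a prefix.
	int flag = 0;           // 0: no code ends here; 1: a code ends here
	JNode *Left = nullptr;
	JNode *Right = nullptr;
};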

// Insert code s into the trie rooted at J and check the prefix-code property.
// Returns false if an earlier code is a prefix of s,
// or if s is a prefix of (or equal to) an earlier code.
bool Judge(const string &s, JNode *J)
{
	for (size_t i = 0; i < s.length(); ++i)      // walk the code one bit at a time
	{
		if(s[i]=='0'){  // '0': go left
			if(J->Left==nullptr){       // no path yet
				J->Left = new JNode;     // create the node and hang it on the left
			}
			else if(J->Left->flag==1){  // an earlier code ends on this path
				return false;
			}
			J = J->Left;        // otherwise keep walking
		}
		else {      // '1': go right
			if(J->Right==nullptr){
				J->Right = new JNode;
			}
			else if(J->Right->flag==1){
				return false;
			}
			J = J->Right;
		}
	}
	J->flag = 1;     // mark the node where this code ends
	// the code must end at a leaf; otherwise it is a prefix of an earlier code
	return J->Left==nullptr && J->Right==nullptr;
}

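int main(){
	int N=0, n=0;
	cin>>N;
	HeapNode *H = CreateHeap();    // H points to a new heap
	int Value[MaxNum] = {};
	ReadData(N,H,Value);      // read N entries, insert them into H, and save the weights in Value
	TreeNode *T = Huffman(H);      // merge the heap into a Huffman tree; T is its root
	int CodeLen = WPL(T,0);      // minimum total weight, computed from the root
	cin>>n;     // number of submissions to judge
	string temp;
	char c;
	for(int i=0;i<n;++i) // n submissions
	{
		int count=0;          // total weight of this submission
		bool result=true;     // stays true while every code passes the prefix check
		JNode *J = new JNode;          // root of a fresh trie for this submission
		// Note: I got this loop wrong at first.
		for(int j=0;j<N;++j){      // the weight of the j-th character is Value[j]
			cin>>c>>temp;          // read the character and its code
			// code length times weight is this character's contribution
			count += static_cast<int>(temp.length())*Value[j];
			if(result){
				// after the first failure, skip the check and just read the remaining lines
				result = Judge(temp, J);
			}
		}
		delete J;       // frees only the root node; the rest of the trie is not released
		if(result && count == CodeLen){     // prefix check passed and the total weight is minimal
			cout<<"Yes"<<endl;
		}
		else{
			cout<<"No"<<endl;
		}
	}
	return 0;
}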
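
In main, `delete J` releases only the root of each submission's trie, so the nodes created inside Judge are never freed. That is harmless for a short-lived judge program, but a recursive clean-up could look like the sketch below. `FreeTrie` is a hypothetical helper that is not part of the original solution; the struct is repeated here only so that the sketch compiles on its own, and the return value exists only so that the effect can be checked.

```cpp
// Mirrors the JNode struct from the solution above.
struct JNode {
    int flag = 0;
    JNode* Left = nullptr;
    JNode* Right = nullptr;
};

// Hypothetical helper: frees a node and everything below it (post-order)
// and returns the number of nodes that were freed.
int FreeTrie(JNode* node) {
    if (node == nullptr) return 0;
    int freed = FreeTrie(node->Left) + FreeTrie(node->Right);
    delete node;   // children are already gone, so the node can go too
    return freed + 1;
}
```

Under these assumptions, calling `FreeTrie(J)` in place of `delete J` would release the whole trie after each submission.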

Reposted from blog.csdn.net/BLUEsang/article/details/105366743