[Huffman tree] fast_wpl: fast calculation of the weighted path length

wpl: weighted path length, i.e. the sum over all leaf nodes of the leaf's weight multiplied by its path length (depth) from the root.

        To find the weighted path length of the Huffman tree built from a set of weights, the most direct way is to construct the tree first and then sum the weighted path lengths of all leaf nodes, namely:

$$wpl(T) = \sum_{k} L_{k} * w_{k}$$

        This solution requires us to construct the Huffman tree, record the depth of each leaf node, and then traverse all leaf nodes.
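The direct approach described above can be sketched roughly as follows. This is only an illustration, not code from the article: the function name `wpl_by_tree`, the 0-based input array, and the parent-array representation of the tree are all assumptions made for the example, and the tree is built with a simple quadratic scan rather than anything optimized.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the article): direct wpl computation.
 * Nodes 0..n-1 are the leaves; nodes n..2n-2 are the internal nodes
 * created by merging. Returns -1 if memory allocation fails. */
long wpl_by_tree(const int w[], int n)
{
    assert(n >= 0);
    if (n <= 1) return 0;                 /* no edges, so every path length is 0 */

    int total = 2 * n - 1;
    long *weight = malloc((size_t)total * sizeof *weight);
    int *parent = malloc((size_t)total * sizeof *parent);
    if (!weight || !parent) { free(weight); free(parent); return -1; }

    for (int i = 0; i < n; i++) weight[i] = w[i];
    for (int i = 0; i < total; i++) parent[i] = -1;   /* -1 marks "no parent yet" */

    /* Build: repeatedly merge the two lightest parentless nodes. */
    for (int next = n; next < total; next++) {
        int a = -1, b = -1;               /* a = lightest, b = second lightest */
        for (int i = 0; i < next; i++) {
            if (parent[i] != -1) continue;
            if (a == -1 || weight[i] < weight[a]) { b = a; a = i; }
            else if (b == -1 || weight[i] < weight[b]) { b = i; }
        }
        weight[next] = weight[a] + weight[b];
        parent[a] = parent[b] = next;
    }

    /* Measure: walk from each leaf up to the root to get its depth. */
    long wpl = 0;
    for (int leaf = 0; leaf < n; leaf++) {
        int depth = 0;
        for (int v = leaf; parent[v] != -1; v = parent[v]) depth++;
        wpl += depth * weight[leaf];
    }

    free(weight);
    free(parent);
    return wpl;
}
```

Calling `wpl_by_tree` on the weights 2, 4, 5, 7 should give 35. Note how much bookkeeping (parent links, depth walks) is needed just to obtain a single number.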

        But in fact, if we only need the wpl, we don't need to construct the Huffman tree at all, let alone compute depths and traverse the leaf nodes. It is enough to repeatedly take the two smallest weights among the original nodes and the newly generated nodes, and accumulate their sum.

Here is the code first (the weight array is assumed to be sorted):

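// The weight array w is assumed to be sorted in ascending order.
// w is 1-based: the weights live in w[1..n], w[0] is unused.
// p is the index of the current smallest weight; pass 1 on the first call.
// n is the number of weights.
// w is modified in place.
int fastwpl(int w[], int p, int n)
{
    if (n <= 1) return 0;        // zero or one node: wpl is 0
    if (p >= n) return 0;        // p is at the last slot (the root): nothing left to add
    int new_w = w[p] + w[p + 1]; // take the two smallest weights and merge them
    // Insert the new weight node.
    // Find the insert position.
    int i = p + 2;               // the two smallest are consumed, so search from p+2
    while (i <= n && new_w > w[i]) i++;   // check the bound before reading w[i]
    // Insert the new weight just before position i.
    for (int j = p + 1; j < i - 1; j++) w[j] = w[j + 1];
    w[i - 1] = new_w;
    // Each round removes the two smallest nodes and creates one new node,
    // so the pointer moves forward by one.
    return new_w + fastwpl(w, p + 1, n);
}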

Algorithm description (using the Huffman tree in the following figure as an example):

[Figure: Huffman tree with leaf weights 2, 4, 5, 7]

The most straightforward solution is:

$$wpl = 2*3 + 4*3 + 5*2 + 7*1 = 35$$

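The solution in this article is:

$$wpl = 2 + 4 + 5 + 6 + 7 + 11 = 35$$

That is, add up the weights of all non-root nodes. The code reaches the same total by summing the merged weights, 6 + 11 + 18 = 35, because each merged weight is the sum of its two children and every non-root node is the child of exactly one merged node.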

        At first glance this looks wrong, since it does not seem to consider the path length of any node. In fact the path length, i.e. the depth of each leaf, is implicit in the repeated summation of the intermediate nodes.

For example, nodes 2 and 4 have path length 3, so their weighted path length is 2*3+4*3, which is the same as adding 2 and 4 three times each.

        Put another way, when we add the new node 6, formed from 2 + 4, we are in effect adding 2 and 4 once more.

That is, 2+4+6 = 2+4+(2+4) = 2*2+4*2.

        Similarly, in the next layer, adding the new node 11 obtained from 6 + 5 adds 6 = 2 + 4 yet again.

        Therefore, every new node added on the way up counts leaf nodes 2 and 4 one more time, which is exactly how their path length accumulates.
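To make the bookkeeping explicit, the sum over non-root nodes for the example weights 2, 4, 5, 7 can be expanded leaf by leaf. Each internal node is replaced by the leaves beneath it, so every leaf ends up counted once for each level of its depth:

```latex
\begin{aligned}
2 + 4 + 5 + 6 + 7 + 11
  &= 2 + 4 + 5 + (2 + 4) + 7 + (2 + 4 + 5) \\
  &= 3 * 2 + 3 * 4 + 2 * 5 + 1 * 7 \\
  &= 35
\end{aligned}
```

The same argument works for any tree: a leaf with path length L_k has exactly L_k non-root nodes on its path to the root (itself and its non-root ancestors), and its weight w_k is contained in each of them.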

In general, we can conclude that:

       wpl = the sum of the weights of all non-root nodes of the generated Huffman tree

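        Therefore, to calculate the minimum weighted path length of a set of weights, we only need to repeatedly merge the two smallest weights and accumulate the newly generated weights. When the weight array is already sorted, each round only has to place one new weight into an ordered array. With the shifting insert used in the code above, one round costs up to O(n), so the worst case is O(n^2) overall; since the merged weights are produced in non-decreasing order, keeping them in a separate queue avoids the shifting and gives O(n) in total.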
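One way to get a linear-time pass over sorted input is the well-known two-queue technique, which is not what the article's code does and is shown here only as a sketch. It relies on the fact that merged weights are produced in non-decreasing order, so they can simply be appended to a second array. The function name `wpl_two_queues` and the 0-based, read-only input array are assumptions of this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the article): two-queue wpl computation.
 * w[0..n-1] must be sorted ascending and is not modified.
 * Returns the wpl, or -1 if memory allocation fails. */
long wpl_two_queues(const int w[], int n)
{
    assert(n >= 0);
    if (n <= 1) return 0;

    long *merged = malloc((size_t)(n - 1) * sizeof *merged);
    if (!merged) return -1;

    int leaf = 0;              /* next unused original weight */
    int head = 0, tail = 0;    /* merged[head..tail-1] are unused merged weights */
    long wpl = 0;

    for (int step = 0; step < n - 1; step++) {
        long pick[2];
        for (int k = 0; k < 2; k++) {
            /* Take the smaller front element of the two queues. */
            if (head == tail || (leaf < n && w[leaf] <= merged[head]))
                pick[k] = w[leaf++];
            else
                pick[k] = merged[head++];
        }
        merged[tail] = pick[0] + pick[1];
        wpl += merged[tail];   /* each merged weight is one internal node */
        tail++;
    }

    free(merged);
    return wpl;
}
```

Each of the n - 1 rounds does a constant amount of work, so the whole pass is linear. The trade-off is an extra array of n - 1 merged weights instead of updating the input in place.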

        As for sorting, we can use quick sort or merge sort, which typically takes O(n log n).
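For unsorted input, the whole pipeline might look like the sketch below. It uses the standard library's `qsort` as a stand-in for the quick sort or merge sort mentioned above (the C standard does not specify which algorithm `qsort` uses), then runs an iterative, 0-based version of the merge-and-insert loop on a private copy. The names `cmp_int` and `wpl_unsorted` are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparator for qsort: ascending ints, written to avoid overflow from x - y. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper (not from the article): sort a copy of w[0..n-1],
 * then repeatedly merge the two smallest weights on the copy.
 * Returns the wpl, or -1 if memory allocation fails. */
long wpl_unsorted(const int w[], int n)
{
    assert(n >= 0);
    if (n <= 1) return 0;

    int *a = malloc((size_t)n * sizeof *a);
    if (!a) return -1;
    memcpy(a, w, (size_t)n * sizeof *a);
    qsort(a, (size_t)n, sizeof *a, cmp_int);

    long wpl = 0;
    for (int p = 0; p + 1 < n; p++) {
        int new_w = a[p] + a[p + 1];   /* merge the two smallest weights */
        wpl += new_w;
        /* Shift smaller weights left and drop new_w into its sorted slot. */
        int i = p + 2;
        while (i < n && a[i] < new_w) { a[i - 1] = a[i]; i++; }
        a[i - 1] = new_w;
    }

    free(a);
    return wpl;
}
```

Working on a copy keeps the caller's array untouched, at the cost of one extra allocation.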

 

 

 


Origin blog.csdn.net/m0_67441224/article/details/127479614