Notes on "STL源码剖析" (The Annotated STL Sources): priority_queue and the heap algorithms


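priority_queue is a queue whose elements carry a priority (weight): the element with the highest priority is always at the front, and with the default comparison elements come out from largest to smallest. Like queue, it only lets you add elements through push and reach a single element (the top), and it provides no iterators; unlike queue, the removal order is decided by priority rather than by insertion order. priority_queue is also a container adapter, and its default underlying container is vector. vector is the default because priority_queue is built on the heap algorithms (covered in the second half of this post), which require random access iterators and make heavy use of iterator operator+. list has no random access iterators at all, and deque does provide them but its iterator arithmetic is more expensive than vector's.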

Complete definition of priority_queue

// Default ordering is less (largest element on top); greater can be specified instead (smallest element on top)
template <class T, class Sequence = vector<T>, class Compare = less<typename Sequence::value_type> >
class  priority_queue {
public:
    typedef typename Sequence::value_type value_type;
    typedef typename Sequence::size_type size_type;
    typedef typename Sequence::reference reference;
    typedef typename Sequence::const_reference const_reference;
protected:
    Sequence c;             // underlying container
    Compare comp;           // element comparison criterion
public:
    priority_queue() : c() {}
    explicit priority_queue(const Compare& x) : c(), comp(x) {}
 
    // make_heap is a heap algorithm (described below); it rearranges [first, last) into a heap (it does not fully sort the range)
    template <class InputIterator>
    priority_queue(InputIterator first, InputIterator last, const Compare& x)
        : c(first, last), comp(x) {
        make_heap(c.begin(), c.end(), comp);
    }
    template <class InputIterator>
    priority_queue(InputIterator first, InputIterator last)
        : c(first, last) {
        make_heap(c.begin(), c.end(), comp);
    }

    bool empty() const { return c.empty(); }
    size_type size() const { return c.size(); }
    const_reference top() const { return c.front(); }

    // push_heap and pop_heap are heap algorithms (described below); they maintain the heap property over [first, last) when an element is added or removed
    void push(const value_type& x) {
        __STL_TRY{
            c.push_back(x);
            push_heap(c.begin(), c.end(), comp);
        }
        __STL_UNWIND(c.clear());
    }
    void pop() {
        __STL_TRY{
            pop_heap(c.begin(), c.end(), comp);
            c.pop_back();
        }
        __STL_UNWIND(c.clear());
    }
};

Using priority_queue

#include <functional>   // std::greater
#include <iostream>
#include <queue>
#include <vector>

int main(int argc, char **argv)
{
    {
        int a[5] = { 1,5,3,2,4 };
        std::priority_queue<int> pqueue(a, a + 5);
        int count = pqueue.size();
        for (int i = 0; i < count; ++i)
        {
            std::cout << pqueue.top();
            pqueue.pop();
        }
    }

    std::cout << std::endl;

    {
        int a[5] = { 1,5,3,2,4 };
        std::priority_queue<int, std::vector<int>, std::greater<int>> pqueue(a, a + 5);
        int count = pqueue.size();
        for (int i = 0; i < count; ++i)
        {
            std::cout << pqueue.top();
            pqueue.pop();
        }
    }
    
    return 0;
}

// Output
54321
12345
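
The example above uses plain ints. As a complementary sketch, the block below shows a priority_queue over a user-defined type with its own comparator. The names Task, ByWeight and drain_by_weight are hypothetical and exist only for this illustration; the comparator has less-than semantics, so the heaviest element ends up on top, just like the default std::less.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical element type for illustration: a task with a numeric weight.
struct Task {
    int weight;
    std::string name;
};

// Comparator with less-than semantics on weight,
// so the task with the largest weight is returned by top().
struct ByWeight {
    bool operator()(const Task& a, const Task& b) const {
        return a.weight < b.weight;
    }
};

// Pushes tasks in arbitrary order and returns their names in pop order.
std::vector<std::string> drain_by_weight() {
    std::priority_queue<Task, std::vector<Task>, ByWeight> pq;
    pq.push(Task{2, "write"});
    pq.push(Task{9, "deploy"});
    pq.push(Task{5, "review"});

    std::vector<std::string> order;
    while (!pq.empty()) {
        order.push_back(pq.top().name);  // highest weight first
        pq.pop();
    }
    assert(pq.empty());
    return order;
}
```

Calling drain_by_weight() yields the names ordered by decreasing weight, regardless of the order in which the tasks were pushed.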

heap

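heap is not an STL container; it mainly serves as the machinery behind priority_queue. priority_queue lets clients push elements in any order, but elements are always taken out starting from the one with the highest (or lowest) priority, and a binary max/min heap has exactly this property.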

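A binary heap is a complete binary tree: every level except the bottom one is completely filled, and the nodes on the bottom level are packed from left to right with no gaps.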

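Because the tree has no holes, all of its nodes can be stored in an array. If element #0 of the array is left unused, then for a node at index i its left child is at 2i, its right child is at 2i+1, and its parent is at i/2 (integer division). With this rule a complete binary tree can easily be represented by an array, a technique known as implicit representation. Note that the SGI STL code listed in the algorithm sections stores the root at index 0 instead, so it uses 2i+1 and 2i+2 for the children and (i-1)/2 for the parent.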
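
The index rules can be written down as a few small helpers. This is only a sketch to make the arithmetic concrete; the function names (left1, parent0 and so on) are made up for this note and do not appear in the STL. The "1" variants follow the layout with slot #0 unused, the "0" variants follow the 0-based layout used by the SGI code.

```cpp
#include <cassert>
#include <cstddef>

// 1-based layout: slot 0 is unused, the root lives at index 1.
std::size_t left1(std::size_t i)   { return 2 * i; }
std::size_t right1(std::size_t i)  { return 2 * i + 1; }
std::size_t parent1(std::size_t i) {
    assert(i >= 2);  // the root (index 1) has no parent
    return i / 2;
}

// 0-based layout: the root lives at index 0.
std::size_t left0(std::size_t i)   { return 2 * i + 1; }
std::size_t right0(std::size_t i)  { return 2 * i + 2; }
std::size_t parent0(std::size_t i) {
    assert(i >= 1);  // the root (index 0) has no parent
    return (i - 1) / 2;
}
```

The two layouts describe the same tree shifted by one slot: left0(i) equals left1(i + 1) - 1, and likewise for the right child and the parent.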

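Depending on how the elements are ordered, a heap is either a max-heap or a min-heap. In a max-heap every parent is greater than or equal to its children, so the largest element sits at the root; in a min-heap every parent is less than or equal to its children, so the smallest element sits at the root. The STL builds a max-heap by default, but this can be changed by passing a comparison object, in the same way as the Compare parameter of priority_queue above.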
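
As a small illustration of the two orderings, the sketch below builds a heap from the same data with the default comparison and with std::greater, and uses std::is_heap (available since C++11) to check the max-heap property. The helper names are hypothetical and only used here.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Builds a max-heap (default operator<) and returns the root, i.e. the largest element.
int max_heap_root(std::vector<int> v) {
    assert(!v.empty());
    std::make_heap(v.begin(), v.end());
    return v.front();
}

// Builds a min-heap by passing std::greater and returns the root, i.e. the smallest element.
int min_heap_root(std::vector<int> v) {
    assert(!v.empty());
    std::make_heap(v.begin(), v.end(), std::greater<int>());
    return v.front();
}

// True if every parent is greater than or equal to its children.
bool satisfies_max_heap(const std::vector<int>& v) {
    return std::is_heap(v.begin(), v.end());
}
```

Equal elements are allowed: a range such as {5, 5, 5} satisfies the max-heap property because a parent only has to be not smaller than its children.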

The push_heap algorithm

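push_heap percolates the element that was just appended at the tail up to its proper position: the new node is compared with its parent and, if it is larger, the two change places; this repeats until the parent is no longer smaller or the root is reached.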

template <class RandomAccessIterator, class Distance, class T>
void __push_heap(RandomAccessIterator first, Distance holeIndex,
    Distance topIndex, T value) {
    Distance parent = (holeIndex - 1) / 2;
    while (holeIndex > topIndex && *(first + parent) < value) {
        *(first + holeIndex) = *(first + parent);
        holeIndex = parent;
        parent = (holeIndex - 1) / 2;
    }
    *(first + holeIndex) = value;
}

template <class RandomAccessIterator, class Distance, class T>
inline void __push_heap_aux(RandomAccessIterator first,
    RandomAccessIterator last, Distance*, T*) {
    // The new element is at the tail, at index (last - first) - 1
    __push_heap(first, Distance((last - first) - 1), Distance(0), T(*(last - 1)));
}

// Max-heap version (uses operator<); the new element must already have been push_back'ed to the tail before this is called
template <class RandomAccessIterator>
inline void push_heap(RandomAccessIterator first, RandomAccessIterator last) {
    __push_heap_aux(first, last, distance_type(first), value_type(first));
}

template <class RandomAccessIterator, class Distance, class T, class Compare>
void __push_heap(RandomAccessIterator first, Distance holeIndex,
    Distance topIndex, T value, Compare comp) {
    Distance parent = (holeIndex - 1) / 2;
    while (holeIndex > topIndex && comp(*(first + parent), value)) {
        *(first + holeIndex) = *(first + parent);
        holeIndex = parent;
        parent = (holeIndex - 1) / 2;
    }
    *(first + holeIndex) = value;
}

template <class RandomAccessIterator, class Compare, class Distance, class T>
inline void __push_heap_aux(RandomAccessIterator first,
    RandomAccessIterator last, Compare comp,
    Distance*, T*) {
    __push_heap(first, Distance((last - first) - 1), Distance(0),
        T(*(last - 1)), comp);
}

// Version taking a custom comp
template <class RandomAccessIterator, class Compare>
inline void push_heap(RandomAccessIterator first, RandomAccessIterator last,
    Compare comp) {
    __push_heap_aux(first, last, comp, distance_type(first), value_type(first));
}
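
To make the percolate-up step easier to follow without the iterator and traits plumbing, here is a simplified sketch on std::vector<int>. It assumes a max-heap with operator< and mirrors the hole-based loop of __push_heap; sift_up_last and push_onto_heap are hypothetical names, not STL functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumes heap[0, size-1) is already a max-heap and the new element sits at the back.
// The new value is held aside while smaller parents are moved down into the hole.
void sift_up_last(std::vector<int>& heap) {
    assert(!heap.empty());
    std::size_t hole = heap.size() - 1;
    const int value = heap[hole];
    while (hole > 0) {
        std::size_t parent = (hole - 1) / 2;
        if (!(heap[parent] < value)) break;  // parent is not smaller: position found
        heap[hole] = heap[parent];           // move the parent down into the hole
        hole = parent;
    }
    heap[hole] = value;
}

// push_back followed by the sift-up, returning the resulting heap.
std::vector<int> push_onto_heap(std::vector<int> heap, int x) {
    heap.push_back(x);
    sift_up_last(heap);
    return heap;
}
```

For instance, pushing 8 onto the heap {6, 4, 5, 3, 1, 2} moves 5 and then 6 down one level each and places 8 at the root, giving {8, 4, 6, 3, 1, 2, 5}.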

The pop_heap algorithm

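pop_heap removes the largest (or smallest) element, that is, the root. The algorithm itself only moves that element to the tail of the container; a subsequent pop_back is needed to actually remove it. To keep the tree complete, the slot that disappears is the rightmost node on the bottom level, so the value stored there has to find a new position. The root becomes a hole, and the hole is percolated down by comparing its two children and moving the larger one up, until the hole has no children left; the saved value is then percolated up from that position. (The description in the original book is inaccurate here: it says the hole is exchanged with the larger child until both children are smaller or there are no more children, which is not what the code does.) The two approaches do not necessarily perform the same number of comparisons: descend-then-ascend needs only one comparison per level on the way down (between the two children), while the stop-early variant needs two, so when the value ends up near the bottom again, which is the common case since it was taken from the bottom level, descend-then-ascend uses fewer comparisons, although it may perform somewhat more assignments. As for why descend-then-ascend was chosen, my guess is that it allows the __push_heap code to be reused.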

template <class RandomAccessIterator, class Distance, class T>
void __adjust_heap(RandomAccessIterator first, Distance holeIndex,
    Distance len, T value) {
    Distance topIndex = holeIndex;
    Distance secondChild = 2 * holeIndex + 2;   // right child
    while (secondChild < len) {
        // Pick the larger of the left and right children
        if (*(first + secondChild) < *(first + (secondChild - 1)))
            secondChild--;
        // Move the larger child up into the hole, then continue from the child's position
        *(first + holeIndex) = *(first + secondChild);
        holeIndex = secondChild;
        secondChild = 2 * (secondChild + 1);
    } // until the hole has no right child

    // If secondChild == len, the hole has a left child but no right child:
    // move the left child up into the hole
    if (secondChild == len) {
        *(first + holeIndex) = *(first + (secondChild - 1));
        holeIndex = secondChild - 1;
    }

    // The hole is now at the bottom; percolate value up from there
    __push_heap(first, holeIndex, topIndex, value);
}

template <class RandomAccessIterator, class T, class Distance>
inline void __pop_heap(RandomAccessIterator first, RandomAccessIterator last,
    RandomAccessIterator result, T value, Distance*) {
    *result = *first;   // move the root to the tail (it is removed later by pop_back)
    __adjust_heap(first, Distance(0), Distance(last - first), value);
}

template <class RandomAccessIterator, class T>
inline void __pop_heap_aux(RandomAccessIterator first,
    RandomAccessIterator last, T*) {
    __pop_heap(first, last - 1, last - 1, T(*(last - 1)), distance_type(first));
}

// Max-heap version; a comp version is also provided but is not listed here
template <class RandomAccessIterator>
inline void pop_heap(RandomAccessIterator first, RandomAccessIterator last) {
    __pop_heap_aux(first, last, value_type(first));
}
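
The difference between the two strategies discussed above can be tried out with the following sketch on std::vector<int>. It is a simplified model under stated assumptions, not the SGI code: the last element is removed with pop_back first instead of parking the root at the tail, and each function returns the number of element comparisons it performed so the two can be compared on data of your choice. All names (sift_down_classic, sift_down_then_up, pop_max) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stop-early variant: compare the children, then compare the larger child with value.
std::size_t sift_down_classic(std::vector<int>& h, std::size_t hole, int value) {
    std::size_t cmps = 0;
    const std::size_t n = h.size();
    for (;;) {
        std::size_t child = 2 * hole + 1;           // left child
        if (child >= n) break;                      // no children
        if (child + 1 < n) {
            ++cmps;
            if (h[child] < h[child + 1]) ++child;   // right child is larger
        }
        ++cmps;
        if (!(value < h[child])) break;             // value fits here
        h[hole] = h[child];
        hole = child;
    }
    h[hole] = value;
    return cmps;
}

// Descend-then-ascend variant, mirroring __adjust_heap followed by __push_heap.
std::size_t sift_down_then_up(std::vector<int>& h, std::size_t hole, int value) {
    std::size_t cmps = 0;
    const std::size_t n = h.size();
    const std::size_t top = hole;
    std::size_t child = 2 * hole + 2;               // right child
    while (child < n) {
        ++cmps;
        if (h[child] < h[child - 1]) --child;       // left child is larger
        h[hole] = h[child];
        hole = child;
        child = 2 * hole + 2;
    }
    if (child == n) {                               // only a left child exists
        h[hole] = h[child - 1];
        hole = child - 1;
    }
    while (hole > top) {                            // percolate value back up
        std::size_t parent = (hole - 1) / 2;
        ++cmps;
        if (!(h[parent] < value)) break;
        h[hole] = h[parent];
        hole = parent;
    }
    h[hole] = value;
    return cmps;
}

// Removes the maximum from a valid max-heap with the given strategy;
// returns the number of comparisons used.
template <class SiftFn>
std::size_t pop_max(std::vector<int>& h, SiftFn sift) {
    assert(!h.empty());
    const int value = h.back();
    h.pop_back();
    if (h.empty()) return 0;
    return sift(h, 0, value);
}
```

Both strategies produce a valid heap; on a heap as small as {8, 4, 6, 3, 1, 2, 5} they even produce the same layout, and any difference in comparison counts only becomes visible on deeper heaps.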

The sort_heap algorithm

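sort_heap sorts the whole heap. The range must be a valid heap on entry; otherwise the behavior is undefined and the result may not be sorted. After the call the range is in ascending order (for the default operator< version) and in general no longer satisfies the max-heap property.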

// Ascending version; a comp version is also provided but is not listed here
template <class RandomAccessIterator>
void sort_heap(RandomAccessIterator first, RandomAccessIterator last) {
    // Each pop_heap moves the root (the largest element) to the end of the shrinking range, so the result is ascending
    while (last - first > 1) pop_heap(first, last--);
}
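
The comp versions mentioned in the comment work the same way, as long as the same comparison is used both to build and to sort the heap. A minimal sketch with the standard library versions follows; the wrapper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Max-heap + sort_heap with the default operator<: ascending result.
std::vector<int> heap_sort_ascending(std::vector<int> v) {
    std::make_heap(v.begin(), v.end());
    std::sort_heap(v.begin(), v.end());
    assert(std::is_sorted(v.begin(), v.end()));
    return v;
}

// Min-heap + sort_heap, both with std::greater: descending result.
std::vector<int> heap_sort_descending(std::vector<int> v) {
    std::make_heap(v.begin(), v.end(), std::greater<int>());
    std::sort_heap(v.begin(), v.end(), std::greater<int>());
    assert(std::is_sorted(v.begin(), v.end(), std::greater<int>()));
    return v;
}
```

Mixing comparisons, for example building with the default and sorting with std::greater, violates the precondition that the range is a heap with respect to the comparison passed to sort_heap.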

The make_heap algorithm

make_heap rearranges an arbitrary range of data into a valid heap.

template <class RandomAccessIterator, class T, class Distance>
void __make_heap(RandomAccessIterator first, RandomAccessIterator last, T*, Distance*) {
    if (last - first < 2) return;    // fewer than 2 elements: nothing to do

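    // Start at the last node that has a child (index (len - 2) / 2) and heapify each subtree in turn,
    // working backwards to the root (each step descends first, then ascends)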
    Distance len = last - first;
    Distance parent = (len - 2) / 2;
    while (true) {
        __adjust_heap(first, parent, len, T(*(first + parent)));
        if (parent == 0) return;
        parent--;
    }
}

// Max-heap version; a comp version is also provided but is not listed here
template <class RandomAccessIterator>
inline void make_heap(RandomAccessIterator first, RandomAccessIterator last) {
    __make_heap(first, last, value_type(first), distance_type(first));
}
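
The bottom-up construction can also be sketched on std::vector<int>. Note the assumption: for brevity this sketch uses a plain stop-early sift-down rather than the descend-then-ascend __adjust_heap, so it illustrates the loop over parents, not the exact SGI code path. sift_down and build_max_heap are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Plain sift-down of the element at index hole (stop-early variant).
void sift_down(std::vector<int>& h, std::size_t hole) {
    const std::size_t n = h.size();
    const int value = h[hole];
    for (;;) {
        std::size_t child = 2 * hole + 1;                       // left child
        if (child >= n) break;
        if (child + 1 < n && h[child] < h[child + 1]) ++child;  // pick the larger child
        if (!(value < h[child])) break;
        h[hole] = h[child];
        hole = child;
    }
    h[hole] = value;
}

// Heapifies every subtree, from the last parent back to the root.
std::vector<int> build_max_heap(std::vector<int> v) {
    if (v.size() < 2) return v;
    for (std::size_t parent = (v.size() - 2) / 2; ; --parent) {
        sift_down(v, parent);
        if (parent == 0) break;   // checked before decrementing to avoid unsigned wrap-around
    }
    assert(std::is_heap(v.begin(), v.end()));
    return v;
}
```

Leaves are skipped because a single node is already a heap; starting at index (len - 2) / 2 visits exactly the nodes that have at least one child.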

Using the heap algorithms
The following example shows the heap algorithms in use:

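#include <algorithm>    // std::make_heap, std::push_heap, std::pop_heap, std::sort_heap
#include <iostream>
#include <vector>

int main()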
{
    std::vector<int> vec = { 1,6,2,3,4,5 };

    std::make_heap(vec.begin(), vec.end());
    for (auto i : vec) {
        std::cout << i;
    }
    std::cout << std::endl;

    vec.push_back(8);
    std::push_heap(vec.begin(), vec.end());
    for (auto i : vec) {
        std::cout << i;
    }
    std::cout << std::endl;

    std::pop_heap(vec.begin(), vec.end());
    vec.pop_back();
    for (auto i : vec) {
        std::cout << i;
    }
    std::cout << std::endl;

    // sort_heap requires a valid heap as input; otherwise the behavior is undefined
    std::sort_heap(vec.begin(), vec.end());
    for (auto i : vec) {
        std::cout << i;
    }
    std::cout << std::endl;

    return 0;
}

// Output
645312
8463125
645312
123456

