Algorithm Series, Part Four: Heap Sort

Heap sort is a commonly used sorting algorithm. It is built on a heap-based priority queue, so to learn heap sort you first need to understand priority queues and the properties of heaps.

priority queue

In many applications we need data in order, but we do not need all of it sorted at once: the elements are consumed gradually, one at a time, in priority order.

For example: in some event systems, events have priority. When a program processes an event, it only needs to know which event has the highest current priority, and does not need to care about the order of all the queued events.

The priority queue supports two operations:

  1. remove the largest (or smallest) element
  2. insert element

Given these two operations, we can build a sorting algorithm: put all the elements into a priority queue, then remove the smallest (or largest) element one by one. Heap sort is exactly this algorithm, built on a heap-based priority queue.

A priority queue is a set of interface definitions that can be implemented in many ways. We briefly introduce its interface, and then introduce the basic implementation and heap-based implementation.

interface

  1. Insert an element
  2. Remove and return the largest (or smallest) element
  3. Check whether the queue is empty

implementation

The complexity comparison of various implementations:

| Implementation  | Insert complexity | Remove complexity |
|-----------------|-------------------|-------------------|
| sorted array    | N                 | 1                 |
| unordered array | 1                 | N                 |
| heap            | lg N              | lg N              |
| ideal           | 1                 | 1                 |

array implementation

1. Sorted implementation
  • When inserting a new element, shift every larger element back one slot (as in insertion sort), so the array stays sorted after each insertion.
  • To remove the largest element, just remove the last one.
2. Unordered implementation
  • When inserting a new element, append it to the end of the array.
  • To remove the largest element, scan all the elements to find it, then remove it.
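
The unordered variant described above can be sketched as follows. This is only an illustration: the names UnorderedPQ, PQ_CAP, upq_insert and upq_del_max are hypothetical and do not come from the original article, and the fixed capacity is an arbitrary assumption.

```cpp
// Hypothetical fixed capacity, chosen only for this sketch.
#define PQ_CAP 16

// Max-priority queue stored in an unordered array (0-based here).
typedef struct {
    int data[PQ_CAP];
    int len;            // number of elements currently stored
} UnorderedPQ;

// Insert: append at the end, O(1). Returns 1 on success, 0 if the queue is full.
int upq_insert(UnorderedPQ *q, int v) {
    if (q->len >= PQ_CAP) return 0;
    q->data[q->len++] = v;
    return 1;
}

// Remove the maximum: linear scan, O(N). The caller must ensure q->len > 0.
int upq_del_max(UnorderedPQ *q) {
    int best = 0;
    for (int i = 1; i < q->len; i++) {
        if (q->data[i] > q->data[best]) best = i;
    }
    int v = q->data[best];
    q->data[best] = q->data[q->len - 1];  // fill the hole with the last element
    q->len--;
    return v;
}
```

Insertion is constant time, but every removal pays for a full scan, which matches the "1 / N" row of the comparison above.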

heap implementation

The binary heap is a data structure that implements the basic operations of a priority queue well.

definition

When the value of every node in a binary tree is greater than or equal to (or less than or equal to) the values of its child nodes, we call the binary tree heap ordered.

A binary heap is a heap-ordered complete binary tree.

A complete binary tree can be represented by an array instead of nodes linked by pointers. Put the nodes of the binary tree into the array level by level: the root goes at position 1, its children at positions 2 and 3, their children at positions 4, 5 and 6, 7, and so on.

Storing a binary heap in an array this way follows a simple rule (with the root at position 1): for a node at position k, its parent is at k/2 (integer division) and its children are at 2k and 2k+1.

Tips: Position 0 of the array is not used here. It can be used, but the parent and child calculations become less intuitive: the parent of k is (k-1)/2 and its children are 2k+1 and 2k+2. In practice, slot 0 can also serve as a sentinel that simplifies the boundary checks in many loops.
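
The two indexing schemes can be written down as small helpers. The function names below are hypothetical, introduced only to put the formulas side by side; integer division is assumed throughout.

```cpp
// 1-based layout (slot 0 unused): the root is at index 1.
static inline int parent1(int k) { return k / 2; }
static inline int left1(int k)   { return 2 * k; }
static inline int right1(int k)  { return 2 * k + 1; }

// 0-based layout: the root is at index 0.
static inline int parent0(int k) { return (k - 1) / 2; }
static inline int left0(int k)   { return 2 * k + 1; }
static inline int right0(int k)  { return 2 * k + 2; }
```

For example, in the 1-based layout the node at position 3 has children at 6 and 7 and its parent at 1.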

With this rule, a binary tree can be placed in an array.

A binary heap whose root is the largest element is called a max heap; one whose root is the smallest element is called a min heap.

Here, we discuss max heap.
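
To make the max heap definition concrete, here is a minimal sketch of a check for the heap-order property on a 1-based array. The function name is_max_heap is hypothetical and not part of the original article.

```cpp
// Returns 1 if a[1..len] satisfies the max heap property, 0 otherwise.
// a[0] is unused, matching the 1-based layout described above.
int is_max_heap(const int *a, int len) {
    for (int k = 2; k <= len; k++) {
        if (a[k / 2] < a[k]) return 0;  // a parent is smaller than its child
    }
    return 1;
}
```

Checking every node against its parent is enough, because heap order is defined purely by parent and child pairs.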

heap algorithm

Two heap operations are the most fundamental:

  1. When a new node is added at the bottom of the heap, heap order may be broken. We need to restore it from bottom to top. This operation is generally called heap_swim (the element floats up).
  2. When the root element is replaced with a smaller element, we need to restore heap order from top to bottom. This operation is generally called heap_sink (the element sinks).

Other operations on the heap depend on these two operations. Let's understand these two first, and the rest is much simpler.

Heap ordering from top to bottom

Call the node to be moved k.

Algorithmic process:

  1. Compare k with the larger of its two child nodes; if k is smaller, swap it with that child.
  2. Repeat until neither child is larger than k, or k has no children.
  3. The heap is ordered again.

The implementation is as follows:

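```cpp
// Sink a[k] until neither of its children is larger.
// The heap occupies a[1..len]; a[0] is unused.
// swap(a, i, j) exchanges a[i] and a[j]; it is assumed to be defined elsewhere.
void _heap_sink(int *a, int len, int k){
    while (2*k <= len){
        int j = 2*k;                          // left child
        if (j < len && a[j] < a[j+1]) ++j;    // pick the larger child
        if (a[k] >= a[j]) break;              // heap order restored
        swap(a, k, j);
        k = j;
    }
}
```

Heap ordering from bottom to top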

Call the node to be moved k.

Algorithmic process:

  1. Compare k with its parent node; if k is larger, swap it with the parent.
  2. Repeat until the parent is not smaller than k, or k reaches the root.
  3. The heap is ordered again.

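The implementation is as follows:

```cpp
// Float a[k] up until its parent is not smaller.
// len is not used here; it is kept so the signature mirrors _heap_sink.
void _heap_swim(int *a, int len, int k){
    while (k > 1 && a[k/2] < a[k]){
        swap(a, k/2, k);
        k = k/2;
    }
}
```

heap initialization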

Turning an existing unordered array into a heap-ordered one is very simple. There are two ways:

The first uses _heap_swim to scan the array from left to right, floating each element up in turn so that the part scanned so far is always heap ordered. When the scan ends, the whole array is heap ordered.

The second does less work: use _heap_sink on only the first half of the elements, scanning from right to left. The second half can be skipped because those elements are leaves, that is, sub-heaps of size 1, which are already heap ordered.

Tips: This follows from the shape of a complete binary tree. Let the total number of nodes be N and use integer division. When N is even, the number of leaf nodes is N/2; when N is odd, it is N/2+1. Therefore, regardless of whether N is odd or even, the number of non-leaf nodes is N/2.

The implementation is as follows:
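```cpp
// Build a max heap in a[1..len] by sinking every non-leaf node,
// from the last one (len/2) back to the root.
void heap_init(int *a, int len){
    for (int k = len/2; k >= 1; k--){
        _heap_sink(a, len, k);
    }
}
```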
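
For comparison, the first approach (scanning left to right and floating each element up) could be sketched as below. The names swap_at, swim and heap_init_swim are hypothetical; the helpers are repeated here only so the block is self-contained.

```cpp
// Exchange a[i] and a[j]; assumed to behave like the swap used in the article.
static void swap_at(int *a, int i, int j) {
    int t = a[i];
    a[i] = a[j];
    a[j] = t;
}

// Float a[k] up until its parent is not smaller (1-based array, a[0] unused).
static void swim(int *a, int k) {
    while (k > 1 && a[k / 2] < a[k]) {
        swap_at(a, k / 2, k);
        k /= 2;
    }
}

// Build a max heap in a[1..len] by growing the heap one element at a time:
// after step k, a[1..k] is heap ordered.
void heap_init_swim(int *a, int len) {
    for (int k = 2; k <= len; k++) {
        swim(a, k);
    }
}
```

This version takes O(N lg N) time in the worst case, while the sink-based construction is O(N), which is one reason the sink-based version is usually preferred.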

Insert an element into the heap

The parameter cap is the maximum number of elements the queue can hold.
The parameter len is the current length of the queue.

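```cpp
// Insert v into the heap a[1..len] and return the new length.
// If the queue is full, nothing is inserted and len is returned unchanged.
// a must have room for cap+1 ints, because a[0] is unused.
int heap_insert(int *a, int len, int cap, int v){
    if (len+1 > cap) {
        return len;
    }

    a[len+1] = v;
    _heap_swim(a, len+1, len+1);    // the heap now holds len+1 elements

    return len+1;
}
```

Remove the largest element from the heap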

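Remove and return the largest element in the heap. After the call, the caller should decrement the length of the queue by one.

```cpp
// Remove the maximum from the heap a[1..len] and return it (requires len >= 1).
// The maximum ends up at a[len], just outside the shrunken heap a[1..len-1].
int heap_del_max(int *a, int len) {
    int tmp = a[1];
    swap(a, 1, len);
    _heap_sink(a, len-1, 1);
    return tmp;
}
```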

heap sort

Above, we introduced the purpose and interface of the priority queue, and covered the heap-based implementation in detail. With that knowledge, heap sort is very simple.

If we turn an unordered array into a max heap, the elements come out in order (largest to smallest) as we remove them one at a time. Placing each removed element back into the array from right to left leaves the array sorted in ascending order.

It is also very simple to implement:

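```cpp
// Sort a[1..len] in ascending order; a[0] is unused.
void heap_sort(int *a, int len){
    heap_init(a, len);

    // Each pass moves the current maximum to a[i] and shrinks the heap by one.
    for (int i = len; i > 0; --i){
        heap_del_max(a, i);
    }
}
```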

Here's a little trick: we don't need to copy the removed values back into the array ourselves, because heap_del_max already swaps the largest element to the end of the heap, which is exactly where it belongs.
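
Putting the pieces together, a self-contained sketch of the whole sort might look like this. The names exch, sink and heap_sort_1based are hypothetical stand-ins for the functions discussed above, and the array is assumed to use the 1-based layout with a[0] unused.

```cpp
// Exchange a[i] and a[j].
static void exch(int *a, int i, int j) {
    int t = a[i];
    a[i] = a[j];
    a[j] = t;
}

// Sink a[k] within the heap a[1..len] until neither child is larger.
static void sink(int *a, int len, int k) {
    while (2 * k <= len) {
        int j = 2 * k;                        // left child
        if (j < len && a[j] < a[j + 1]) ++j;  // pick the larger child
        if (a[k] >= a[j]) break;
        exch(a, k, j);
        k = j;
    }
}

// Sort a[1..len] in ascending order. a[0] is never read or written.
void heap_sort_1based(int *a, int len) {
    // Phase 1: build a max heap by sinking every non-leaf node.
    for (int k = len / 2; k >= 1; k--) {
        sink(a, len, k);
    }
    // Phase 2: move the maximum to the end, then repair the shrunken heap.
    for (int i = len; i > 1; --i) {
        exch(a, 1, i);
        sink(a, i - 1, 1);
    }
}
```

The caller is expected to allocate len+1 slots and leave slot 0 unused. The sort works in place and needs no extra array.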

end

Heap sort is actually very simple. The hardest part is understanding the binary heap, which in turn requires understanding binary trees. Due to space limitations, the properties of binary trees are not discussed here; they will be described in detail in a later article.


Author and source ( reposkeeper ) authorized to share By CC-SA 4.0 Creative Commons License
