Counting Inversions [Medium-Difficulty Algorithm]

[Problem 1]

Count the pairs $(a_i, a_j)$ that satisfy both conditions: [1] $i < j$; [2] $a_i > a_j$. We call such a pair an inversion.
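To make the definition concrete, here is a minimal brute-force sketch that counts inversions straight from the two conditions. The function name countInversionsBrute is my own (hypothetical), and the approach takes $O(n^2)$ time, so it is only meant for small inputs or for cross-checking a faster solution.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts pairs (i, j) with i < j and a[i] > a[j]
// directly from the definition. O(n^2) time.
long long countInversionsBrute(const std::vector<int>& a) {
    long long count = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = i + 1; j < a.size(); ++j) {
            if (a[i] > a[j]) {
                ++count;  // condition [1] holds by construction, [2] was just checked
            }
        }
    }
    return count;
}
```

For a strictly decreasing array of 6 elements such as { 13, 8, 5, 3, 2, 1 }, every pair is an inversion, so the result is 6 × 5 / 2 = 15.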

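To solve this problem, you first need to understand merge sort. Merge sort repeatedly splits an unsorted array into smaller arrays, sorts the small arrays first, and then merges the sorted small arrays back into a larger sorted array.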

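Our focus is the "merge" step.
(Figure: merging two sorted arrays)
During the merge, $a_i$ (from the left half) is compared with $a_j$ (from the right half). If $a_i > a_j$, copy $a_j$ into the temporary array and move $j$ right; otherwise copy $a_i$ into the temporary array and move $i$ right.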

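Whenever $a_i > a_j$ occurs, $i < j$ necessarily holds as well, so we have found an inversion. Moreover, since the left half is already sorted, every element to the right of $i$ in the left half is also greater than $a_j$, so each of them forms an inversion with $a_j$ too. (This is what happens at the "this is tricky" comment in the code.)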
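To look at this counting rule in isolation, here is a small sketch that only counts the inversions between two already-sorted halves, without building the merged array. The helper name countCrossInversions and the use of std::vector are my own assumptions for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts pairs (x, y) with x taken from `left`,
// y taken from `right`, and x > y.
// Both inputs are assumed to be sorted in ascending order.
long long countCrossInversions(const std::vector<int>& left,
                               const std::vector<int>& right) {
    long long count = 0;
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < left.size() && j < right.size()) {
        if (left[i] <= right[j]) {
            ++i;  // left[i] cannot exceed right[j] or anything after it
        } else {
            // left[i] > right[j], and left[i+1..] are at least as large,
            // so all remaining left elements form an inversion with right[j].
            count += static_cast<long long>(left.size() - i);
            ++j;
        }
    }
    return count;
}
```

With left = { 3, 5, 8 } and right = { 1, 4, 9 }, the element 1 is smaller than all three left elements and 4 is smaller than 5 and 8, so the call returns 3 + 2 = 5.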

The code is as follows:

#include <iostream>
#include <vector>
using namespace std;

int _mergeSort(int arr[], int temp[], int left, int right);
int merge(int arr[], int temp[], int left, int mid, int right);

/* This function sorts the input array and returns the
number of inversions in the array */
int mergeSort(int arr[], int array_size)
{
    // A vector is used for the scratch buffer because variable-length
    // arrays (int temp[array_size]) are not standard C++.
    vector<int> temp(array_size);
    return _mergeSort(arr, temp.data(), 0, array_size - 1);
}

/* An auxiliary recursive function that sorts the input array and
returns the number of inversions in the array. */
int _mergeSort(int arr[], int temp[], int left, int right)
{
    int mid, inv_count = 0;		// inv_count is the inversion count
    if (right > left) {
        /* Divide the array into two parts and
        call _mergeSort()
        for each of the parts */
        mid = (right + left) / 2;

        /* Inversion count will be sum of
        inversions in left-part, right-part
        and number of inversions in merging */
        inv_count = _mergeSort(arr, temp, left, mid);
        inv_count += _mergeSort(arr, temp, mid + 1, right);

        /*Merge the two parts*/
        inv_count += merge(arr, temp, left, mid + 1, right);
    }
    return inv_count;
}

/* This function merges two sorted arrays
and returns inversion count in the arrays.*/
int merge(int arr[], int temp[], int left,
          int mid, int right)
{
    int i, j, k;
    int inv_count = 0;

    i = left; /* i is index for left subarray*/
    j = mid; /* j is index for right subarray*/
    k = left; /* k is index for resultant merged subarray*/
    while ((i <= mid - 1) && (j <= right)) {
        if (arr[i] <= arr[j]) {
            temp[k++] = arr[i++];
        }
        else {  // arr[i] > arr[j]
            temp[k++] = arr[j++];

            /* Here is the trick! */
            /* this is tricky -- see above
            explanation/diagram for merge()*/
            inv_count = inv_count + (mid - i);
        }
    }

    /* Copy the remaining elements of left subarray
(if there are any) to temp*/
    while (i <= mid - 1)
        temp[k++] = arr[i++];

    /* Copy the remaining elements of right subarray
(if there are any) to temp*/
    while (j <= right)
        temp[k++] = arr[j++];

    /*Copy back the merged elements to original array*/
    for (i = left; i <= right; i++)
        arr[i] = temp[i];

    return inv_count;
}

// Driver code
int main()
{
    int arr[] = { 13,8,5,3,2,1 };
    int n = sizeof(arr) / sizeof(arr[0]);
    int ans = mergeSort(arr, n);
    cout << "Number of inversions: " << ans << endl;
    return 0;
}

[Problem 2]
Recall the problem of finding the number of inversions. As in the course, we are given a sequence of $n$ numbers $a_1, a_2, \ldots, a_n$, and we define an inversion to be a pair $i < j$ such that $a_i > a_j$.

We motivated the problem of counting inversions as a good measure of how different two orderings are. However, one might feel that this measure is too sensitive. Let's call a pair a significant inversion if $i < j$ and $a_i > 3a_j$. Give an $O(n \log n)$ algorithm to count the number of significant inversions between two orderings.

The array contains $N$ elements ($1 \le N \le 100{,}000$). All elements are in the range from 1 to 1,000,000,000.
In other words, the condition has been changed to $i < j$ and $a_i > 3a_j$.

The technique is similar to Problem 1, except that we can no longer simply add mid - i to inv_count in the else branch:

    if (arr[i] <= arr[j]) {
        temp[k++] = arr[i++];
    }
    else {  // arr[i] > arr[j]
        temp[k++] = arr[j++];
        /* This no longer works! */
        inv_count = inv_count + (mid - i);
    }

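The logic needs to change as follows:
If $a_i > 3a_j$, then every element to the right of $i$ in the left half is also greater than $3a_j$. Add all of them to inv_count, after which the $j$ pointer can safely move right.
Otherwise, $a_i$ itself does not satisfy $a_i > 3a_j$, but elements to the right of $i$ may still do so. We therefore need a new pointer new_i that scans right until it finds the first element greater than $3a_j$; once it is found, that element and everything after it in the left half are added to inv_count.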

if (arr[i] <= arr[j]) {
    temp[k++] = arr[i++];
}
else {  // arr[i] > arr[j]
    temp[k] = arr[j];

    // Find the first element of the left half that is greater than 3 * arr[j]
    for (int new_i = i; new_i < mid; new_i++) {
        if (arr[new_i] > 3 * arr[j]) {
            inv_count += mid - new_i;  // previously: inv_count += mid - i
            break;
        }
    }
    k++;
    j++;
}
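One thing to watch: the new_i scan restarts from i for every j, so in a bad case (say, a left half full of 2s and a right half full of 1s) a single merge can take quadratic time. A possible way around this, sketched here under my own assumptions, is to count in a separate pass before merging. Because the right half is sorted, the first left index whose value exceeds $3a_j$ can only move right as $j$ advances, so the left pointer never needs to reset. The names sortAndCount and countSignificantInversions are hypothetical, and std::inplace_merge stands in for the hand-written merge.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sorts a[lo, hi) in place and returns the number of pairs i < j inside
// that range with a[i] > 3 * a[j] (counted on the original order).
long long sortAndCount(std::vector<int>& a, std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) {
        return 0;  // zero or one element: nothing to count
    }
    const std::size_t mid = lo + (hi - lo) / 2;
    long long count = sortAndCount(a, lo, mid) + sortAndCount(a, mid, hi);

    // Counting pass over the two sorted halves. `i` only moves forward:
    // once a[i] <= 3 * a[j], it is also <= 3 * a[j'] for every later j'.
    std::size_t i = lo;
    for (std::size_t j = mid; j < hi; ++j) {
        while (i < mid && static_cast<long long>(a[i]) <= 3LL * a[j]) {
            ++i;
        }
        count += static_cast<long long>(mid - i);  // a[i..mid) all exceed 3 * a[j]
    }

    // Ordinary merge of the two sorted halves.
    const auto first = a.begin() + static_cast<std::ptrdiff_t>(lo);
    const auto middle = a.begin() + static_cast<std::ptrdiff_t>(mid);
    const auto last = a.begin() + static_cast<std::ptrdiff_t>(hi);
    std::inplace_merge(first, middle, last);
    return count;
}

// Takes the array by value so the caller's copy is left unsorted.
long long countSignificantInversions(std::vector<int> a) {
    return sortAndCount(a, 0, a.size());
}
```

For { 13, 8, 5, 3, 2, 1 } the significant inversions are (13,3), (13,2), (13,1), (8,2), (8,1) and (5,1), so the call returns 6. All comparisons and the counter use 64-bit arithmetic, which matters for values near 1,000,000,000.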

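You may also need to change inv_count to long long to avoid overflow.
In addition, if (arr[new_i] > 3 * arr[j]) { needs to become if ((long long)arr[new_i] > 3L * (long long)arr[j]) { so that 1,000,000,000 × 3 does not exceed the range of int and overflow. (I really did run into this in a contest!)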
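As a small illustration of the overflow point: with a 32-bit int the largest value is 2,147,483,647, while 3 × 1,000,000,000 = 3,000,000,000. The sketch below wraps the comparison in a helper that widens both operands first. The name exceedsTriple is hypothetical and not part of the code above.

```cpp
// Hypothetical helper: returns true when x > 3 * y.
// Both operands are widened to long long before multiplying, so the
// product is computed in 64 bits and cannot overflow for int inputs.
bool exceedsTriple(int x, int y) {
    return static_cast<long long>(x) > 3LL * static_cast<long long>(y);
}
```

With this helper, the condition in the loop could be written as if (exceedsTriple(arr[new_i], arr[j])) {, which keeps the cast in one place.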

Reposted from blog.csdn.net/u010099177/article/details/100710989