H-Index

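Given an array of citation counts for a researcher's papers (each count is a non-negative integer), write a method that computes the researcher's h-index.

Definition of the h-index: "A researcher has index h if h of their N papers have at least h citations each, and the other N - h papers have no more than h citations each."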

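Example:

Input: citations = [3,0,6,1,5]
Output: 3
Explanation: The array means the researcher has 5 papers in total, cited 3, 0, 6, 1 and 5 times respectively. Since 3 of the papers have at least 3 citations each and the remaining two have no more than 3 citations each, the h-index is 3.

Note: If there are several possible values for h, the h-index is the largest of them.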
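
To make the definition concrete, here is a minimal brute-force sketch that checks it directly. The function name hIndexByDefinition is made up for this illustration and is not part of the original problem. It tries candidates from n downward and returns the first h for which at least h papers are cited at least h times. Because it returns the largest such h, at most h papers can be cited more than h times, so the remaining N - h papers automatically have no more than h citations. It runs in O(n^2) and is meant only as a reference to compare the faster solutions against.

```cpp
#include <vector>

// Hypothetical helper: checks the h-index definition directly, in O(n^2) time.
int hIndexByDefinition(const std::vector<int>& citations) {
    const int n = static_cast<int>(citations.size());
    // Try the largest candidate first so the first hit is the answer.
    for (int h = n; h > 0; --h) {
        int atLeastH = 0;  // number of papers cited at least h times
        for (int c : citations) {
            if (c >= h) ++atLeastH;
        }
        if (atLeastH >= h) return h;
    }
    return 0;  // no paper has any citation, or the array is empty
}
```

For citations = [3,0,6,1,5] the candidates 5 and 4 fail (only 2 papers reach those counts) and the candidate 3 succeeds, so the function returns 3.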

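Approach 1: Doing a lot of practice problems really does breed fixed habits of thought, and my first instinct here was a heap. First use a hash table to count how many papers have each citation count, then push each <citation count, number of papers with that count> pair onto a max-heap. Keep a running total count of the papers taken so far and a candidate answer low, which starts at n. Pop every heap entry whose citation count is at least low, adding its paper count to count. If count >= low, return low; otherwise decrease low by 1 and repeat. Because of the heap operations, the time complexity is O(n log n).

Reference code: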

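#include <queue>
#include <unordered_map>
#include <utility>
#include <vector>
using namespace std;

class Solution {
public:
    int hIndex(vector<int>& citations) {
        // count: papers taken from the heap so far; low: current candidate for h.
        int count = 0, low = citations.size();
        // Map each citation count to the number of papers that have it.
        unordered_map<int, int> hash;
        for (auto citation : citations) hash[citation]++;
        // Max-heap ordered by citation count.
        priority_queue<pair<int, int>> pq;
        for (auto it = hash.begin(); it != hash.end(); it++) {
            pq.push(make_pair(it->first, it->second));
        }
        while (!pq.empty()) {
            // Take every paper that is cited at least `low` times.
            while (!pq.empty() && pq.top().first >= low) {
                count += pq.top().second;
                pq.pop();
            }
            if (count >= low) return low;
            if (!pq.empty()) low = low - 1;
        }
        return 0;
    }
};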
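
For comparison, the same O(n log n) bound can be reached without a hash table or a heap by sorting the array directly. The sketch below is an alternative I am adding for illustration, not the approach described above, and the name hIndexBySorting is made up. After sorting in descending order, the paper at 0-based position h is the (h+1)-th most cited, so h can keep growing as long as that paper has at least h + 1 citations.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical alternative: sort descending, then grow h while it stays valid.
// The vector is taken by value so the caller's data is left untouched.
int hIndexBySorting(std::vector<int> citations) {
    std::sort(citations.begin(), citations.end(), std::greater<int>());
    int h = 0;
    // citations[h] is the (h+1)-th most cited paper.
    while (h < static_cast<int>(citations.size()) && citations[h] >= h + 1) {
        ++h;
    }
    return h;
}
```

On [3,0,6,1,5] the sorted order is 6, 5, 3, 1, 0. The first three papers satisfy 6 >= 1, 5 >= 2 and 3 >= 3, the fourth fails 1 >= 4, and the result is 3.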

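Approach 2: This problem can actually be brought down to O(n) time using the idea behind bucket sort. When counting, allocate an array of size n + 1 only (where n is the length of citations), and record every citation count greater than or equal to n at index n. The h-index can never exceed n, so clamping loses nothing. Scanning the array from right to left then visits the counts from largest to smallest, with no sorting needed. The rest follows the same idea as above.

Reference code: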

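#include <vector>
using namespace std;

class Solution {
public:
    int hIndex(vector<int>& citations) {
        int n = citations.size();
        // hash[i] counts papers cited exactly i times;
        // hash[n] counts papers cited n or more times.
        vector<int> hash(n + 1, 0);
        for (auto cita : citations) {
            if (cita >= n) hash[n]++;
            else hash[cita]++;
        }
        // Scan from the largest bucket down; after adding hash[i],
        // count is the number of papers cited at least i times.
        int count = 0;
        for (int i = n; i >= 0; i--) {
            count += hash[i];
            if (count >= i) return i;
        }
        return 0;
    }
};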
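
To see what the bucket array looks like, the counting step can be pulled out on its own. This is only a sketch: citationBuckets is a hypothetical helper name, and it assumes, as the problem states, that every citation count is non-negative.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: builds the n + 1 buckets used by the O(n) approach.
// Assumes every citation count is non-negative.
std::vector<int> citationBuckets(const std::vector<int>& citations) {
    const int n = static_cast<int>(citations.size());
    std::vector<int> buckets(n + 1, 0);
    for (int c : citations) {
        // Counts of n or more all land in the last bucket.
        ++buckets[std::min(c, n)];
    }
    return buckets;
}
```

For citations = [3,0,6,1,5] the buckets are {1, 1, 0, 1, 0, 2}, since both 6 and 5 are clamped to index 5. Scanning from the right, the running count is 2 at i = 5, still 2 at i = 4, and 3 at i = 3, where count >= i first holds, so the answer is 3.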
