According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each."
For example, given citations = [3, 0, 6, 1, 5], the researcher has 5 papers in total, which received 3, 0, 6, 1, and 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each, and the remaining two have no more than 3 citations each, the h-index is 3.
Note: If there are several possible values for h, the maximum one is taken as the h-index.
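To make the definition concrete, here is a minimal brute-force sketch that checks it directly: try every candidate h from N down to 1 and return the first one for which at least h papers have h or more citations. The class name HIndexBruteForce is only an illustrative choice, and the O(n^2) running time makes this a reference for checking answers rather than a recommended solution.

```java
final class HIndexBruteForce {
    // Returns the largest h such that at least h papers have >= h citations.
    // Trying candidates from largest to smallest applies the "take the maximum" note.
    static int hIndex(int[] citations) {
        if (citations == null) return 0;
        int n = citations.length;
        for (int h = n; h > 0; h--) {
            int atLeastH = 0;
            for (int c : citations) {
                if (c >= h) atLeastH++;
            }
            if (atLeastH >= h) return h;
        }
        return 0; // no paper has any citations, or the array is empty
    }

    public static void main(String[] args) {
        System.out.println(hIndex(new int[] {3, 0, 6, 1, 5})); // prints 3
    }
}
```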
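The h-index, also known as the h-factor, is a metric for evaluating academic achievement. Here h is said to stand for "high citations": a researcher's h-index is the largest h such that h of their papers have each been cited at least h times, while the remaining papers have each been cited no more than h times. In this problem we are given an array whose length is the number of papers and whose elements are the citation counts of the individual papers, and we need to find the h-index.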
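First, set h to 1. Then sort the array and compare its elements with h, starting from the last (largest) element: (1) if the element is greater than h, increment h and move on to the previous element, continuing until an element is less than or equal to h; (2) if the element equals h, return h; (3) if the element is less than h, return h - 1. If every element turns out to be greater than h in step 1, return the length of the array. This works because, when an element is compared with h, exactly h - 1 papers with at least as many citations have already been counted. Since the approach sorts the array, the time complexity is O(n log n). The code is as follows: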
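import java.util.Arrays;

public class Solution {
    public int hIndex(int[] citations) {
        if (citations == null || citations.length == 0) return 0;
        Arrays.sort(citations); // ascending order; note that this sorts the input in place
        int h = 1;
        // Walk from the most cited paper down to the least cited one.
        for (int i = citations.length - 1; i >= 0; i--) {
            if (citations[i] > h) {
                h++;              // this paper supports a larger h, keep going
            } else if (citations[i] == h) {
                return h;         // exactly h papers have at least h citations
            } else {
                return h - 1;     // this paper falls short, so only h - 1 papers qualify
            }
        }
        return citations.length;  // every paper has more citations than the number of papers
    }
}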
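The same sort-based idea can also be written as a scan from the smallest element upward, which some readers find easier to verify: at index i of the sorted array there are n - i papers with at least citations[i] citations, so the first index where citations[i] >= n - i gives the answer. The sketch below is an equivalent variant, not the code from this post; the class name HIndexSortedScan is hypothetical, and it sorts a copy so the caller's array is left untouched.

```java
import java.util.Arrays;

final class HIndexSortedScan {
    static int hIndex(int[] citations) {
        if (citations == null || citations.length == 0) return 0;
        int[] sorted = citations.clone(); // keep the caller's array unchanged
        Arrays.sort(sorted);              // ascending order
        int n = sorted.length;
        for (int i = 0; i < n; i++) {
            int papersFromHere = n - i;   // papers with at least sorted[i] citations
            if (sorted[i] >= papersFromHere) return papersFromHere;
        }
        return 0; // no paper has enough citations for h >= 1
    }

    public static void main(String[] args) {
        System.out.println(hIndex(new int[] {3, 0, 6, 1, 5})); // prints 3
    }
}
```

Scanning upward means the first match is automatically the largest valid h, since the candidate value n - i shrinks as i grows.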
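Another approach solves the problem in O(n) time, at the cost of O(n) extra space. In the spirit of counting sort, create an array count in which count[i] records how many papers have exactly i citations; any paper with len or more citations is counted in count[len], because the h-index can never exceed the number of papers. Then accumulate the counts starting from the last element of count and moving toward the front, and return the current index as soon as the running total is greater than or equal to it. The code is as follows: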
public class Solution {
    public int hIndex(int[] citations) {
        if (citations == null || citations.length == 0) return 0;
        int len = citations.length;
        // count[i] = number of papers with exactly i citations (capped at len)
        int[] count = new int[len + 1];
        for (int i = 0; i < len; i++) {
            if (citations[i] >= len) count[len]++;
            else count[citations[i]]++;
        }
        // Accumulate from the back: count[i] becomes the number of papers with >= i citations.
        for (int i = count.length - 1; i > 0; i--) {
            if (count[i] >= i) return i;
            count[i - 1] += count[i];
        }
        return 0;
    }
}
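To see what the counting approach does with citations = [3, 0, 6, 1, 5], the following sketch splits it into two hypothetical helpers, buckets and suffixSums, so the intermediate arrays can be printed. These names and the class HIndexBucketTrace are illustrative only, and the code assumes all citation counts are non-negative.

```java
import java.util.Arrays;

final class HIndexBucketTrace {
    // count[i] = number of papers with exactly i citations, capped at the paper count.
    static int[] buckets(int[] citations) {
        int len = citations.length;
        int[] count = new int[len + 1];
        for (int c : citations) count[Math.min(c, len)]++;
        return count;
    }

    // result[i] = number of papers with at least i citations.
    static int[] suffixSums(int[] count) {
        int[] suffix = count.clone();
        for (int i = suffix.length - 2; i >= 0; i--) suffix[i] += suffix[i + 1];
        return suffix;
    }

    static int hIndex(int[] citations) {
        if (citations == null || citations.length == 0) return 0;
        int[] atLeast = suffixSums(buckets(citations));
        // The largest index i with at least i papers cited i or more times is the h-index.
        for (int i = atLeast.length - 1; i > 0; i--) {
            if (atLeast[i] >= i) return i;
        }
        return 0;
    }

    public static void main(String[] args) {
        int[] citations = {3, 0, 6, 1, 5};
        int[] count = buckets(citations);
        System.out.println(Arrays.toString(count));             // prints [1, 1, 0, 1, 0, 2]
        System.out.println(Arrays.toString(suffixSums(count))); // prints [5, 4, 3, 3, 2, 2]
        System.out.println(hIndex(citations));                  // prints 3
    }
}
```

Reading the second array from the right, index 3 is the first position whose value (3) is at least the index itself, which matches the expected h-index of 3. Computing all suffix sums up front is slightly more work than the early-return loop in the solution above, but the asymptotic cost stays O(n).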