leetcode-274-h-index

題目

Given an array of integers citations where citations[i] is the number of citations a researcher received for their ith paper, return the researcher’s h-index.

According to the definition of h-index on Wikipedia: The h-index is defined as the maximum value of h such that the given researcher has published at least h papers that have each been cited at least h times.

題目連結

Example 1

Input: citations = [3,0,6,1,5]
Output: 3
Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively.
Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.

Example 2

Input: citations = [3,0,6,1,5]
Output: 3
Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively.
Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.

Constraints:

n == citations.length
1 <= n <= 5000
0 <= citations[i] <= 1000

解釋題目

給你一個 citations array，index 代表每一篇不同的文章，citations[index] 代表每篇文章被引用的次數，這題要找出至少有 h 篇，被引用 h 次的值，符合的 h 可能會有多個，要找出最大值 h。

思路(1)

蠻重要的概念是要先把 citations array 由小排到大排序，比較好想。
以 [0, 1, 3, 5, 6] 為例，如果遍歷整個 array，可以發觀察到，引用次數 > 0 次的文章篇數有 5 篇 ( len(citations) - i )；引用次數 > 1 次的文章篇數有 4 篇；；引用次數 > 3 次的文章篇數有 3 篇，以此類推。
於是可以整理出以下表格:

索引 i 引用次數 citation len(citations) - i
( 剩下的論文數量 )

0 0 5

1 1 4

2 3 3

3 5 2

4 6 1
所以，每遍歷一次，就可以找出 h-index，也就是取 min(citations[i], n - i)
比較每次遍歷完的 h-index，找出最大的 h-index 就是答案了。

索引 i	引用次數 citation	len(citations) - i ( 剩下的論文數量 )
0	0	5
1	1	4
2	3	3
3	5	2
4	6	1

程式碼 (1)

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        citations.sort()
        n = len(citations)
        h = 0
        for i in range(n):
            temp = min(citations[i], n - i)
            h = max(temp, h)
        return h

思路(2)

另外一個想法，我覺得很酷，是拿 引用次數 跟 剩餘篇數 做比較。
因為 citations 已經由小到大排列好了，所以引用次數會越來越大。當 引用次數 >= 剩餘篇數 時，代表剩餘篇數 ( 假設為 3 篇) 的每一篇的引用次數都至少有 3 次，因為引用次數 >= 剩餘篇數 3，用實際數字比較的話，就會發現: 這不就是 h-index 的定義嗎 ?!

研究人員有 H 篇論文，其中每篇論文的引用次數至少為 H
這個條件可以延伸解釋: 其他論文的引用次數不多於 H，因為要找最大的 H。所以當引用次數 >= 剩餘篇數 時，我要馬上 返回剩餘篇數，之後剩餘篇數 ( 也就是 h-index ) 會越來越小。
以 [0, 1, 3, 5, 6] 為例，如果是遍歷到 1，代表引用次數 1 次的有 4 篇文章 ( 剩餘篇數是 4 )。h-index 不能回傳 4，因為不能保證剩餘篇數的每一篇文章的引用次數都至少 4 次，後面可能還有引用 2 次或 3 次的。而越到後面，剩餘篇數會越少， h-index 自然不會是 4。
因為引用次數會越來大，而剩餘次數會越來越小，所以越往後面遍歷，剩餘次數 ( 就是 h-index ) 也會越來越小。因此，以當引用次數 >= 剩餘篇數時，馬上 return 剩餘篇數即可。

程式碼 (2)

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        n = len(citations)
        citations.sort()

        for i, citation in enumerate(citations):
            if citation >= n - i:
                return n - i

        return 0