How good is the h-index?
Ali Borji
TL;DR
The paper scrutinizes the $h$-index as a dominant yet imperfect metric for evaluating individual scientists, highlighting biases, manipulation risks, and field- and database-dependent variation. It surveys alternatives such as the $g$-index and PageRank-based approaches, and presents preliminary observations from an informal LinkedIn poll favoring a balanced emphasis on high-impact work rather than sheer quantity. The author advocates combining quantitative metrics with qualitative assessments and proposes a machine learning-based, human-inspired evaluation framework (e.g., using perceptual-loss concepts) to emulate nuanced scientific judgments. The work underscores the need for multi-faceted evaluation schemes that better reflect true scientific contribution and discourage metric gaming, with broader implications for funding, rankings, and career advancement.
Abstract
The h-index has become a widely used metric for evaluating the productivity and citation impact of researchers. Introduced by physicist Jorge E. Hirsch in 2005, the h-index combines the quantity of a researcher's output (number of publications) with a proxy for its quality (citations) in a single number. Although it is popular for its simplicity and practicality, the h-index has notable limitations. We examine the strengths and weaknesses of this metric and present preliminary experimental results that illustrate these limitations. We also propose a potential solution. The primary aim of this work is to shed light on the shortcomings of the h-index and its implications for ranking scientists, motivating them, allocating funding, and advancing science.
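For concreteness, the two indices named above can be computed from a list of per-paper citation counts. The sketch below is illustrative and is not taken from the paper. It assumes the standard definitions: the h-index is the largest h such that h papers have at least h citations each, and the g-index is the largest g such that the g most-cited papers together have at least g^2 citations. The function names and the sample citation lists are hypothetical, and the g-index here is capped at the number of papers.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break  # counts only decrease from here, so no larger h exists
    return h


def g_index(citations):
    """Largest g such that the top g papers together have >= g**2 citations.

    Assumption: g is capped at the number of papers in the list.
    """
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation records for two researchers.
steady = [10, 8, 5, 4, 3]    # several moderately cited papers
one_hit = [100, 1, 1]        # one highly cited paper

print(h_index(steady), g_index(steady))    # 4 5
print(h_index(one_hit), g_index(one_hit))  # 1 3
```

In the second record, the h-index stays at 1 despite a paper with 100 citations, while the g-index gives that paper some credit. This is one simple instance of how the choice of index changes who is ranked higher.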
