The H-Index: good or bad?
Anyone working in academia is well aware of the ubiquity of the h-index. To many professors and graduate students, it is perhaps the most widely used metric for determining the influence of one’s work. This single number is used to convey the influence you have had in your research career, is pivotal to career advancement, and is used in part to determine the relative influence of different academic institutions. Given the ubiquity and power of such an index in the academic sphere, we must pause for a second and ask: is this actually the best method for ranking the merit of different scientists? Have we perhaps learned better alternatives for ranking publications within our own course?
First off, I shall define the h-index:
The h-index of an author, h, is the largest number x such that the author has published x articles that each have at least x citations. In other words, h is the maximum number of publications by a scientist that were each cited at least h times.
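To make the definition concrete, here is a minimal Python sketch of it. The function name h_index and the citation counts in the example are my own illustrative choices, not taken from any particular library or dataset.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank the papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for one author's five papers.
print(h_index([10, 8, 5, 4, 3]))  # 4
```

Four papers have at least 4 citations each, but there are not five papers with at least 5, so the h-index is 4.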
As can be seen, this metric (developed by Jorge Eduardo Hirsch of UCSD in 2005) is meant to measure both the quality and the quantity of a researcher’s work. The inventor, Hirsch himself, proposes that after 20 years of research, an h-index of 20 is good, 40 is outstanding, and 60 is exceptional. A high value is taken as an indicator that a researcher is reliable, consistently engaged in meaningful science and has publications that are widely adopted. However, time and time again, the h-index has proved ineffective at honouring the importance of scientific endeavours.
First, consider the young and exceptional scientist. If, in their short career, they have published 2 great papers with thousands of citations each, their h-index is only 2, no better than that of another scientist who has worked for 20 years and published 20 papers, only 2 of which have 2 citations each (the rest having fewer). There is an implicit ageism in the h-index that works against the interests of meritocracy.
Second, consider a scientist Y who is consistently published by the best journals. The h-index does not discriminate between the authority of different hubs, and the achievement of being published in a great journal is treated as equal to being published in the worst one. The h-index does not take into account the fact that some citations are more impressive than others, and more indicative of meaningful work. It is not fair to treat every referrer with the same sense of credibility.
Third, authors are encouraged by the h-index to produce less important publications that would enhance their index, as the h-index can never exceed the number of articles an author has published. For instance, I could compartmentalise my research into 4 different research papers for a better h-index, even though the ideas might be better expressed in a single research paper. This creates a culture in academia of prioritising quantity: publishing more papers to convey influence, instead of focusing on the quality and merit of the science itself.
I could not help but pause and think, have we learned a better model in our Networks class? Could we not provide a better score than the H-Index?
I came across a most thought-provoking article published by PLOS, a non-profit science and medicine publisher of open-access journals:
The Pagerank-Index: Going beyond Citation Counts in Quantifying Scientific Impact of Researchers
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.013479
They propose using the PageRank algorithm (as discussed in class) to rank publications in the citation network. Each publication (node) receives a PageRank value after this process, which is then distributed among its authors, and each author’s shares are summed into a single total. This total can then be compared to the totals of all other authors to obtain a percentile.
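The author-level step described above can be sketched roughly as follows. This is only an illustration under my own assumptions: the PageRank values are taken as already computed, each paper's value is split equally among its co-authors, and an author's percentile is the share of authors with a strictly lower total. The names author_percentiles, paper_scores and paper_authors, and all the data, are hypothetical.

```python
from collections import defaultdict


def author_percentiles(paper_scores, paper_authors):
    """Sum per-paper scores into per-author totals and rank them as percentiles."""
    totals = defaultdict(float)
    for paper, score in paper_scores.items():
        authors = paper_authors[paper]
        # Assumption: a paper's score is shared equally among its co-authors.
        share = score / len(authors)
        for author in authors:
            totals[author] += share

    n = len(totals)
    percentiles = {}
    for author, total in totals.items():
        # Assumption: percentile = percentage of authors with a strictly lower total.
        below = sum(1 for other in totals.values() if other < total)
        percentiles[author] = 100.0 * below / n
    return dict(totals), percentiles


# Hypothetical, precomputed PageRank values for three papers.
paper_scores = {"p1": 0.5, "p2": 0.3, "p3": 0.2}
paper_authors = {"p1": ["alice", "bob"], "p2": ["bob"], "p3": ["carol"]}

totals, percentiles = author_percentiles(paper_scores, paper_authors)
# Rank the authors from highest to lowest total score.
print(sorted(totals, key=totals.get, reverse=True))  # ['bob', 'alice', 'carol']
```

Bob collects half of p1 plus all of p2, so he ends up ahead of Alice, who only has her half of p1.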
The advantage of doing so is that PageRank can compare the sources of information and determine which references are more trustworthy. As discussed in lectures, the PageRank of a page is calculated recursively and depends on the rank of all pages that link to it. Each page spreads its vote equally among all of its out-links. If a page is linked to by many highly ranked pages, it achieves a high rank.
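The recursive calculation can be approximated by repeatedly passing votes along the links until the values settle. Below is a minimal sketch of that iteration on a toy citation graph. The damping factor of 0.85, the fixed number of iterations and the treatment of papers that cite nothing (their vote is spread over every paper) are my own assumptions, and the graph is made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """links maps each node to the list of nodes it points to (its out-links)."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)

    # Start with every node holding an equal share of the total rank.
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Each node spreads its vote equally among its out-links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a node with no out-links spreads its vote over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: "A cites C", "B cites C", "C cites D", "E cites F".
citations = {"A": ["C"], "B": ["C"], "C": ["D"], "E": ["F"]}
ranks = pagerank(citations)

# D and F are each cited exactly once, but D's citation comes from the well-cited C.
print(ranks["D"] > ranks["F"])  # True
```

A plain citation count would treat D and F identically, whereas here the single citation from a well-cited paper counts for more.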
Here, not all citations are equal, and a publication is important if it is pointed to by other important publications. This is the beauty of PageRank, an elegant solution which we have covered in our course.
In this case, we make the scientific world more meritocratic. We give young authors the potential to be taken seriously if they have already produced valuable works. Further, we give credence to researchers who are being published in amazing scientific journals over mediocre ones. We could also implement a variant of HITS to achieve similar outcomes, and there are a myriad of strategies we have learned in class that could create a fairer academic environment.
In conclusion, the H-index should be forgotten! Let the academic world move forward, and benefit from the might of Networks and Google’s innovation. After all, Google Scholar is one of the most prominent users of the h-index, and the company itself could lead the way by returning to its own early innovations! Let us use the PageRank algorithm to evaluate scientific research in a fair manner!