How to Find Fantastic AI Papers: Self-Rankings as a Powerful Predictor of Scientific Impact Beyond Peer Review

Buxin Su, Natalie Collina, Garrett Wen, Didong Li, Kyunghyun Cho, Jianqing Fan, Bingxin Zhao, Weijie Su

TL;DR

The paper tackles the challenge of identifying high-impact AI research amid rapid publication growth by testing authors' self-rankings of their own ICML submissions as predictors of future impact. Using a two-phase, large-scale ICML 2023 experiment, it shows that papers authors ranked highest tend to accumulate about twice as many citations as those ranked lowest, with particularly strong signal for the tail of highly cited work. Self-rankings outperform traditional reviewer scores in predicting future citations and remain robust after controlling for confounds such as preprint timing and self-citations. The work argues that author-derived comparative judgments provide a valuable, low-cost complement to peer review and discusses integrating self-rankings into conference decision-making, validation across additional conferences, and broader implications for scholarly evaluation.

Abstract

Peer review in academic research aims not only to ensure factual correctness but also to identify work of high scientific potential that can shape future research directions. This task is especially critical in fast-moving fields such as artificial intelligence (AI), yet it has become increasingly difficult given the rapid growth of submissions. In this paper, we investigate an underexplored measure for identifying high-impact research: authors' own rankings of their multiple submissions to the same AI conference. Grounded in game-theoretic reasoning, we hypothesize that self-rankings are informative because authors possess unique understanding of their work's conceptual depth and long-term promise. To test this hypothesis, we conducted a large-scale experiment at a leading AI conference, where 1,342 researchers self-ranked their 2,592 submissions by perceived quality. Tracking outcomes over more than a year, we found that papers ranked highest by their authors received twice as many citations as their lowest-ranked counterparts; self-rankings were especially effective at identifying highly cited papers (those with over 150 citations). Moreover, we showed that self-rankings outperformed peer review scores in predicting future citation counts. Our results remained robust after accounting for confounders such as preprint posting time and self-citations. Together, these findings demonstrate that authors' self-rankings provide a reliable and valuable complement to peer review for identifying and elevating high-impact research in AI.

Paper Structure

This paper contains 16 sections, 10 figures, 4 tables.

Figures (10)

  • Figure 1: (a) Illustration of the Phase One survey experiment, with summary statistics (see Section \ref{sec:summery} for details and Figure \ref{fig:survey_interface} for the actual interface). (b) Comparison of citation counts between high- and low-ranked submissions, revealing a statistically significant difference ($P = 1.05 \times 10^{-6}$; see Section \ref{par:rank_vs_avg_ciation}). (c) Proportions of high- and low-ranked submissions, categorized by how well they are cited (i.e., well-cited vs. less-cited; see Section \ref{par:rank_idfy_good}). (d) Mean citation counts for submissions partitioned into high- and low-value groups according to three metrics: self-rankings, pre-rebuttal scores, and post-rebuttal scores. A larger difference in means indicates greater predictive power for the metric (see Section \ref{sec:rank_vs_score}).
  • Figure 2: High-ranked papers received about twice as many citations as low-ranked ones. (Cumulative) average number of citations in each two-month period from July 23, 2023 to November 22, 2024, comparing high-ranked papers (red bars) and low-ranked papers (blue bars), conditional on final accept/reject decisions. Left panel: accepted submissions. Right panel: rejected submissions. Across all intervals and decision categories, high-ranked papers consistently received substantially more citations than low-ranked papers.
  • Figure 3: Most of the well-cited papers fall into the high-ranked category. Empirical complementary cumulative distribution of citation counts, showing the proportion of papers with more than $C$ citations. The left panel presents accepted submissions; the right panel shows rejected ones. At every citation threshold $C$, a higher proportion of high-ranked papers exceed $C$ than low-ranked papers. Average citation counts for high- and low-ranked submissions are highlighted using vertical dashed lines.
  • Figure 4: Average number of citations from July 23, 2023 to November 22, 2024 for high- vs. low-ranked papers, grouped by pre-rebuttal and post-rebuttal review scores in ICML 2023. Among submissions with similar review scores, high-ranked papers still received significantly more citations than low-ranked ones. In particular, review scores failed to identify the most impactful papers, whereas authors' self-rankings succeeded in doing so.
  • Figure S.1: Screenshot of the first author survey. This survey was sent out on January 26, 2023, and closed on February 10, 2023. The 1,342 authors with multiple submissions who provided valid rankings took about 4 days on average to submit the survey: 25% of them submitted within 6.21 hours and 90% within 8.79 days.
  • ...and 5 more figures
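
The empirical complementary cumulative distribution described in the Figure 3 caption (the proportion of papers with more than $C$ citations) can be illustrated with a minimal Python sketch. The function name `eccdf` and the citation counts below are hypothetical, made up purely for illustration; they are not the paper's code or data.

```python
from typing import Sequence


def eccdf(citations: Sequence[int], threshold: int) -> float:
    """Return the fraction of papers with strictly more than `threshold` citations."""
    if not citations:
        raise ValueError("citations must be non-empty")
    return sum(c > threshold for c in citations) / len(citations)


# Made-up toy citation counts (NOT data from the paper).
high_ranked = [3, 12, 40, 160, 7, 25]
low_ranked = [1, 5, 9, 30, 2, 14]

# Compare the two groups at a few citation thresholds C.
for c in (0, 10, 50, 150):
    print(f"C={c}: high={eccdf(high_ranked, c):.3f}, low={eccdf(low_ranked, c):.3f}")
```

In this toy example the high-ranked curve sits at or above the low-ranked curve at every threshold, which is the pattern the caption reports for the real data.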