Enhancing Peer Review in Astronomy: A Machine Learning and Optimization Approach to Reviewer Assignments for ALMA

John M. Carpenter, Andrea Corvillón, Nihar B. Shah

TL;DR

The paper addresses the scalability of ALMA's peer review by combining Latent Dirichlet Allocation–based topic modeling of proposals with reviewer expertise inferred from each reviewer's own proposal history, then solving a leximin fairness optimization (adapted from PeerReview4All) to assign proposals to reviewers. In the Cycle 10 deployment, the median proposal–reviewer similarity was $0.71$ (up from $0.20$ in Cycle 9), the share of reviewers reporting expertise in their assigned proposals rose from $45\%$ to $65\%$, no proposals required reassignment due to significant mismatches, and $3$–$5$ days of manual effort were saved. The approach demonstrates robust automation for large-scale, equitable reviewer assignments, with manageable topology trade-offs and clear pathways for future enhancement, including addressing manipulation risks and advancing NLP with transformer-based models. This work has broad implications for modernizing scientific peer review in astronomy and other fields facing similar scalability pressures.

Abstract

The increasing volume of papers and proposals that undergo peer review emphasizes the pressing need for greater automation to effectively manage the growing scale. In this study, we present the deployment and evaluation of machine learning and optimization techniques to assign proposals to reviewers that were developed for the Atacama Large Millimeter/submillimeter Array (ALMA) during the Cycle 10 Call for Proposals issued in 2023. Using topic modeling algorithms, we identify the proposal topics and assess reviewers' expertise based on their previous ALMA proposal submissions. We then apply an adapted version of the assignment optimization algorithm from PeerReview4All (Stelmakh et al. 2021) to maximize the alignment between proposal topics and reviewer expertise. Our evaluation shows a significant improvement in matching reviewer expertise: the median similarity score between the proposal topic and reviewer expertise increased by 51 percentage points compared to the previous cycle, and the percentage of reviewers reporting expertise in their assigned proposals rose by 20 percentage points. Furthermore, the assignment process proved highly effective in that no proposals required reassignment due to significant mismatches, resulting in a savings of 3 to 5 days of manual effort.
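The pipeline described in the abstract (topic vectors for proposals, expertise vectors for reviewers, then an assignment that favors the worst-matched proposal first) can be illustrated with a toy sketch. This is not the authors' implementation: the topic vectors are invented, cosine similarity is an assumed choice of similarity measure, each proposal gets exactly one reviewer, and the brute-force search over permutations only stands in for the PeerReview4All-style optimizer, which is designed for realistic problem sizes.

```python
from itertools import permutations
from math import sqrt
from statistics import median


def cosine(u, v):
    """Cosine similarity between two topic-weight vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def leximin_assignment(sim):
    """Brute-force leximin: one reviewer per proposal.

    Each candidate assignment is scored by its similarities sorted in
    ascending order; comparing those tuples lexicographically prefers the
    assignment whose worst match is best, then its second-worst, and so on.
    Only feasible for tiny inputs.
    """
    n = len(sim)
    best, best_key = None, None
    for perm in permutations(range(n)):
        key = tuple(sorted(sim[p][perm[p]] for p in range(n)))
        if best_key is None or key > best_key:
            best, best_key = perm, key
    return best


# Hypothetical topic distributions (rows sum to 1), e.g. from an LDA model.
proposal_topics = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
# Hypothetical reviewer expertise vectors, e.g. derived from past proposals.
reviewer_topics = [[0.1, 0.7, 0.2], [0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]

# sim[p][r] is the similarity between proposal p and reviewer r.
sim = [[cosine(p, r) for r in reviewer_topics] for p in proposal_topics]

# assignment[p] is the index of the reviewer chosen for proposal p.
assignment = leximin_assignment(sim)
median_similarity = median(sim[p][r] for p, r in enumerate(assignment))
```

The median of the assigned similarities is the same kind of summary statistic the paper reports when comparing Cycle 9 with Cycle 10.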

Paper Structure

This paper contains 12 sections, 2 equations, 5 figures.

Figures (5)

  • Figure 1: Flowchart of the proposal assignment process used by ALMA in Cycle 10, starting from the proposal submission by the PI and ending with the reviewer providing their rankings and self-assessment of their expertise on individual assignments.
  • Figure 2: Comparison of scientific categories for the closest matching proposals identified using the Latent Dirichlet Allocation model, alongside the category of the submitted proposal. The x-axis represents the categories of the submitted proposals, while the y-axis displays the categories of the ten most similar proposals. Each column is normalized to unity. The results are shown for proposals submitted to ALMA Cycle 10. For each submitted category, the closest matching proposals identified by the machine learning model are found within the same category.
  • Figure 3: Cumulative distributions of the similarities between the reviewer's expertise and the topic of the assigned proposal in Cycle 8 (left) and Cycle 9 (right), when categories and keywords were used to assign proposals. The similarities were computed retroactively to evaluate the machine learning algorithm. The results are shown for proposal assignments where reviewers indicated they were experts (solid curve), had some knowledge of the proposal (dashed), or had little/no knowledge of the proposal (dotted). Similarities tend to be higher when reviewers indicate expertise in the proposal, suggesting that these similarities have predictive value for assessing the suitability of reviewer assignments.
  • Figure 4: Histogram of the similarities between the reviewer expertise and the assigned proposal for Cycle 9 under the old algorithm (blue, hatched histogram), and in Cycle 10 for the new algorithm (orange, dotted histogram). With the new assignment process adopted in Cycle 10, the median similarity increased to 0.71 from 0.20 in Cycle 9, indicating a higher level of expertise in the proposal assignments was achieved with the new assignment process (see the subsection on assignments).
  • Figure 5: Histogram of the reviewers' self-assessed expertise on their proposal assignments in Cycles 8–10. With the new assignment algorithm implemented in Cycle 10, the percentage of reviewers reporting expertise in their assigned proposals increased by 20 percentage points compared to the previous cycle, while the percentage of those declaring little or no knowledge was reduced by half.
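The tabulation behind Figure 2 (categories of the most similar proposals versus the category of the submitted proposal, with each column normalized to unity) can be sketched as follows. The function name, the category labels, and the similarity values are all hypothetical, and the pairwise similarity matrix is assumed to be given; the figure itself uses the ten most similar proposals, while this toy example uses `k=2` because it has only four proposals.

```python
from collections import Counter, defaultdict


def category_match_matrix(categories, sim, k):
    """Tabulate neighbor categories against submitted categories.

    Returns a dict keyed by (neighbor_category, submitted_category) whose
    values are fractions; for a fixed submitted category (one column of the
    figure) the fractions sum to 1.
    """
    counts = defaultdict(Counter)  # counts[submitted_cat][neighbor_cat]
    n = len(categories)
    for i in range(n):
        # Rank every other proposal by similarity to proposal i, keep the top k.
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: sim[i][j], reverse=True)[:k]
        for j in nearest:
            counts[categories[i]][categories[j]] += 1

    matrix = {}
    for submitted_cat, counter in counts.items():
        total = sum(counter.values())
        for neighbor_cat, count in counter.items():
            matrix[(neighbor_cat, submitted_cat)] = count / total
    return matrix


# Hypothetical data: four proposals in two categories, symmetric similarities.
categories = ["A", "A", "B", "B"]
pairwise_sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]

matrix_k1 = category_match_matrix(categories, pairwise_sim, k=1)
matrix_k2 = category_match_matrix(categories, pairwise_sim, k=2)
```

A matrix concentrated on the diagonal, as `matrix_k1` is here, corresponds to the caption's observation that the closest matches fall within the submitted proposal's own category.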