Quantifying gendered citation imbalance in computer science conferences

Kazuki Nakajima, Yuya Sasaki, Sohei Tokuno, George Fletcher

TL;DR

This study investigates gendered citation imbalance in computer science conferences. The authors construct a dedicated OpenAlex–DBLP dataset with author gender and conference-rank metadata and develop a family of reference models that preserve key network properties. The three models (Random-draws, Homophilic-draws, and Preferential-draws) evaluate how the number of citations, homophily in citations, and heterogeneity in citations received shape gendered citation patterns and ranking outcomes. The authors find that citation homophily is strongly associated with gendered citation imbalance, that heterogeneity in citation counts has a relatively minor association with it, and that the imbalance is most pronounced in top-ranked conferences, persists across subfields, and extends to citation-based rankings such as PageRank. The framework links network structure to gendered citation dynamics and suggests fairness-oriented interventions that adjust network processes or rankings, with implications for research evaluation and conference prestige effects.

Abstract

The number of citations received by papers often exhibits imbalances in terms of author attributes such as country of affiliation and gender. While recent studies have quantified citation imbalance in terms of the authors' gender in journal papers, the computer science discipline, where researchers frequently present their work at conferences, may exhibit unique patterns in gendered citation imbalance. Additionally, understanding how network properties in citations influence citation imbalances remains challenging due to a lack of suitable reference models. In this paper, we develop a family of reference models for citation networks and investigate gender imbalance in citations between papers published in computer science conferences. By deploying these reference models, we found that homophily in citations is strongly associated with gendered citation imbalance in computer science, whereas heterogeneity in the number of citations received per paper has a relatively minor association with it. Furthermore, we found that the gendered citation imbalance is most pronounced in papers published in the highest-ranked conferences, is present across different subfields, and extends to citation-based rankings of papers. Our study provides a framework for investigating associations between network properties and citation imbalances, aiming to enhance our understanding of the structure and dynamics of citations between research publications.
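
To make the idea of a reference model concrete, the following is a minimal sketch of a random-draws style baseline. It is not the authors' implementation: the abstract does not specify how the models are constructed, so the sketch assumes that each citing paper keeps its number of outgoing citations while its targets are redrawn uniformly at random, and it ignores publication order. The function names (`random_draws`, `share_to_group`, `relative_imbalance`), the single gender label per paper ("W" or "M"), and the toy data are all hypothetical.

```python
import random
from collections import Counter


def random_draws(citations, papers, rng):
    """Redraw every citation target uniformly at random (assumed baseline).

    Each citing paper keeps its original number of outgoing citations,
    never cites itself, and never cites the same paper twice.
    """
    out_degree = Counter(src for src, _ in citations)
    rewired = []
    for src, k in out_degree.items():
        candidates = [p for p in papers if p != src]
        rewired.extend((src, dst) for dst in rng.sample(candidates, k))
    return rewired


def share_to_group(citations, group, label):
    """Fraction of citations whose cited paper carries the given label."""
    hits = sum(1 for _, dst in citations if group[dst] == label)
    return hits / len(citations)


def relative_imbalance(citations, group, label, n_samples=1000, seed=0):
    """Compare the observed citation share with the baseline average.

    Returns (observed, expected, (observed - expected) / expected).
    A negative third value means the group is cited less than the
    random baseline would suggest.
    """
    rng = random.Random(seed)
    papers = sorted(group)
    observed = share_to_group(citations, group, label)
    expected = sum(
        share_to_group(random_draws(citations, papers, rng), group, label)
        for _ in range(n_samples)
    ) / n_samples
    return observed, expected, (observed - expected) / expected


# Hypothetical toy data: paper id -> gender label, and (citing, cited) pairs.
group = {"a": "W", "b": "M", "c": "M", "d": "W"}
citations = [("a", "b"), ("c", "b"), ("d", "c"), ("a", "c")]

observed, expected, rel = relative_imbalance(citations, group, "W")
print(f"observed={observed:.3f} expected={expected:.3f} relative={rel:+.3f}")
```

A homophilic or preferential variant would change only how `candidates` are weighted inside `random_draws`, so the comparison step can stay the same across baselines.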

Paper Structure

This paper contains 33 sections, 15 equations, 6 figures, 7 tables, and 3 algorithms.

Figures (6)

  • Figure 1: Comparison of structural properties between the original citation network and reference models. The legend 'Original' indicates the result for the original network, 'Random' indicates that for the random-draws model, 'Homophilic' indicates that for the homophilic-draws model, and 'Preferential' indicates that for the preferential-draws model. In Figs. 1(c) and 1(d), we focused on the 55 countries of affiliation and the 625 research topics, each of which has one or more homophilic citations in the original network. We sorted the IDs in descending order by the number of homophilic citations in each case. Curves that completely or heavily overlap are indicated by arrows and labels.
  • Figure 2: Gender imbalance in citations received by conference papers in computer science.
  • Figure 3: Gendered citation imbalance across conference ranks.
  • Figure 4: Gendered citation imbalance across subfields.
  • Figure 5: Gender imbalance in citation-based rankings of conference papers.
  • ...and 1 more figure