Leveraging GANs for citation intent classification and its impact on citation network analysis
Davi A. Bezerra, Filipi N. Silva, Diego R. Amancio
TL;DR
This work tackles citation-intent analysis by adopting a semi-supervised GAN-BERT framework augmented with SciBERT, enabling robust classification with limited labeled data. It demonstrates competitive performance on standard benchmarks (e.g., an $F_1$ of $88.74$ on SciCite and $81.75$ on ACL) while substantially reducing model size relative to state-of-the-art models, which makes it more practical for large-scale corpora. Beyond classification, the paper couples intent detection with network analysis to show that filtering citations by intent can dramatically reshape centrality-based rankings and network structure, highlighting potential biases in traditional bibliometrics. The findings suggest that intent-aware analysis can refine impact measurements and inform more nuanced interpretations of scholarly influence, with implications for research evaluation and bias monitoring.
Abstract
Citations play a fundamental role in the scientific ecosystem, serving as a foundation for tracking the flow of knowledge, acknowledging prior work, and assessing scholarly influence. In scientometrics, they are also central to the construction of quantitative indicators. Not all citations, however, serve the same function: some provide background, others introduce methods, and still others compare results. Understanding citation intent therefore allows for a more nuanced interpretation of scientific impact. In this paper, we adopted a GAN-based method to classify citation intents. Our results revealed that the proposed method achieves competitive classification performance, closely matching state-of-the-art results with substantially fewer parameters. This demonstrates the effectiveness and efficiency of combining GAN architectures with contextual embeddings for the intent classification task. We also investigated whether filtering citation intents affects the centrality of papers in citation networks. Analyzing the network constructed from the unArXiv dataset, we found that paper rankings can be significantly influenced by citation intent. All four centrality metrics examined (degree, PageRank, closeness, and betweenness) were sensitive to the filtering of citation types. Betweenness centrality displayed the greatest sensitivity, showing substantial changes in ranking when specific citation intents were removed.
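To make the intent-filtering idea concrete, the following is a minimal sketch rather than the authors' pipeline. It assumes a hypothetical toy edge list (the paper ids `p1` to `p6` and their intent labels are invented for illustration) and uses only in-degree, the simplest of the four centralities mentioned above, to show how removing one intent class can reorder a ranking.

```python
from collections import Counter

# Hypothetical toy citation edges: (citing paper, cited paper, intent label).
# The data is invented for illustration; it is not taken from the paper.
CITATIONS = [
    ("p1", "p4", "background"),
    ("p2", "p4", "background"),
    ("p3", "p4", "background"),
    ("p1", "p5", "method"),
    ("p2", "p5", "method"),
    ("p3", "p6", "result"),
    ("p4", "p6", "method"),
]


def in_degree(edges, excluded_intents=()):
    """Count incoming citations per paper, skipping the excluded intents."""
    # Start every paper at zero so that filtered-out papers stay in the ranking.
    counts = Counter({paper: 0 for edge in edges for paper in edge[:2]})
    for _citing, cited, intent in edges:
        if intent not in excluded_intents:
            counts[cited] += 1
    return counts


def ranking(counts):
    """Papers sorted by descending count, ties broken by paper id."""
    return sorted(counts, key=lambda paper: (-counts[paper], paper))


full_rank = ranking(in_degree(CITATIONS))
filtered_rank = ranking(in_degree(CITATIONS, excluded_intents={"background"}))

print("all citations:     ", full_rank)
print("without background:", filtered_rank)
```

In this toy network, `p4` is cited only as background, so it leads the unfiltered ranking and falls to the bottom once background citations are removed. The same comparison could in principle be repeated with PageRank, closeness, or betweenness computed by a graph library.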
