SCIRGC: Multi-Granularity Citation Recommendation and Citation Sentence Preference Alignment

Xiangyu Li, Jingqiang Chen

TL;DR

The paper addresses the challenge of automatically generating accurate, well-contextualized citations by proposing SciRGC, a two-module framework that first retrieves citation-appropriate articles and then generates inference-based citation sentences at precise locations. The retrieval module blends local context encoding with global citation-network collaborative filtering and a reranker that accounts for citation intent. The generation module employs Chain-of-Thought reasoning with LoRA-based fine-tuning and Direct Preference Optimization to align outputs with human preferences, alongside a new CITEVAL multidimensional evaluation framework. Experimental results on multiple datasets show improved citation retrieval accuracy and higher-quality, human-aligned citation sentences, with reduced inference costs compared to large language models.
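As a rough illustration of the preference-alignment step named above, the standard Direct Preference Optimization loss for one preference pair can be sketched as follows. This is the generic published DPO objective, not necessarily the exact training setup used for SciRGC; the function name `dpo_loss`, its argument names, and the default `beta` are assumptions made for this example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Generic DPO loss for a single (chosen, rejected) pair.

    Inputs are summed token log-probabilities of each citation sentence
    under the trainable policy and under a frozen reference model.
    All names and the default beta are hypothetical, for illustration only.
    """
    # How much more the policy favors each sentence than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) computed in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

The loss falls below log 2 once the policy favors the preferred sentence more strongly than the reference model does, and rises above it in the opposite case.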

Abstract

Citations are crucial in scientific research articles because they highlight the connection between the current study and prior work. However, finding suitable papers to cite and writing the citing sentences is often time-consuming for researchers. In this study, we propose the SciRGC framework, which automatically recommends articles to cite and generates citation sentences for citation locations within an article. The framework addresses two key challenges in academic citation generation: 1) how to accurately identify the author's citation intent and find relevant papers to cite, and 2) how to generate high-quality citation sentences that align with human preferences. In the citation article recommendation module, we improve recommendation accuracy by incorporating citation networks and sentiment intent. In the citation sentence generation module, we generate reasoning-based citation sentences using the original article's abstract, the local context, the citation intent, and the recommended articles as inputs. Additionally, we propose a new evaluation metric to fairly assess the quality of generated citation sentences. In comparisons with baseline models and in ablation experiments, the SciRGC framework improves the accuracy and relevance of citation recommendations and produces citation sentences that are appropriate in context, providing a valuable tool for interdisciplinary researchers.

Paper Structure

This paper contains 25 sections, 6 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: In the process of citation generation, it is first necessary to infer the citation intent, then to find the citation article, and finally to generate the reasoning-based citation sentence.
  • Figure 2: Process for implementing citation recommendation and generation in the SciRGC framework
  • Figure 3: Two collaborative filtering algorithms. Dashed boxes mark two papers that are similar, and single arrows indicate that one paper cites another.
  • Figure 4: The Encoder proposed in the Recall phase consists of a paragraph encoder and a document encoder, where different weights are assigned to different paragraphs by adding paragraph types in the document encoder section.
  • Figure 5: The three-phase citation generation framework proposed in this paper
  • ...and 2 more figures