Contextualizing Generated Citation Texts
Biswadip Mandal, Xiangci Li, Jessica Ouyang
TL;DR
This work targets the misalignment between generated citation texts and their surrounding context in abstractive citation generation. It reframes the task by training models to output the entire context window with the target citation filled, using the context as a prompt to determine topic and stance. Through a Longformer-Encoder-Decoder setup and the CORWA dataset, the authors show that contextualized citations are preferred for relevance and coherence by human evaluators, demonstrating that the approach can be applied to existing models via retraining. The study discusses patterns in contextual cues, acknowledges evaluation limitations, and highlights practical implications for producing contextually grounded citations in NLP research communication.
Abstract
Abstractive citation text generation is usually framed as an infilling task, where a sequence-to-sequence model is trained to generate a citation given a reference paper and the context window around the target; the generated citation should be a brief discussion of the reference paper as it relates to the citing context. However, examining a recent LED-based citation generation system, we find that many of the generated citations are generic summaries of the reference paper's main contribution, ignoring the citation context's focus on a different topic. To address this problem, we propose a simple modification to the citation text generation task: the generation target is not only the citation itself, but the entire context window, including the target citation. This approach can be easily applied to any abstractive citation generation system, and our experimental results show that training in this way is preferred by human readers and allows the generation model to make use of contextual clues about what topic to discuss and what stance to take.
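The difference between the two task framings can be illustrated with a minimal sketch of how one training pair might be built. This is not the authors' preprocessing code: the `[CITATION]` placeholder, the `[SEP]` separator, the `build_example` helper, and the order in which context and reference text are concatenated are all assumptions made for illustration. Only the choice of target (citation alone versus the full context window with the citation filled in) reflects the abstract.

```python
from dataclasses import dataclass

# Hypothetical markers; the paper's actual input format is not given here.
MASK = "[CITATION]"
SEP = " [SEP] "


@dataclass
class Example:
    source: str  # text fed to the encoder
    target: str  # text the decoder is trained to produce


def build_example(before: str, citation: str, after: str,
                  reference_abstract: str,
                  contextualized: bool = True) -> Example:
    """Build one sequence-to-sequence training pair.

    before / after: context text on either side of the target citation.
    citation: the gold citation text.
    reference_abstract: text of the cited paper (assumed here to be its abstract).
    """
    # The encoder input is the same in both framings: the context window
    # with the target citation masked out, plus the reference paper text.
    masked_context = " ".join(s for s in (before, MASK, after) if s)
    source = masked_context + SEP + reference_abstract

    if contextualized:
        # Proposed framing: regenerate the whole window, citation included.
        target = " ".join(s for s in (before, citation, after) if s)
    else:
        # Usual infilling framing: generate only the citation.
        target = citation
    return Example(source=source, target=target)
```

Under these assumptions, switching an existing system to the contextualized framing only changes the target string; the encoder input and the model architecture stay the same, which is consistent with the claim that the approach can be applied to any abstractive citation generation system.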
