ILCiteR: Evidence-grounded Interpretable Local Citation Recommendation

Sayar Ghosh Roy, Jiawei Han

TL;DR

This work introduces the evidence-grounded local citation recommendation task, contributes a novel dataset for it, and demonstrates the efficacy of the proposed conditional neural rank-ensembling approach for re-ranking evidence spans.

Abstract

Existing machine learning approaches for local citation recommendation directly map or translate a query, which is typically a claim or an entity mention, to citation-worthy research papers. Within such a formulation, it is challenging to pinpoint why one should cite a specific research paper for a particular query, leading to limited recommendation interpretability. To alleviate this, we introduce the evidence-grounded local citation recommendation task, where the target latent space comprises evidence spans for recommending specific papers. Using a distantly-supervised evidence retrieval and multi-step re-ranking framework, our proposed system, ILCiteR, recommends papers to cite for a query grounded on similar evidence spans extracted from the existing research literature. Unlike past formulations that simply output recommendations, ILCiteR retrieves ranked lists of (evidence span, recommended paper) pairs. Moreover, previously proposed neural models for citation recommendation require expensive training on massive labeled data, ideally after every significant update to the pool of candidate papers. In contrast, ILCiteR relies solely on distant supervision from a dynamic evidence database and pre-trained Transformer-based language models, without any model training. We contribute a novel dataset for the evidence-grounded local citation recommendation task and demonstrate the efficacy of our proposed conditional neural rank-ensembling approach for re-ranking evidence spans.
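
To make the output format concrete, the toy sketch below returns ranked (evidence span, recommended paper) pairs for a query. It is not the ILCiteR implementation: the evidence database, the paper identifiers, and the function names are hypothetical; Jaccard token overlap stands in for lexical similarity; a bag-of-words cosine stands in for the similarity that a pre-trained language model would provide; and the two scores are combined by a plain average rather than the conditional rank-ensembling rule the paper proposes.

```python
from collections import Counter
from math import sqrt

# Hypothetical evidence database: (evidence span, cited paper id) pairs.
EVIDENCE_DB = [
    ("pre-trained transformer language models improve text classification", "paper_A"),
    ("graph neural networks for citation recommendation", "paper_B"),
    ("transformer language models for ranking documents", "paper_C"),
]


def tokens(text):
    """Lowercase whitespace tokenization (a deliberate simplification)."""
    return text.lower().split()


def lexical_score(query, span):
    """Jaccard overlap between the token sets of query and span."""
    q, s = set(tokens(query)), set(tokens(span))
    union = q | s
    return len(q & s) / len(union) if union else 0.0


def semantic_score(query, span):
    """Bag-of-words cosine; a stand-in for language-model embeddings."""
    q, s = Counter(tokens(query)), Counter(tokens(span))
    dot = sum(q[t] * s[t] for t in q)
    norm = sqrt(sum(v * v for v in q.values())) * sqrt(sum(v * v for v in s.values()))
    return dot / norm if norm else 0.0


def recommend(query, db, top_k=2):
    """Return the top_k (evidence span, paper) pairs by averaged score."""
    scored = []
    for span, paper in db:
        # Assumption: equal-weight average, not the paper's conditional ensembling.
        score = 0.5 * lexical_score(query, span) + 0.5 * semantic_score(query, span)
        scored.append((score, span, paper))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(span, paper) for _, span, paper in scored[:top_k]]


if __name__ == "__main__":
    for span, paper in recommend("transformer language models for ranking", EVIDENCE_DB):
        print(paper, "<-", span)
```

Each recommendation is paired with the evidence span that triggered it, which is what lets a reader see why a paper is suggested. Because nothing is trained, adding a new paper in this sketch only requires appending its evidence spans to the database.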

Paper Structure (27 sections, 3 figures, 9 tables)

This paper contains 27 sections, 3 figures, 9 tables.

Figures (3)

  • Figure 1: An overview of the local citation recommendation task for scientific research papers.
  • Figure 2: An overview of ILCiteR: our proposed system for evidence-grounded citation recommendation.
  • Figure 3: Conditional neural rank ensembling -- re-rank candidate evidence text spans based on lexical and semantic similarity to the query $q$.