Text-Driven Neural Collaborative Filtering Model for Paper Source Tracing
Aobo Xu, Bingyu Chang, Qingpeng Liu, Ling Jian
TL;DR
This paper tackles Paper Source Tracing (PST) by reframing it as a recommendation problem over a citation knowledge graph. It introduces a Neural Collaborative Filtering (NCF) model that integrates textual attributes processed by SciBERT, enabling end-to-end learning from both relational structure and content. The approach achieves a Mean Average Precision (MAP) of 0.37814, outperforming the baseline models and demonstrating the value of text-aware recommendation for PST. The work contributes a novel application of NCF to PST, provides ablations highlighting the importance of textual data, and releases its code publicly to facilitate future research and practical adoption. The findings suggest that incorporating graph reasoning is a promising direction for further improving PST performance.
Abstract
Identifying significant references within a citation knowledge graph is challenging because the graph encompasses complex interrelations: connections through citations, authorship, keywords, and other relational attributes. The Paper Source Tracing (PST) task seeks to automate the identification of pivotal references for given scholarly articles using advanced data mining techniques. For the PST track of the KDD CUP OAG-Challenge, we design a recommendation-based framework tailored to the task. This framework employs the Neural Collaborative Filtering (NCF) model to generate final predictions. To process the textual attributes of the papers and extract input features for the model, we use SciBERT, a pre-trained language model. In our experiments, the method achieved a Mean Average Precision (MAP) of 0.37814, outperforming baseline models and ranking 11th among all participating teams. The source code is publicly available at https://github.com/MyLove-XAB/KDDCupFinal.
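Since the reported result is a MAP score, it may help to see how such a score can be computed. The following is a minimal sketch and not the official evaluation script of the challenge. It assumes that each paper comes with a binary label per reference (1 if the reference is a source paper, 0 otherwise) and a predicted score per reference. The function names `average_precision` and `mean_average_precision` are hypothetical and do not come from the released code.

```python
from typing import List, Sequence, Tuple


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision for one paper.

    labels[i] is 1 if reference i is a source paper, else 0 (assumed format).
    scores[i] is the model's predicted importance of reference i.
    """
    # Rank references from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision at the rank where a true source paper is found.
            precision_sum += hits / rank
    # Papers without any labeled source contribute 0 in this sketch.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    papers: List[Tuple[Sequence[int], Sequence[float]]]
) -> float:
    """Mean of per-paper average precision over (labels, scores) pairs."""
    if not papers:
        return 0.0
    return sum(average_precision(l, s) for l, s in papers) / len(papers)


if __name__ == "__main__":
    # Two toy papers with made-up labels and scores (illustrative only).
    toy = [
        ([1, 0, 1], [0.9, 0.8, 0.7]),
        ([0, 1], [0.9, 0.1]),
    ]
    print(round(mean_average_precision(toy), 4))
```

In this sketch a paper with no labeled source reference receives an average precision of 0. An evaluation script could instead skip such papers, which would change the final number.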
