Building an Explainable Graph-based Biomedical Paper Recommendation System (Technical Report)
Hermann Kroll, Christin K. Kreutz, Bill Matthias Thang, Philipp Schaer, Wolf-Tilo Balke
TL;DR
This work tackles scalable, explainable biomedical paper recommendation by introducing XGPRec, a graph-based system that represents each paper as a concept interaction graph and explains recommendations via shared graph patterns. It uses a two-stage retrieval pipeline: the first stage retrieves candidates (FSConcept, FSNode, or FSCore), and the second stage scores them by blending graph overlap with BM25 text scoring. The graph-based explanations are highlighted in the UI. On a large-scale biomedical corpus (~37M MEDLINE documents) and standard benchmarks (PM2020, Genomics, RELISH), XGPRec achieves recall and precision comparable to or higher than PubMed's recommendations in several settings, while offering explainable graph-based justifications and faster first-stage performance. The results support the practical value of graph-based explainability in digital libraries and provide a foundation for further work on explanations and hybrid recommendation strategies. The code is openly available for adoption by other digital libraries.
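The two-stage idea described above can be illustrated with a minimal Python sketch. It is not the XGPRec implementation. The names `Paper`, `first_stage`, `graph_overlap`, and `recommend`, the Jaccard overlap measure, the weight `alpha`, and the example data are all assumptions made for illustration. The first-stage filter is a simplified stand-in rather than FSConcept, FSNode, or FSCore, and the text scores are assumed to be precomputed BM25 values normalized to [0, 1].

```python
from dataclasses import dataclass

# An edge is a (subject concept, predicate, object concept) statement.
Edge = tuple[str, str, str]


@dataclass(frozen=True)
class Paper:
    """Hypothetical paper record: an id plus its concept interaction graph."""
    paper_id: str
    edges: frozenset[Edge]


def concepts(paper: Paper) -> set[str]:
    """All concepts that occur as subject or object in the paper's graph."""
    return {c for s, _, o in paper.edges for c in (s, o)}


def first_stage(query: Paper, corpus: list[Paper]) -> list[Paper]:
    """Assumed candidate filter: keep papers sharing at least one concept."""
    q = concepts(query)
    return [p for p in corpus
            if p.paper_id != query.paper_id and q & concepts(p)]


def graph_overlap(query: Paper, cand: Paper) -> float:
    """Jaccard overlap of edge sets (an assumed measure, not the paper's)."""
    union = query.edges | cand.edges
    return len(query.edges & cand.edges) / len(union) if union else 0.0


def recommend(query, corpus, text_scores, alpha=0.5):
    """Blend graph overlap with a normalized text score; return shared
    edges alongside each score so they can serve as the explanation."""
    ranked = []
    for cand in first_stage(query, corpus):
        score = (alpha * graph_overlap(query, cand)
                 + (1 - alpha) * text_scores.get(cand.paper_id, 0.0))
        ranked.append((cand.paper_id, score, query.edges & cand.edges))
    ranked.sort(key=lambda r: r[1], reverse=True)
    return ranked


# Toy data (invented for this sketch).
q = Paper("q", frozenset({("metformin", "treats", "diabetes"),
                          ("metformin", "induces", "lactic acidosis")}))
a = Paper("a", frozenset({("metformin", "treats", "diabetes")}))
b = Paper("b", frozenset({("aspirin", "treats", "headache")}))
c = Paper("c", frozenset({("insulin", "treats", "diabetes")}))

results = recommend(q, [q, a, b, c], text_scores={"a": 0.4, "c": 0.6})
for paper_id, score, shared in results:
    print(paper_id, round(score, 2), sorted(shared))
```

In this toy run, paper `b` is dropped in the first stage because it shares no concept with the query, and paper `a` ranks above `c` because it shares a full edge with the query. Returning the shared edges with each score is what makes the recommendation explainable: the overlap itself is the justification shown to the user.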
Abstract
Digital libraries provide different access paths, allowing users to explore their collections. For instance, paper recommendation suggests literature similar to a selected paper. Implementing such recommendation is often cost-intensive, especially if neural methods are applied. Additionally, it is hard for users to understand or guess why a recommendation should be relevant for them. That is why we tackle the problem from a different perspective. We propose XGPRec, a graph-based and thus explainable method, which we integrate into our existing graph-based biomedical discovery system. Moreover, we show that XGPRec (1) can, in terms of computational costs, manage a real digital library collection with 37M documents from the biomedical domain, (2) performs well on established test collections and concept-centric information needs, and (3) generates explanations that proved to be beneficial in a preliminary user study. We share our code so that other libraries can build upon XGPRec.
