Ranking Narrative Query Graphs for Biomedical Document Retrieval (Technical Report)

Hermann Kroll; Pascal Sackhoff; Timo Breuer; Ralf Schenkel; Wolf-Tilo Balke

TL;DR

This work addresses the ranking of graph-based narrative queries over biomedical documents, moving beyond exact-match retrieval to unsupervised ranking driven by graph structure. It introduces GraphRank, which combines several signals (extraction confidence, tf-idf edge scores, concept coverage, and relational similarity) into a single fragment score and then selects the best-scoring fragment per document. It also adds Partial Matches and ontological expansion to improve recall and to handle ontology-driven generalization, all without training data. Evaluations on PM2017-2020 and TREC-COVID show recall and precision gains in concept-centric scenarios, with limitations when queries lack precise domain concepts. The approach integrates directly into existing digital libraries and reduces reliance on supervised learning.
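The scoring idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the weighted sum, the equal weights, and all names (`Fragment`, `fragment_score`, `rank_documents`) are assumptions made for illustration, since the summary names the four signals but not how they are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """One matched graph fragment in a document (hypothetical structure)."""
    doc_id: str
    confidence: float  # extraction confidence
    tfidf: float       # tf-idf edge score
    coverage: float    # concept coverage
    relational: float  # relational similarity


# Assumption: equal weights and a linear combination; the actual
# combination used by GraphRank is not stated in the summary.
WEIGHTS = {"confidence": 0.25, "tfidf": 0.25, "coverage": 0.25, "relational": 0.25}


def fragment_score(fragment, weights=WEIGHTS):
    """Combine the four signals into one score via a weighted sum."""
    return sum(w * getattr(fragment, name) for name, w in weights.items())


def rank_documents(fragments, weights=WEIGHTS):
    """Keep the best-scoring fragment per document, then sort descending."""
    best = {}
    for fragment in fragments:
        score = fragment_score(fragment, weights)
        if score > best.get(fragment.doc_id, float("-inf")):
            best[fragment.doc_id] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)
```

Each document is represented by its single best fragment, so a document with one strong match outranks a document with many weak ones under this sketch.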

Abstract

Keyword-based search is today's standard in digital libraries. Yet complex retrieval scenarios, such as those in scientific knowledge bases, need more sophisticated access paths. Although each document contributes in some way to a domain's body of knowledge, the exact structure between keywords, i.e., their possible relationships, and the contexts spanned within each individual document are crucial for effective retrieval. Following this logic, individual documents can be seen as small-scale knowledge graphs on which graph queries can provide focused document retrieval. We implemented a full-fledged graph-based discovery system for the biomedical domain and demonstrated its benefits in the past. Unfortunately, graph-based retrieval methods generally follow an 'exact match' paradigm, which severely hampers search efficiency, since exact-match results are hard to rank by relevance. This paper extends our existing discovery system and contributes effective graph-based unsupervised ranking methods, a new query relaxation paradigm, and ontological rewriting. With partial matching and ontological rewriting, these extensions allow users to retrieve results with higher precision and higher recall.
