Supporting Evidence-Based Medicine by Finding Both Relevant and Significant Works
Sameh Frihat, Norbert Fuhr
TL;DR
This work tackles the challenge of retrieving medical literature that is both relevant and reliable by automatically assigning a Level of Evidence (LoE) to publications. It develops a suite of classification models, including a Random Forest (RF) and several PubMedBERT-based architectures (single-label, regression, multi-label, and ensemble), trained on a curated Oncology Guidelines dataset enriched with PubMed abstracts; the ensemble achieves a macro-F1 of up to 0.83. It then demonstrates that using LoE as a retrieval filter over large-scale Medline data significantly improves infNDCG, R-Prec, and NDCG@10, particularly for high-evidence documents (LoE1). The results suggest a viable path to integrating LoE-based filtering into search engines like PubMed, enabling clinicians and researchers to access higher-quality evidence more efficiently. The work also notes limitations and directions for future work: assessing study quality beyond study design and incorporating broader evidence sources.
Abstract
In this paper, we present a new approach to improving the relevance and reliability of medical information retrieval (IR), which builds upon the concept of Level of Evidence (LoE). The LoE framework categorizes medical publications into 7 distinct levels based on the underlying empirical evidence. Despite the LoE framework's relevance in medical research and evidence-based practice, only a few medical publications explicitly state their LoE. Therefore, we develop a classification model for automatically assigning LoE to medical publications, which successfully classifies over 26 million documents in the MEDLINE database into LoE classes. The subsequent retrieval experiments on TREC PM datasets show substantial improvements in retrieval relevance when LoE is used as a search filter.
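
To make the idea of using LoE as a search filter concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Hit` record, the `filter_by_loe` helper, the document IDs, and the scores are all hypothetical, and it assumes that each retrieved document already carries a predicted LoE between 1 and 7, with 1 being the strongest evidence.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str   # hypothetical document identifier
    score: float  # relevance score from any ranker (made-up values below)
    loe: int      # predicted Level of Evidence; assumed 1 = strongest, 7 = weakest


def filter_by_loe(hits, max_level):
    """Rank hits by score, then keep only those with LoE <= max_level."""
    if not 1 <= max_level <= 7:
        raise ValueError("max_level must be between 1 and 7")
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    return [h for h in ranked if h.loe <= max_level]


# Illustrative data only; not taken from the paper.
hits = [
    Hit("d1", 0.91, 4),
    Hit("d2", 0.87, 1),
    Hit("d3", 0.80, 2),
    Hit("d4", 0.75, 6),
]

top = filter_by_loe(hits, max_level=2)
print([h.doc_id for h in top])  # ['d2', 'd3']
```

In this sketch the filter is applied after ranking, so the relative order of the surviving documents is unchanged; only documents below the requested evidence level are dropped.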
