Bibliometric Data Fusion for Biomedical Information Retrieval

Timo Breuer, Christin Katharina Kreutz, Philipp Schaer, Dirk Tunger

TL;DR

Results on three biomedical retrieval benchmarks from TREC Precision Medicine (TREC-PM) show that bibliometric data fusion is a promising approach to improve retrieval performance in terms of normalized Discounted Cumulated Gain and Average Precision, at the cost of the Precision at 10 rate.

Abstract

Digital libraries in the scientific domain provide users access to a wide range of information to satisfy their diverse information needs. Here, the ranking of results plays a crucial role in user satisfaction. Exploiting bibliometric metadata, e.g., publications' citation counts or bibliometric indicators in general, to automatically identify the most relevant results can boost retrieval performance. This work proposes bibliometric data fusion, which enriches existing systems' results by incorporating bibliometric metadata such as citations or altmetrics. Our results on three biomedical retrieval benchmarks from TREC Precision Medicine (TREC-PM) show that bibliometric data fusion is a promising approach to improve retrieval performance in terms of normalized Discounted Cumulated Gain (nDCG) and Average Precision (AP), at the cost of the Precision at 10 (P@10) rate. Patient users especially profit from this lightweight, data-sparse technique that applies to any digital library.
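
To make the idea of fusing an existing ranking with a bibliometric signal concrete, the following is a minimal sketch, not the authors' implementation. It assumes reciprocal rank fusion (RRF) as the fusion method, which the abstract does not specify; the document IDs, the citation counts, and the helper names `rank_by_signal` and `reciprocal_rank_fusion` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def rank_by_signal(doc_ids, signal):
    """Re-rank doc_ids by a bibliometric signal, highest value first.

    Documents without a value are treated as 0 (an assumption of this sketch).
    Ties are broken by document ID so the output is deterministic.
    """
    return sorted(doc_ids, key=lambda d: (-signal.get(d, 0), d))


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs with reciprocal rank fusion.

    Each document receives 1 / (k + rank) from every list it appears in;
    k=60 is the commonly used default, not a value taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical baseline run (e.g., from a BM25 system) and citation counts.
baseline = ["d1", "d2", "d3", "d4"]
citations = {"d1": 3, "d2": 50, "d3": 0, "d4": 12}

# Build a second ranking of the same documents from the bibliometric signal,
# then fuse it with the baseline ranking.
citation_ranking = rank_by_signal(baseline, citations)
fused = reciprocal_rank_fusion([baseline, citation_ranking])
print(fused)
```

In this toy case the highly cited document `d2` moves ahead of `d1`, which shows how a bibliometric signal can reorder results without changing the underlying retrieval system. Because only ranks are fused, raw citation counts never need to be normalized against retrieval scores.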

Paper Structure

This paper contains 25 sections, 2 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Overview of the analyzed data fusion methods.
  • Figure 2: Methodology based on bibliometric data fusion of rankings and the principle of polyrepresentation.
  • Figure 3: Retrieval effectiveness of fused bibliometric signals including all possible combinations for TREC-PM 2017-2019.
  • Figure 4: Rank fusion-based improvements over the baseline runs for the TREC Precision Medicine Abstract task for 2018 and 2019. BM25 runs marked in orange and named according to an abbreviation of the team's name.
  • Figure 5: Rank fusion-based improvements over the baseline runs for the TREC-PM Abstract task for 2017 with methods using citations (MGB: aCSIROmedMGB, PCB: aCSIROmedPCB, lit5: SIBTMlit5) marked in red. BM25 run marked in orange.
  • ...and 2 more figures