Qibitz: Mining PubMed for Repurposable Drugs

David Massart, Marc Zeicher

TL;DR

This work addresses the difficulty of systematically identifying repurposable drugs for a specific pathology through PubMed searches. It introduces Qibitz, a faceted search interface powered by two new specialized indexes: a drug field derived from a Belgian BelMed vocabulary and a gene field built on HGNC symbols, both enriched through MeSH mappings, registry data, and unstructured-text extraction with LLM-based disambiguation. An ETL (extract, transform, load) pipeline automates data ingestion from PubMed into MongoDB and Elasticsearch, enabling a front-end faceted search that surfaces drug–gene and drug–pathology associations. The authors demonstrate the approach with three case studies (mucosal melanoma, Parkinson disease, and mucoepidermoid carcinoma), showing how Qibitz can reveal repurposed add-on therapies and new candidate drugs, thereby accelerating hypothesis generation for clinical investigation. Overall, the paper provides a practical, data-driven workflow for discovering patterns in literature metadata to support drug repurposing and collaboration between researchers and clinicians.

Abstract

PubMed's current search interface makes it tedious to search systematically for medical and research literature on drugs that could potentially be used to treat a given pathology, including in patients with genetically altered tumors. This is because physicians must search separately for each drug-pathology (or drug-gene) combination. To streamline this process, this paper proposes adding a faceted search interface to PubMed. Faceted search is a common feature of e-commerce websites that lets users filter search results by selecting values in different fields. This technology not only saves physicians time and improves the accuracy of their literature searches; presenting search results in this way also makes patterns emerge, which can suggest new treatment options for a given pathology (including in patients with genetically altered tumors).
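To make the idea of faceted search concrete, here is a minimal in-memory sketch in Python. It is not the Qibitz implementation, which the summary above describes as built on MongoDB and Elasticsearch. The record layout, the field names (`pathologies`, `drugs`, `genes`), the helper names, and all values are hypothetical placeholders chosen for illustration, not real PubMed data.

```python
from collections import Counter

# Hypothetical, hand-made records standing in for indexed articles.
# Field names and values are illustrative assumptions only.
ARTICLES = [
    {"pmid": "A1", "pathologies": {"pathology_p"}, "drugs": {"drug_x"}, "genes": {"GENE_A"}},
    {"pmid": "A2", "pathologies": {"pathology_p"}, "drugs": {"drug_x", "drug_y"}, "genes": {"GENE_B"}},
    {"pmid": "A3", "pathologies": {"pathology_p"}, "drugs": {"drug_z"}, "genes": {"GENE_A"}},
    {"pmid": "A4", "pathologies": {"pathology_q"}, "drugs": {"drug_x"}, "genes": set()},
]


def select(articles, **selected):
    """Keep the articles whose facet fields contain every selected value."""
    return [
        article
        for article in articles
        if all(value in article[field] for field, value in selected.items())
    ]


def facet_counts(articles, field):
    """Count how many of the given articles mention each value of a facet field."""
    return Counter(value for article in articles for value in article[field])


# One query for a pathology yields counts for every co-mentioned drug,
# instead of one separate search per drug-pathology combination.
hits = select(ARTICLES, pathologies="pathology_p")
drug_facet = facet_counts(hits, "drugs")

# Selecting a drug narrows the result set; the gene facet is then recomputed.
gene_facet = facet_counts(select(hits, drugs="drug_x"), "genes")
```

The point of the sketch is the workflow rather than the data structure: a single pathology query produces a ranked list of associated drugs, and each further selection refines the remaining facets, which is how patterns across drug, gene, and pathology fields can become visible.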

Paper Structure

This paper contains 25 sections, 1 figure, 1 table.

Figures (1)

  • Figure 1: PubMed data acquisition and transformation workflow.