Biomedical Hypothesis Explainability with Graph-Based Context Retrieval

Ilya Tyagin, Saeideh Valipour, Aliaksandra Sikirzhytskaya, Michael Shtutman, Ilya Safro

TL;DR

The paper presents HGCR, a graph-based retrieval framework for explainable biomedical hypothesis generation. HGCR builds a temporal co-occurrence network from MEDLINE using UMLS CUIs, samples and ranks semantic paths between concepts, and couples these paths with LLM-generated explanations. A novel feedback loop using AGATHA and SemRep validates and refines the explanations by iteratively updating the evidence context to reduce unsupported claims. The approach is evaluated on a temporal benchmark through expert case studies and automated metrics; with the feedback loop active, explanations align more closely with future literature and error rates fall. The work highlights the value of integrating structured knowledge graphs with retrieval-augmented generation for trustworthy, interpretable biomedical hypothesis discovery, and provides a publicly available implementation.

Abstract

We introduce an explainability method for biomedical hypothesis generation systems, built on top of the novel Hypothesis Generation Context Retriever framework. Our approach combines semantic graph-based retrieval and relevant data-restrictive training to simulate real-world discovery constraints. Integrated with large language models (LLMs) via retrieval-augmented generation, the system explains hypotheses with contextual evidence using published scientific literature. We also propose a novel feedback loop approach, which iteratively identifies and corrects flawed parts of LLM-generated explanations, refining both the evidence paths and supporting context. We demonstrate the performance of our method with multiple large language models and evaluate the explanation and context retrieval quality through both expert-curated assessment and large-scale automated analysis. Our code is available at: https://github.com/IlyaTyagin/HGCR.
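
The feedback loop described above can be pictured as a generate, validate, update cycle. The following is a minimal sketch of that control flow only, not the authors' implementation: every name in it (`refine_explanation`, the `toy_*` helpers, `CANDIDATE_PATHS`, `KNOWN_LINKS`, `max_rounds`) is hypothetical. The LLM is replaced by a function that turns a path into a list of links, and the validation step is replaced by a plain set lookup.

```python
def refine_explanation(context, generate, find_unsupported, update_context, max_rounds=3):
    """Regenerate an explanation until no claim is flagged or the round budget is spent."""
    explanation = generate(context)
    for _ in range(max_rounds):
        flawed = find_unsupported(explanation)
        if not flawed:
            break  # every claim passed validation
        # Replace the evidence behind the flawed claims, then try again.
        context = update_context(context, flawed)
        explanation = generate(context)
    return explanation, context


# Toy stand-ins (assumptions for illustration only).
CANDIDATE_PATHS = [["A", "X", "B"], ["A", "Y", "B"]]  # ranked evidence paths
KNOWN_LINKS = {("A", "X"), ("A", "Y"), ("Y", "B")}    # links the validator accepts


def toy_generate(context):
    # "Explanation" = the consecutive links of the currently selected path.
    path = CANDIDATE_PATHS[context]
    return list(zip(path, path[1:]))


def toy_find_unsupported(explanation):
    # Flag every link the validator does not recognise.
    return [link for link in explanation if link not in KNOWN_LINKS]


def toy_update_context(context, flawed):
    # Fall back to the next candidate path, staying within bounds.
    return min(context + 1, len(CANDIDATE_PATHS) - 1)


final_explanation, final_context = refine_explanation(
    0, toy_generate, toy_find_unsupported, toy_update_context
)
print(final_explanation, final_context)
```

In this toy run the first path contains the unsupported link `("X", "B")`, so the loop switches to the second path and stops once nothing is flagged. Passing the three steps in as callables keeps the loop independent of any particular model or validator.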

Paper Structure

This paper contains 63 sections, 12 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Overview diagram of the proposed pipeline based on Hypothesis Generation Context Retriever (HGCR) and self-correction loop for context refinement.
  • Figure 2: Overview of the proposed HGCR framework. Input is a hypothesis: a pair of biomedical concepts $(m_i, m_j)$. The system samples paths from the network $G$ and ranks them based on their predicted alignment with the future reference context.
  • Figure 3: Distribution of positive and negative path counts per query in the test set. Queries $q$ were extracted from the timestamp $t$ corresponding to 2022 (and onwards) and the shortest paths between them were sampled from $G_{t-1}$, representing the graph state of 2021.
  • Figure 4: Relationship between HGCR Context Score (horizontal axis) and semantic similarity (MedCPT Dot Product, vertical axis) between final LLM-generated explanations (with feedback loop) and future reference scientific abstracts.
  • Figure 5: Different metrics across context sizes ($k$) in the ablation study. Larger context size tends to improve similarity metrics.
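
The temporal setup in the captions of Figures 2 and 3 (queries taken from year $t$, shortest paths sampled from $G_{t-1}$) can be illustrated with a small sketch. It is a simplified illustration under stated assumptions rather than the HGCR code: the function names, the `(year, concepts)` record format, and the placeholder concept IDs are all hypothetical, and the path-ranking step is omitted.

```python
from collections import defaultdict, deque


def build_cooccurrence_graph(records, cutoff_year):
    """Undirected co-occurrence graph built only from records dated <= cutoff_year."""
    graph = defaultdict(set)
    for year, concepts in records:
        if year > cutoff_year:
            continue  # temporal restriction: later literature is invisible
        unique = sorted(set(concepts))
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def all_shortest_paths(graph, source, target):
    """Enumerate every shortest path from source to target using BFS."""
    if source not in graph or target not in graph:
        return []
    dist = {source: 0}
    parents = defaultdict(list)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break  # all predecessors of target were processed already
        for nxt in sorted(graph[node]):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                parents[nxt].append(node)
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                parents[nxt].append(node)  # another equally short route
    if target not in dist:
        return []
    paths = []

    def walk(node, suffix):
        # Follow parent links back to the source, collecting full paths.
        if node == source:
            paths.append([source] + suffix)
            return
        for parent in parents[node]:
            walk(parent, [node] + suffix)

    walk(target, [])
    return sorted(paths)


# Placeholder concept IDs; real inputs would be UMLS CUIs from MEDLINE records.
records = [
    (2019, ["C1", "C2"]),
    (2020, ["C2", "C3"]),
    (2021, ["C1", "C4"]),
    (2021, ["C3", "C4"]),
    (2022, ["C1", "C3"]),  # the "future" direct link, hidden when cutoff is 2021
]
paths_2021 = all_shortest_paths(build_cooccurrence_graph(records, 2021), "C1", "C3")
paths_2022 = all_shortest_paths(build_cooccurrence_graph(records, 2022), "C1", "C3")
print(paths_2021)
print(paths_2022)
```

With the 2021 cutoff, the query pair is connected only through intermediate concepts, which is the evidence a retrieval system could use before the direct co-occurrence appears in 2022.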