QIME: Constructing Interpretable Medical Text Embeddings via Ontology-Grounded Questions

Yixuan Tang; Zhenghong Lin; Yandong Sun; Wynne Hsu; Mong Li Lee; Anthony K. H. Tung

TL;DR

QIME is an ontology-grounded framework for constructing interpretable medical text embeddings in which each dimension corresponds to a clinically meaningful yes/no question. It also supports a training-free embedding construction strategy that eliminates per-question classifier training while further improving performance.

Abstract

While dense biomedical embeddings achieve strong performance, their black-box nature limits their utility in clinical decision-making. Recent question-based interpretable embeddings represent text as binary answers to natural-language questions, but these approaches often rely on heuristic or surface-level contrastive signals and overlook specialized domain knowledge. We propose QIME, an ontology-grounded framework for constructing interpretable medical text embeddings in which each dimension corresponds to a clinically meaningful yes/no question. By conditioning on cluster-specific medical concept signatures, QIME generates semantically atomic questions that capture fine-grained distinctions in biomedical text. Furthermore, QIME supports a training-free embedding construction strategy that eliminates per-question classifier training while further improving performance. Experiments across biomedical semantic similarity, clustering, and retrieval benchmarks show that QIME consistently outperforms prior interpretable embedding methods and substantially narrows the gap to strong black-box biomedical encoders, while providing concise and clinically informative explanations.
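To make the idea of "one dimension per yes/no question" concrete, the following is a minimal sketch and not the authors' implementation. The question list, the keyword lists, and the helper names (`QUESTIONS`, `answer`, `embed`, `cosine`, `explain`) are all hypothetical. A toy keyword matcher stands in for whatever component actually answers each question in QIME, and the ontology-grounded question generation step is not modeled at all.

```python
from math import sqrt

# Hypothetical questions paired with toy keyword lists. In QIME the questions
# are generated from ontology concepts; these are illustrative only.
QUESTIONS = [
    ("Does the text mention diabetes?", ("diabetes", "diabetic")),
    ("Does the text mention hypertension?", ("hypertension", "high blood pressure")),
    ("Does the text mention insulin therapy?", ("insulin",)),
    ("Does the text mention chest pain?", ("chest pain", "angina")),
]


def answer(text, keywords):
    """Toy stand-in for a question answerer: 1 if any keyword occurs, else 0."""
    lowered = text.lower()
    return int(any(keyword in lowered for keyword in keywords))


def embed(text):
    """Build a binary embedding with one dimension per question."""
    return [answer(text, keywords) for _, keywords in QUESTIONS]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def explain(u, v):
    """Return the questions answered 'yes' for both texts."""
    return [q for (q, _), a, b in zip(QUESTIONS, u, v) if a and b]


if __name__ == "__main__":
    first = embed("Diabetic patient started on insulin.")
    second = embed("Type 2 diabetes with high blood pressure.")
    print(first, second, cosine(first, second), explain(first, second))
```

Because every dimension is tied to a readable question, the similarity between two texts can be traced back to the questions both answer "yes", which is the kind of explanation the abstract refers to.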

Paper Structure (34 sections, 3 equations, 3 figures, 4 tables)

Introduction
Related Work
Black-Box Text Embeddings
Interpretable Text Embeddings
The QIME Framework
Overview
Task Formulation
Ontology-Grounded Question Generation
Semantic Clustering of the Medical Corpus.
Cluster-Level Ontology Grounding.
Grounded Contrastive Question Generation.
Interpretable Medical Embedding Construction
Classifier-based Embedding Construction.
Training-Free Sparse Embedding Construction.
Experiments
...and 19 more sections

Figures (3)

Figure 1: Comparing existing text embedding with the proposed framework.
Figure 2: Overview of the QIME framework.
Figure 3: Effect of the top-$k$ parameter in training-free embedding construction. We report performance on clustering (V-measure), STS (Spearman correlation), and retrieval (nDCG@10) benchmarks.
