CardioEmbed: Domain-Specialized Text Embeddings for Clinical Cardiology

Richard J. Young, Alice M. Matthews

TL;DR

CardioEmbed shows that embeddings specialized on comprehensive clinical textbooks substantially improve cardiology-focused semantic retrieval. The authors fine-tune a strong foundation model (Qwen3-Embedding-8B) on ~150,000 cardiology textbook sentences using contrastive learning with an InfoNCE loss, EOS pooling, and parameter-efficient LoRA adapters. The resulting model achieves near-perfect cardiology retrieval (Acc@1 of 99.60%), substantially outperforming PubMed-centric baselines, while remaining competitive on broader biomedical benchmarks (MTEB BIOSSES 0.7748 Spearman, SciFact 0.609 NDCG@10). The work suggests that textbook-based domain specialization can meaningfully improve clinical information retrieval and decision-support tasks, though real-world deployment requires further validation and integration with clinical reasoning systems.

Abstract

Biomedical text embeddings have primarily been developed using research literature from PubMed, yet clinical cardiology practice relies heavily on procedural knowledge and specialized terminology found in comprehensive textbooks rather than research abstracts. This research-practice gap limits the effectiveness of existing embedding models for clinical applications in cardiology. This study trained CardioEmbed, a domain-specialized embedding model based on Qwen3-Embedding-8B, using contrastive learning on a curated corpus of seven comprehensive cardiology textbooks totaling approximately 150,000 sentences after deduplication. The model employs InfoNCE loss with in-batch negatives and achieves 99.60% Acc@1 on cardiac-specific semantic retrieval tasks, a +15.94 percentage point improvement over MedTE, the current state-of-the-art medical embedding model. On MTEB medical benchmarks, the model obtained 0.77 Spearman on BIOSSES and 0.61 NDCG@10 on SciFact, indicating competitive performance on related biomedical domains. These results indicate that domain-specialized training on comprehensive clinical textbooks can yield near-perfect cardiology retrieval.
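The InfoNCE objective with in-batch negatives mentioned in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' training code: the function names, the toy 2-dimensional vectors, and the temperature value of 0.05 are all assumptions made for the example. For each anchor, its paired sentence is the positive and every other pair's sentence in the batch serves as a negative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.05):
    """Mean InfoNCE loss over a batch.

    positives[i] is the positive for anchors[i]; every positives[j]
    with j != i acts as an in-batch negative for anchors[i].
    The temperature is an assumed value, not taken from the paper.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Negative log-probability of picking the true positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch (hypothetical embeddings): each anchor is close to its own positive.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]

aligned_loss = info_nce(anchors, positives)
# Reversing the positives pairs each anchor with the wrong sentence.
mismatched_loss = info_nce(anchors, positives[::-1])
```

In this toy batch the correctly paired loss is much smaller than the mismatched one, which is the behavior contrastive fine-tuning exploits: minimizing the loss pulls paired sentences together and pushes the rest of the batch apart.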

Paper Structure

This paper contains 18 sections, 1 equation, 6 figures, 2 tables.

Figures (6)

  • Figure 1: MTEB medical benchmark performance visualization for CardioEmbed (higher is better). CardioEmbed achieved 0.77 Spearman correlation on BIOSSES (biomedical similarity) and 0.61 NDCG@10 on SciFact (scientific verification). Performance on NFCorpus (0.20 NDCG@10, general medical retrieval) is shown for comparison. Color zones indicate Strong (green, >0.6), Moderate (orange, 0.3 to 0.6), and areas requiring improvement (red, <0.3).
  • Figure 2: Cardiology retrieval performance comparison across five embedding models (higher is better). CardioEmbed achieves 99.60% Acc@1, a +15.94 percentage point improvement over MedTE (the state-of-the-art medical embedding model).
  • Figure 3: Retrieval accuracy at different ranks (higher is better). CardioEmbed (fine-tuned) achieved 99.6% to 100% across all ranks, while the base model showed lower accuracy at all rank thresholds.
  • Figure 4: Mean Reciprocal Rank (MRR) comparison across embedding models. CardioEmbed achieves 0.9976 MRR, approaching perfect ranking performance (1.0) and outperforming MedTE (0.8611), GTE-Base (0.9401), MedEmbed (0.9313), and base Qwen3-8B (0.9506). Higher MRR indicates that correct matches are consistently ranked at or near the top position.
  • Figure 5: Performance improvement waterfall showing incremental contributions to CardioEmbed's final accuracy (higher is better). Starting from the MedTE baseline (83.66%), switching to the Qwen3-8B foundation model provided a gain of +10.17 percentage points, and cardiology-specific fine-tuning added a further +5.77 percentage points, for a final accuracy of 99.60%. The waterfall shows that both foundation model selection and domain specialization contribute substantially to the final outcome.
  • ...and 1 more figure
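The Acc@k and MRR metrics reported in the figure captions above can be computed from per-query similarity scores as follows. This is a hedged sketch: the helper names, the tie-handling rule, and the toy score rows are assumptions for illustration and are not taken from the paper's evaluation code.

```python
def rank_of_correct(scores, correct_index):
    """1-based rank of the correct candidate; ties count against it."""
    target = scores[correct_index]
    better_or_equal = sum(
        1 for j, s in enumerate(scores) if j != correct_index and s >= target
    )
    return 1 + better_or_equal


def accuracy_at_k(ranks, k):
    """Fraction of queries whose correct candidate is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries; 1.0 means every match is ranked first."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical similarity scores: one row per query, one column per candidate.
score_rows = [
    [0.9, 0.2, 0.1],  # correct candidate 0 is ranked 1st
    [0.3, 0.8, 0.5],  # correct candidate 1 is ranked 1st
    [0.6, 0.7, 0.4],  # correct candidate 0 is ranked 2nd
    [0.2, 0.5, 0.9],  # correct candidate 2 is ranked 1st
]
correct_indices = [0, 1, 0, 2]

ranks = [rank_of_correct(row, c) for row, c in zip(score_rows, correct_indices)]
acc_at_1 = accuracy_at_k(ranks, 1)
acc_at_2 = accuracy_at_k(ranks, 2)
mrr = mean_reciprocal_rank(ranks)
```

With three of four queries ranked first and one ranked second, this toy set gives Acc@1 of 0.75, Acc@2 of 1.0, and MRR of 0.875. An MRR close to 1.0, as in Figure 4, therefore means nearly every correct match sits at the top position.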