Factuality and Transparency Are All RAG Needs! Self-Explaining Contrastive Evidence Re-ranking
Francielle Vargas, Daniel Pedronette
TL;DR
The paper addresses the unreliability of retrieval-augmented generation (RAG) in safety-critical domains by introducing Self-Explaining Contrastive Evidence Re-ranking (CER). CER works in two stages: contrastive retrieval over an evidence-sensitive embedding space trained with a triplet loss, followed by self-explaining re-ranking that produces token-level attributions to expose the evidential reasoning. The re-ranked output then feeds an LLM that generates grounded responses. On clinical trial data, CER improves retrieval accuracy and separates evidential from non-evidential content more clearly, which reduces the potential for hallucinations and increases transparency. This work advances reliable, evidence-based RAG in healthcare and offers a framework for interpretable, evidence-grounded AI pipelines.
Abstract
This extended abstract introduces Self-Explaining Contrastive Evidence Re-ranking (CER), a novel method that restructures retrieval around factual evidence by fine-tuning embeddings with contrastive learning and generating token-level attribution rationales for each retrieved passage. Hard negatives are automatically selected using a subjectivity-based criterion, forcing the model to pull factual rationales closer while pushing subjective or misleading explanations away. As a result, the method creates an embedding space explicitly aligned with evidential reasoning. We evaluated our method on clinical trial reports. Initial experimental results show that CER improves retrieval accuracy, mitigates the potential for hallucinations in RAG systems, and provides transparent, evidence-based retrieval that enhances reliability, especially in safety-critical domains.
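As an illustration of the contrastive objective described in the abstract, the following is a minimal sketch of a triplet loss with subjectivity-based hard-negative selection. It is not the authors' implementation. The cosine distance, the margin value, the threshold, the function names, and the rule "pick the most subjective candidate above the threshold" are all assumptions made for this example; the abstract does not specify them.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss.

    The loss is zero once the anchor is closer to the positive (factual
    rationale) than to the negative by at least `margin`.
    The margin of 0.2 is an assumed value, not taken from the paper.
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def pick_hard_negative(candidates, subjectivity_threshold=0.5):
    """Select a hard negative from (embedding, subjectivity_score) pairs.

    Hypothetical selection rule: keep candidates whose subjectivity score
    meets the threshold and return the embedding of the most subjective one.
    Returns None when no candidate qualifies.
    """
    subjective = [c for c in candidates if c[1] >= subjectivity_threshold]
    if not subjective:
        return None
    return max(subjective, key=lambda c: c[1])[0]


if __name__ == "__main__":
    query = [1.0, 0.0]
    factual = [0.9, 0.1]
    candidates = [([0.2, 0.8], 0.9), ([0.7, 0.3], 0.3)]
    negative = pick_hard_negative(candidates)
    print(triplet_loss(query, factual, negative))
```

In a training setup, the embeddings would come from the model being fine-tuned and the loss would be averaged over a batch and minimised by gradient descent; the sketch only shows how a single triplet is scored.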
