CDF-RAG: Causal Dynamic Feedback for Adaptive Retrieval-Augmented Generation

Elahe Khatibi, Ziyu Wang, Amir M. Rahmani

TL;DR

CDF-RAG addresses the reliance of traditional retrieval-augmented generation on correlation-driven retrieval by injecting explicit causal reasoning into the retrieval and generation loop. It introduces a causal knowledge graph built with UniCausal and GPT-4 validation, a PPO-trained RL agent for dynamic query refinement, and a dual-path retrieval mechanism that fuses semantic and causal signals. A causal graph check and a hallucination-correcting module enforce consistency and grounding in the final outputs. Across CosmosQA, MedQA, MedMCQA, and AdversarialQA, CDF-RAG achieves state-of-the-art accuracy, with stronger causal grounding and groundedness across several LLM backbones, demonstrating the practical value of integrating structured causal reasoning into RAG pipelines.

Abstract

Retrieval-Augmented Generation (RAG) has significantly enhanced large language models (LLMs) in knowledge-intensive tasks by incorporating external knowledge retrieval. However, existing RAG frameworks primarily rely on semantic similarity and correlation-driven retrieval, limiting their ability to distinguish true causal relationships from spurious associations. This results in responses that may be factually grounded but fail to establish cause-and-effect mechanisms, leading to incomplete or misleading insights. To address this issue, we introduce Causal Dynamic Feedback for Adaptive Retrieval-Augmented Generation (CDF-RAG), a framework designed to improve causal consistency, factual accuracy, and explainability in generative reasoning. CDF-RAG iteratively refines queries, retrieves structured causal graphs, and enables multi-hop causal reasoning across interconnected knowledge sources. Additionally, it validates responses against causal pathways, ensuring logically coherent and factually grounded outputs. We evaluate CDF-RAG on four diverse datasets, demonstrating its ability to improve response accuracy and causal correctness over existing RAG-based methods. Our code is publicly available at https://github.com/elakhatibi/CDF-RAG.
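
To make "multi-hop causal reasoning" and "validating responses against causal pathways" concrete, here is a minimal sketch in Python. It is not the authors' implementation. The toy graph, the function names `find_causal_path` and `is_causally_supported`, and the `max_hops` parameter are all assumptions introduced for illustration. The sketch treats the causal graph as a directed adjacency map and accepts a claimed cause-effect pair only if a directed path of bounded length connects the two.

```python
from collections import deque

# Hypothetical toy causal graph (illustration only): cause -> direct effects.
CAUSAL_GRAPH = {
    "smoking": ["tar buildup", "hypertension"],
    "tar buildup": ["lung damage"],
    "lung damage": ["chronic cough"],
    "hypertension": ["stroke"],
}


def find_causal_path(graph, cause, effect, max_hops=3):
    """Breadth-first search for a directed path from cause to effect.

    Returns the shortest path as a list of nodes, or None if no path
    with at most max_hops edges exists.
    """
    queue = deque([[cause]])
    visited = {cause}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == effect:
            return path
        if len(path) - 1 >= max_hops:
            continue  # hop budget used up, do not expand this path further
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


def is_causally_supported(graph, claimed_pairs, max_hops=3):
    """Accept a response only if every claimed (cause, effect) pair has a path."""
    return all(
        find_causal_path(graph, cause, effect, max_hops) is not None
        for cause, effect in claimed_pairs
    )


if __name__ == "__main__":
    print(find_causal_path(CAUSAL_GRAPH, "smoking", "chronic cough"))
    print(is_causally_supported(CAUSAL_GRAPH, [("chronic cough", "smoking")]))
```

Because edges are directed, the reversed claim ("chronic cough" causes "smoking") finds no path and is rejected, which is the distinction a purely similarity-based retriever cannot make. A real system would also have to extract the claimed pairs from generated text, which this sketch leaves out.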

Paper Structure

This paper contains 34 sections, 8 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Rethinking Retrieval-Augmented Generation (RAG). (a) Traditional RAG pipelines rely on static queries and keyword- or similarity-based retrieval, often retrieving topically related but causally irrelevant content, which can result in hallucinated or incoherent outputs. (b) CDF-RAG addresses these limitations through reinforcement learning-based query refinement, dual-path retrieval combining semantic vector search with causal graph traversal, and causal-consistent generation, leading to improved factuality and reasoning.
  • Figure 2: Overview of CDF-RAG Framework. (a) The CDF-RAG pipeline refines user queries (LLM + RL), retrieves structured causal and unstructured textual knowledge, applies knowledge rewriting, and ensures factual consistency through causal verification. (b) The PPO-trained query refinement agent optimizes retrieval coverage and causal consistency.
  • Figure 3: Groundedness comparison of different methods across four LLMs on the MedQA dataset.
  • Figure 4: Ablation study of CDF-RAG across incremental stages. Left: performance metrics including CRC, SRS, groundedness, and F1 score. Right: HR, where lower values indicate greater factual consistency.
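
The dual-path retrieval described in the Figure 1 caption (semantic vector search combined with causal graph traversal) can be sketched as follows. This is a hedged illustration, not the paper's method. The linear fusion rule, the weight `alpha`, the function `dual_path_rank`, the passage fields, and the toy data are all assumptions, since the text above does not specify how the two signals are combined.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def dual_path_rank(query_vec, query_entities, passages, causal_edges, alpha=0.5):
    """Rank passages by a weighted sum of semantic and causal scores.

    Assumed scoring (illustration only):
      semantic score = cosine similarity between query and passage vectors
      causal score   = 1.0 if some query entity is a direct cause of some
                       passage entity in causal_edges, otherwise 0.0
    Returns (passage_id, fused_score) pairs, best first.
    """
    ranked = []
    for passage in passages:
        semantic = cosine(query_vec, passage["vec"])
        causal = 1.0 if any(
            (q, e) in causal_edges
            for q in query_entities
            for e in passage["entities"]
        ) else 0.0
        ranked.append((passage["id"], alpha * semantic + (1 - alpha) * causal))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical data: A is topically closest, B is causally linked.
    passages = [
        {"id": "A", "vec": [1.0, 0.0], "entities": {"cigarette prices"}},
        {"id": "B", "vec": [0.6, 0.8], "entities": {"lung damage"}},
    ]
    edges = {("smoking", "lung damage")}
    for pid, score in dual_path_rank([1.0, 0.0], {"smoking"}, passages, edges):
        print(pid, round(score, 3))
```

With these toy inputs, passage A is the closest match by similarity alone, but passage B ranks first once the causal signal is included. That is the behavior the caption contrasts with "topically related but causally irrelevant" retrieval.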