Towards Faithful Industrial RAG: A Reinforced Co-adaptation Framework for Advertising QA

Wenwei Li; Ming Xu; Tianle Xia; Lingxiang Hu; Yiding Sun; Linfang Shang; Liqun Liu; Peng Shu; Huan Yu; Jie Jiang

TL;DR

The paper proposes a reinforced co-adaptation framework that jointly optimizes retrieval and generation through two components: (1) Graph-aware Retrieval (GraphRAG), which models entity-relation structure over a high-citation knowledge subgraph for multi-hop, domain-specific evidence selection; and (2) evidence-constrained reinforcement learning via Group Relative Policy Optimization (GRPO) with multi-dimensional rewards covering faithfulness, style compliance, safety, and URL validity.

Abstract

Industrial advertising question answering (QA) is a high-stakes task in which hallucinated content, particularly fabricated URLs, can lead to financial loss, compliance violations, and legal risk. Although Retrieval-Augmented Generation (RAG) is widely adopted, deploying it in production remains challenging because industrial knowledge is inherently relational, frequently updated, and insufficiently aligned with generation objectives. We propose a reinforced co-adaptation framework that jointly optimizes retrieval and generation through two components: (1) Graph-aware Retrieval (GraphRAG), which models entity-relation structure over a high-citation knowledge subgraph for multi-hop, domain-specific evidence selection; and (2) evidence-constrained reinforcement learning via Group Relative Policy Optimization (GRPO) with multi-dimensional rewards covering faithfulness, style compliance, safety, and URL validity. Experiments on an internal advertising QA dataset show consistent gains across expert-judged dimensions including accuracy, completeness, and safety, while reducing the hallucination rate by 72%. A two-week online A/B test demonstrates a 28.6% increase in like rate, a 46.2% decrease in dislike rate, and a 92.7% reduction in URL hallucination. The system has been running in production for over half a year and has served millions of QA interactions.
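
To make the reward side of the abstract concrete, the following is a minimal sketch of how several per-dimension scores could be folded into one scalar and then turned into GRPO-style group-relative advantages (each sampled response's reward, standardized against the other responses drawn for the same query). The weighted sum, the weight values, and the names `WEIGHTS`, `combined_reward`, and `group_relative_advantages` are assumptions for illustration; the abstract does not state how the four reward dimensions are aggregated.

```python
from statistics import mean, pstdev

# Hypothetical weights: the source names the four reward dimensions
# but does not say how they are combined or weighted.
WEIGHTS = {
    "faithfulness": 0.4,
    "style": 0.2,
    "safety": 0.2,
    "url_validity": 0.2,
}


def combined_reward(scores):
    """Collapse per-dimension scores (assumed to lie in [0, 1]) into one scalar."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of responses to the same query.

    This is the group-relative normalization used by GRPO:
    advantage_i = (r_i - mean(r)) / (std(r) + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation over the group
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In use, one would score each of the sampled responses for a query with `combined_reward`, pass the resulting list to `group_relative_advantages`, and feed the advantages into the policy-gradient update. A response that fabricates a URL would receive a lower `url_validity` score and therefore a below-average advantage within its group.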

Paper Structure (40 sections, 2 equations, 9 figures, 4 tables, 1 algorithm)

Introduction
Methodology
Problem Formulation
Graph-aware Retrieval
High-Citation Knowledge Base.
GraphRAG Architecture.
Parallel Retrieval Architecture.
Evidence-constrained Generation
Experiments
Experimental Setting
Dataset.
Evaluation Protocol.
Models.
Main Results
GraphRAG Effectiveness
...and 25 more sections

Figures (9)

Figure 1: Traditional QA vs. our approach over a shared knowledge base. Given the same user query and knowledge items A, B, C, D, traditional methods often yield incomplete, hallucinated, over-generated, or verbose answers. Our method produces an exact answer that remains complete, faithful, and concise.
Figure 2: System overview. Given a user query $q$ and a private knowledge base $K$, the retrieval system constructs an evidence set $D$ via two parallel channels: a GraphRAG channel over a high-citation knowledge base $K_h$ and a traditional RAG channel with query rewriting and BGE + BM25 hybrid retrieval. Results are merged and deduplicated. The RL-tuned generator then produces a response optimized by GRPO with multi-dimensional rewards for faithfulness, style compliance, safety, and URL validity.
Figure 3: Knowledge recall enhancement across Base RAG, GraphRAG, and Parallel retrieval. Effective chunks per query and recall effectiveness in percent.
Figure 4: Training dynamics of multi-dimensional reward components during RL.
Figure 5: FaithEval generalization: accuracy (%) on Inconsistent, Unanswerable, Counterfactual, and Overall.
...and 4 more figures
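
The merge-and-deduplicate step described in the Figure 2 caption can be sketched as follows. This is an illustrative sketch only: the chunk representation (a dict with an `id` key), the choice to deduplicate by that identifier, the ordering (GraphRAG results first), and the function name `merge_evidence` are all assumptions, since the caption says only that results from the two channels are merged and deduplicated.

```python
def merge_evidence(graph_hits, hybrid_hits):
    """Merge two retrieval result lists and drop duplicate chunks.

    Assumes each hit is a dict with a unique "id" key (hypothetical schema).
    The first occurrence of each id is kept, so GraphRAG hits take
    precedence over hits from the traditional RAG channel.
    """
    seen = set()
    merged = []
    for chunk in [*graph_hits, *hybrid_hits]:
        key = chunk["id"]
        if key not in seen:
            seen.add(key)
            merged.append(chunk)
    return merged
```

Keeping the first occurrence preserves each channel's internal ranking; a production system might instead re-rank the merged set, which the caption does not specify.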
