Revisiting the Solution of Meta KDD Cup 2024: CRAG

Jie Ouyang, Yucong Luo, Mingyue Cheng, Daoyu Wang, Shuo Yu, Qi Liu, Enhong Chen

TL;DR

The paper tackles the challenge of reliable QA with Retrieval-Augmented Generation (RAG) by evaluating and improving RAG under the CRAG benchmark. It introduces a routing-based, domain- and dynamism-aware three-stage pipeline (retrieval, augmentation, generation) featuring Domain and Dynamism Routers, specialized handling of Web Pages and Mock APIs, and generation-time strategies like Chain-of-Thought and In-context Learning. Key contributions include a comprehensive retrieval/augmentation framework, extensive ablation studies, and practical insights into managing dynamic information and API integration in RAG. The approach demonstrates strong performance on CRAG Task 2 and Task 3, highlighting improvements in accuracy and reductions in hallucinations, with discussions on cognitive evaluation and scalable API integration for real-world deployment.

Abstract

This paper presents the solution of our team APEX in the Meta KDD Cup 2024: CRAG (Comprehensive RAG Benchmark) Challenge. The CRAG benchmark addresses the limitations of existing QA benchmarks in evaluating the diverse and dynamic challenges faced by Retrieval-Augmented Generation (RAG) systems. It provides a more comprehensive assessment of RAG performance and contributes to advancing research in this field. We propose a routing-based, domain- and dynamism-adaptive RAG pipeline, which applies specific processing for the diverse and dynamic nature of each question in all three stages: retrieval, augmentation, and generation. Our method performed strongly on CRAG and ranked 2nd on Tasks 2 and 3 of the final competition leaderboard. Our implementation is available at https://github.com/USTCAGI/CRAG-in-KDD-Cup2024.

Paper Structure

This paper contains 24 sections, 3 figures, and 4 tables.

Figures (3)

  • Figure 1: The overall pipeline of our solution.
  • Figure 2: The pipeline of Web Retriever.
  • Figure 3: The pipeline of API Extractor.