SRAG: Structured Retrieval-Augmented Generation for Multi-Entity Question Answering over Wikipedia Graph

Teng Lin, Yizhang Zhu, Yuyu Luo, Nan Tang

TL;DR

This paper addresses MEQA by formalizing a Wikipedia graph and proposing SRAG, a two-part architecture that decouples retrieval from reasoning and converts retrieved entities into relational tables for table-based analysis. The Multi-entity Semantic Retrieval component uses SPARQL queries guided by GPT-4 and a semantic analyser to accurately identify entities and properties, while Structured QA (SQA) generates and populates schema-aligned tables, followed by an executor that yields final answers via SQL. The authors introduce MEBench, a Wikipedia-based MEQA benchmark, and demonstrate that SRAG achieves state-of-the-art accuracy on this benchmark, substantially outperforming GPT-4 + RAG baselines across eight subtasks. The results highlight the value of structuring unstructured knowledge to boost LLM reasoning in MEQA and point to future work on enhancing semantic parsing, information extraction, and ambiguous-query handling.

Abstract

Multi-entity question answering (MEQA) poses significant challenges for large language models (LLMs), which often struggle to consolidate scattered information across multiple documents. An example question might be "What is the distribution of IEEE Fellows among various fields of study?", which requires retrieving information from diverse sources such as Wikipedia pages. The effectiveness of current retrieval-augmented generation (RAG) methods is limited by the LLMs' capacity to aggregate insights from numerous pages. To address this gap, this paper introduces a structured RAG (SRAG) framework that systematically organizes extracted entities into relational tables (e.g., tabulating entities with schema columns like "name" and "field of study") and then applies table-based reasoning techniques. Our approach decouples retrieval and reasoning, enabling LLMs to focus on structured data analysis rather than raw text aggregation. Extensive experiments on Wikipedia-based multi-entity QA tasks demonstrate that SRAG significantly outperforms state-of-the-art long-context LLMs and RAG solutions, achieving a 29.6% improvement in accuracy. The results underscore the efficacy of structuring unstructured data to enhance LLMs' reasoning capabilities.
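The table-based reasoning step described in the abstract can be illustrated with a minimal sketch. It assumes the extraction stage has already produced rows with the two schema columns mentioned above ("name" and "field of study"), and it answers the example question with a SQL aggregation. The rows, the table name `fellows`, and the function `field_distribution` are hypothetical and are not taken from the paper.

```python
import sqlite3

# Hypothetical extracted rows (name, field of study); not data from the paper.
rows = [
    ("Person A", "Computer Science"),
    ("Person B", "Electrical Engineering"),
    ("Person C", "Computer Science"),
    ("Person D", "Signal Processing"),
]


def field_distribution(rows):
    """Load the rows into an in-memory table and count entities per field."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE fellows (name TEXT, field_of_study TEXT)")
        conn.executemany("INSERT INTO fellows VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT field_of_study, COUNT(*) AS n "
            "FROM fellows "
            "GROUP BY field_of_study "
            "ORDER BY n DESC, field_of_study"
        )
        return cursor.fetchall()
    finally:
        conn.close()


distribution = field_distribution(rows)
for field, count in distribution:
    print(f"{field}: {count}")
```

Once the entities are in a table, the aggregation is a single `GROUP BY` rather than a pass over many pages of raw text, which is the decoupling of retrieval and reasoning that the abstract argues for.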

Paper Structure

This paper contains 17 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: An overview of multi-entity QA solutions over the Wikipedia graph. (a) Multi-Entity Retrieval: In step a1, a rough SPARQL query is generated using a language model (GPT-4). In step a2, the LLM's semantic parsing is integrated with Wikipedia's API, and the verifiable query accuracy on structured Wikidata is used to accurately identify entities and properties. In step a3, an exact SPARQL query is synthesized. Finally, in step a4, the refined SPARQL query is used to retrieve the relevant entities and web pages. (b) Existing Reasoning Solutions: b1 represents direct responses from LLMs, while b2 combines LLMs with RAG. (c) Our proposal: Structured QA. Initially, in step c1, a language model (GPT-4) is employed to analyze the question and determine the table schema. In step c3, an information extraction module is used to populate the table. Finally, in step c4, the TableQA module is used to derive the final answer.
  • Figure 2: Illustration of a Wikipedia graph snippet.
  • Figure 3: Experimental results for each model across the eight query types.
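
Steps a2 and a3 of Figure 1 (resolving identifiers, then synthesizing an exact SPARQL query) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the template, the function `synthesize_query`, and the item identifier in the usage line are hypothetical, and the identifiers are assumed to have been resolved and verified already. The query patterns (`schema:about`, `schema:isPartOf`, and the `wikibase:label` service) follow the conventions of the Wikidata Query Service.

```python
import re
from string import Template

# Hypothetical template: find entities with a given property value and
# the English Wikipedia article that describes each of them.
SPARQL_TEMPLATE = Template(
    "SELECT ?entity ?entityLabel ?article WHERE {\n"
    "  ?entity wdt:$property_id wd:$value_id .\n"
    "  ?article schema:about ?entity ;\n"
    "           schema:isPartOf <https://en.wikipedia.org/> .\n"
    '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
    "}\n"
    "LIMIT $limit"
)


def synthesize_query(property_id, value_id, limit=100):
    """Fill the template after checking that both IDs look like Wikidata IDs."""
    if not re.fullmatch(r"P\d+", property_id):
        raise ValueError(f"not a Wikidata property ID: {property_id!r}")
    if not re.fullmatch(r"Q\d+", value_id):
        raise ValueError(f"not a Wikidata item ID: {value_id!r}")
    return SPARQL_TEMPLATE.substitute(
        property_id=property_id, value_id=value_id, limit=int(limit)
    )


# P166 is Wikidata's "award received" property; the item ID below is a
# placeholder, not the real identifier of any particular award.
query = synthesize_query("P166", "Q999999999", limit=5)
print(query)
```

Validating the identifiers before substitution keeps a malformed or hallucinated ID from the rough query (step a1) out of the exact query that is actually executed in step a4.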