
Graph Neural Network Enhanced Retrieval for Question Answering of LLMs

Zijian Li, Qingyan Guo, Jiawei Shao, Lei Song, Jiang Bian, Jun Zhang, Rui Wang

TL;DR

GNN-Ret is a retrieval method that uses graph neural networks (GNNs) to exploit the relatedness between passages. RGNN-Ret extends it to multi-hop reasoning questions with a recurrent graph neural network (RGNN).

Abstract

Retrieval-augmented generation has revolutionized large language model (LLM) outputs by providing factual support. Nevertheless, it struggles to capture all the necessary knowledge for complex reasoning questions. Existing retrieval methods typically divide reference documents into passages and treat them in isolation. These passages, however, are often interrelated: for example, they may be contiguous or share the same keywords. Recognizing such relatedness is therefore crucial for enhancing the retrieval process. In this paper, we propose a novel retrieval method, called GNN-Ret, which leverages graph neural networks (GNNs) to enhance retrieval by exploiting the relatedness between passages. Specifically, we first construct a graph of passages by connecting passages that are structure-related or keyword-related. A GNN is then used to exploit the relationships between passages and improve the retrieval of supporting passages. Furthermore, we extend our method to handle multi-hop reasoning questions using a recurrent graph neural network (RGNN), named RGNN-Ret. At each step, RGNN-Ret integrates the graphs of passages from previous steps, thereby enhancing the retrieval of supporting passages. Extensive experiments on benchmark datasets demonstrate that GNN-Ret achieves higher question-answering accuracy with a single LLM query than strong baselines that require multiple queries, and that RGNN-Ret further improves accuracy and achieves state-of-the-art performance, with up to 10.4% accuracy improvement on the 2WikiMQA dataset.
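
The graph construction and score refinement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the learned GNN is replaced here by a single parameter-free mean-aggregation step, and the function names (`build_passage_graph`, `propagate_scores`, `top_k`), the mixing weight `alpha`, the toy passages, their keyword sets, and the initial relevance scores are all assumptions made for illustration.

```python
from collections import defaultdict


def build_passage_graph(doc_ids, keywords):
    """Connect passages that are structure-related or keyword-related.

    doc_ids:  document id of each passage, in reading order.
    keywords: one set of keywords per passage (assumed to be pre-extracted).
    Returns an adjacency mapping: passage index -> set of neighbor indices.
    """
    adj = defaultdict(set)
    # Structure-related: contiguous passages from the same document.
    for i in range(len(doc_ids) - 1):
        if doc_ids[i] == doc_ids[i + 1]:
            adj[i].add(i + 1)
            adj[i + 1].add(i)
    # Keyword-related: passages that share at least one keyword.
    by_keyword = defaultdict(list)
    for i, kws in enumerate(keywords):
        for kw in kws:
            by_keyword[kw].append(i)
    for members in by_keyword.values():
        for a in members:
            for b in members:
                if a != b:
                    adj[a].add(b)
    return adj


def propagate_scores(scores, adj, alpha=0.5):
    """One mean-aggregation step (a stand-in for a learned GNN layer).

    Each passage keeps its own score and adds alpha times the mean
    score of its neighbors; isolated passages are left unchanged.
    """
    refined = []
    for i, score in enumerate(scores):
        neighbors = adj.get(i, ())
        mean = sum(scores[j] for j in neighbors) / len(neighbors) if neighbors else 0.0
        refined.append(score + alpha * mean)
    return refined


def top_k(scores, k):
    """Indices of the k highest-scoring passages, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Toy corpus (invented for this sketch). Question: "Where was the director of Film X born?"
doc_ids = ["A", "B", "B", "C"]
keywords = [
    {"Film X", "Jane Doe"},  # 0: "Film X was directed by Jane Doe."
    {"Jane Doe", "Oslo"},    # 1: "Jane Doe was born in Oslo."
    {"Paris"},               # 2: "She later moved to Paris."
    {"Film Y"},              # 3: "Film Y was released in 1999."
]
dense_scores = [0.9, 0.3, 0.1, 0.4]  # assumed query-passage similarities

graph = build_passage_graph(doc_ids, keywords)
refined_scores = propagate_scores(dense_scores, graph)
before = top_k(dense_scores, 2)
after = top_k(refined_scores, 2)
```

In this toy setup, scoring passages in isolation ranks the unrelated passage 3 above passage 1, which holds the answer. After one propagation step, passage 1 inherits relevance from passage 0 through the shared keyword and enters the top two. A trained GNN would learn how much weight each neighbor receives rather than using a fixed `alpha`.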

Paper Structure

This paper contains 30 sections, 19 equations, 5 figures, and 10 tables.

Figures (5)

  • Figure 1: Overview of comparison between dense retrieval and GNN-Ret. The shared keywords and ground-truth answers are highlighted in yellow and green, respectively. By considering the relatedness between passages, GNN-Ret can retrieve all the supporting passages for QA.
  • Figure 2: Accuracy and average number of LLM queries for our proposed methods and baselines on 2WikiMQA.
  • Figure 3: Illustration of GNN-Ret and RGNN-Ret.
  • Figure 4: Accuracy of GNN-Ret and RGNN-Ret with various $K$ and $O$ on the MuSiQue and IIRC datasets. The average accuracy across the two datasets is shown as green points.
  • Figure 5: Exact-match number of test samples with varying numbers of supporting passages required for QA on MuSiQue and 2WikiMQA datasets.