Integrating Large Language Models with Graph-based Reasoning for Conversational Question Answering
Parag Jain, Mirella Lapata
TL;DR
This work tackles conversational question answering over heterogeneous sources, requiring context tracking and robust reasoning across text, infoboxes, tables, and knowledge graphs. It introduces a dynamic graph representation of retrieved evidence and learns graph embeddings that are injected into a large language model, augmented by a memory module that stores past evidence to guide future reasoning. The model is trained end-to-end with cross-entropy, using a two-stage retrieval pipeline (Wikidata via CLOCQ and Wikipedia) to build the evidence graph, and a Graph Attention Network to reason over the graph before querying the LLM. On ConvMix, graph embeddings improve reasoning over multiple sources, and the memory module enhances robustness to noise and retrieval errors, with the best results achieved by Mistral-7B + Graph + Memory.
Abstract
We focus on a conversational question answering task which combines the challenges of understanding questions in context and reasoning over evidence gathered from heterogeneous sources like text, knowledge graphs, tables, and infoboxes. Our method utilizes a graph-structured representation to aggregate information about a question and its context (i.e., the conversation so far and evidence retrieved to find an answer), while also harnessing the reasoning and text generation capabilities of large language models (LLMs). Graph embeddings are directly injected into the LLM, bypassing the token embedding layers, and learned end-to-end by minimizing cross-entropy. Our model maintains a memory module to track and update past evidence, thus influencing the graph's structure as the conversation evolves. Experimental results on the ConvMix benchmark (Christmann et al., 2022a) show that graph embeddings enhance the LLM's ability to reason, while the memory module provides robustness against noise and retrieval errors.
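
To make the injection idea concrete, the following is a minimal, dependency-free sketch of one way node embeddings from a single-head graph attention layer could be projected to the LLM hidden size and placed alongside token embeddings, so that they never pass through the token embedding layer. It is an illustration under stated assumptions, not the authors' implementation: the function names (`gat_layer`, `inject_graph`), the toy dimensions, the random weights, the choice to prepend the graph vectors, and the linear projection `P` are all hypothetical. The memory module and the training loop are omitted.

```python
import math
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]


def gat_layer(h, adj, W, a, slope=0.2):
    """Single-head graph attention layer over node features h (N x d_in).

    adj[i][j] is truthy when node j is a neighbour of node i. Self-loops are
    assumed, so every node attends to at least itself.
    """
    z = matmul(h, W)  # projected node features, N x d
    d = len(z[0])
    # a^T [z_i || z_j] splits into a source term and a target term.
    src = [sum(x * w for x, w in zip(zi, a[:d])) for zi in z]
    dst = [sum(x * w for x, w in zip(zi, a[d:])) for zi in z]
    out = []
    for i in range(len(z)):
        nbrs = [j for j in range(len(z)) if adj[i][j]]
        scores = [src[i] + dst[j] for j in nbrs]
        scores = [s if s > 0 else slope * s for s in scores]  # LeakyReLU
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # stable softmax
        total = sum(weights)
        out.append([
            sum(w / total * z[j][k] for w, j in zip(weights, nbrs))
            for k in range(d)
        ])
    return out


def inject_graph(node_emb, token_emb, P):
    """Project node embeddings to the LLM hidden size and prepend them.

    The result stands in for the input embedding sequence of the LLM; the
    graph vectors are never looked up in a token embedding table.
    Prepending is an assumption made for this sketch.
    """
    graph_prefix = matmul(node_emb, P)  # N x d_model
    return graph_prefix + token_emb     # (N + T) x d_model


def rand_matrix(rng, rows, cols):
    """Toy stand-in for learned weights or embeddings."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy evidence graph: 4 nodes in a chain, with self-loops.
rng = random.Random(0)
N, d_in, d_gat, d_model, T = 4, 8, 6, 16, 5
adj = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
h = rand_matrix(rng, N, d_in)              # initial node features
W = rand_matrix(rng, d_in, d_gat)          # GAT projection
a = [rng.gauss(0.0, 1.0) for _ in range(2 * d_gat)]  # attention vector
P = rand_matrix(rng, d_gat, d_model)       # projection to LLM hidden size
token_emb = rand_matrix(rng, T, d_model)   # stand-in for question token embeddings

node_emb = gat_layer(h, adj, W, a)
llm_inputs = inject_graph(node_emb, token_emb, P)
print(len(llm_inputs), len(llm_inputs[0]))  # 9 16
```

In a trainable version, `W`, `a`, and `P` would be parameters updated by the same cross-entropy loss as the LLM output, which is what allows the graph embeddings to be learned end-to-end.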
