Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts

Wenyu Huang; Guancheng Zhou; Mirella Lapata; Pavlos Vougiouklis; Sebastien Montella; Jeff Z. Pan

TL;DR

This work addresses the difficulty LLMs have in answering questions that involve long-tail facts by grounding the models in non-parametric memories. It introduces LTGen, a benchmark built with a fully automatic, template-free pipeline and comprising two datasets (LTGen-QA and LTGen-Conv), together with a retrieval-augmented evaluation framework. In experiments with GPT-3.5 and LLaMA-2 models, the authors show that prompting with knowledge graph triples yields stronger performance and fewer hallucinations than prompting with passages, and that combining multiple knowledge sources can further reduce hallucinations, although gains in knowledge coverage are inconsistent. The study highlights AMR-based ranking as an effective method for selecting relevant KG triples and identifies entity tagging and grounding reliability as areas for improvement, with implications for safer, better-grounded LLM-based QA systems.

Abstract

Although Large Language Models (LLMs) are effective in performing various NLP tasks, they still struggle to handle tasks that require extensive, real-world knowledge, especially when dealing with long-tail facts (facts related to long-tail entities). This limitation highlights the need to supplement LLMs with non-parametric knowledge. To address this issue, we analysed the effects of different types of non-parametric knowledge, including textual passages and knowledge graphs (KGs). Since LLMs have probably seen the majority of factual question-answering datasets already, to facilitate our analysis, we proposed a fully automatic pipeline for creating a benchmark that requires knowledge of long-tail facts for answering the involved questions. Using this pipeline, we introduce the LTGen benchmark. We evaluate state-of-the-art LLMs in different knowledge settings using the proposed benchmark. Our experiments show that LLMs alone struggle with answering these questions, especially when the long-tail level is high or rich knowledge is required. Nonetheless, the performance of the same models improved significantly when they were prompted with non-parametric knowledge. We observed that, in most cases, prompting LLMs with KG triples surpasses passage-based prompting using a state-of-the-art retriever. In addition, while prompting LLMs with both KG triples and documents does not consistently improve knowledge coverage, it can dramatically reduce hallucinations in the generated content.
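To make the idea of prompting with KG triples concrete, the following is a minimal sketch of how retrieved triples might be linearized and placed in front of a question. It is an illustration only, not the authors' pipeline: the function names `linearize_triples` and `build_prompt`, the prompt wording, and the placeholder entities are all assumptions, and the triple retrieval and ranking steps are out of scope here.

```python
from typing import List, Tuple

# A KG triple as (subject, predicate, object). This representation is an
# assumption made for illustration, not a format taken from the paper.
Triple = Tuple[str, str, str]


def linearize_triples(triples: List[Triple]) -> str:
    """Render each triple on its own line as "(subject, predicate, object)"."""
    return "\n".join(f"({s}, {p}, {o})" for s, p, o in triples)


def build_prompt(question: str, triples: List[Triple]) -> str:
    """Build a knowledge-augmented prompt (hypothetical wording)."""
    knowledge = linearize_triples(triples)
    return (
        "Answer the question using the knowledge below.\n"
        f"Knowledge:\n{knowledge}\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    # Placeholder entities, not real facts.
    example_triples = [
        ("Entity A", "place of birth", "Town B"),
        ("Entity A", "occupation", "botanist"),
    ]
    print(build_prompt("Where was Entity A born?", example_triples))
```

The resulting string would be sent to whichever LLM is under evaluation; a passage-based setting would differ only in replacing the linearized triples with retrieved text.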

Paper Structure (50 sections, 6 equations, 6 figures, 7 tables)

Introduction
Related Works
Knowledge Bases (KBs)
Question Answering Over Knowledge Bases (KBQA)
Understanding Complex Semantics and Syntax
KIG Benchmarks
Long-tail benchmarks
LTGen: Long-Tail Generation Tasks
Benchmark Construction
Long-tail Entity Sampling
Triples Retrieval
Samples Generation
Data Quality Checking
Non-parametric Memories Collection
Unstructured Knowledge from Passages
...and 35 more sections

Figures (6)

Figure 1: Pipeline overview for generating the LTGen-GPT dataset. In the implementation, we use gpt-4-turbo to generate dialogues. Triples highlighted in gold are the gold triples selected by GPT as the external knowledge source for responding to the current query.
Figure 2: Obtaining non-parametric knowledge from KG.
Figure 3: LLMs' performance on the LTGen benchmark with respect to long-tail level.
Figure 4: LLMs' performance on the LTGen benchmark with respect to reference triple numbers.
Figure 5: LLMs' performance when prompted with different sources of external knowledge on the LTGen benchmark with respect to long-tail level.
...and 1 more figure
