KS-LLM: Knowledge Selection of Large Language Models with Evidence Document for Question Answering

Xinxin Zheng, Feihu Che, Jinyang Wu, Shuai Zhang, Shuai Nie, Kang Liu, Jianhua Tao

TL;DR

This work tackles hallucination in knowledge-intensive QA by using evidence documents more selectively. KS-LLM generates triples from the input question, selects the evidence sentences most similar to these triples, and then generates answers from the combination of triples and selected sentences. Fusing structured knowledge with textual evidence outperforms the baselines on TriviaQA-verified, WebQuestions (WebQ), and Natural Questions (NQ) across multiple open-source LLMs, and ablations highlight the importance of limiting evidence length and retrieving a small, fixed number of sentences. The method reduces noise from full-document ingestion and shows practical gains in accuracy and efficiency for open-domain QA tasks that rely on external knowledge.

Abstract

Large language models (LLMs) suffer from the hallucination problem and face significant challenges when applied to knowledge-intensive tasks. A promising approach is to leverage evidence documents as extra supporting knowledge, which can be obtained through retrieval or generation. However, existing methods directly leverage the entire contents of the evidence document, which may introduce noise information and impair the performance of large language models. To tackle this problem, we propose a novel Knowledge Selection of Large Language Models (KS-LLM) method, aiming to identify valuable information from evidence documents. The KS-LLM approach utilizes triples to effectively select knowledge snippets from evidence documents that are beneficial to answering questions. Specifically, we first generate triples based on the input question, then select the evidence sentences most similar to triples from the evidence document, and finally combine the evidence sentences and triples to assist large language models in generating answers. Experimental comparisons on several question answering datasets, such as TriviaQA, WebQ, and NQ, demonstrate that the proposed method surpasses the baselines and achieves the best results.
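
The selection step described in the abstract (pick the evidence sentences most similar to the triples, then combine both for answer generation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the bag-of-words cosine similarity is a toy stand-in for whatever embedding model and vector database the authors use, the triples are concatenated into a single query, and the names `select_evidence` and `build_prompt` are hypothetical. In the paper the triples come from an LLM; here they are written by hand.

```python
import math
import re
from collections import Counter


def bow_vector(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def split_sentences(document):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]


def select_evidence(triples, document, k=2):
    """Return the k sentences most similar to the concatenated triples.

    Assumption: all triples are joined into one query string. Ties are
    broken by original sentence order.
    """
    query = bow_vector(" ".join(" ".join(t) for t in triples))
    sentences = split_sentences(document)
    scored = [(cosine(query, bow_vector(s)), i, s) for i, s in enumerate(sentences)]
    top = sorted(scored, key=lambda item: (-item[0], item[1]))[:k]
    return [s for _, _, s in top]


def build_prompt(question, triples, evidence):
    """Combine triples and selected sentences into one answer-generation prompt."""
    triple_lines = "\n".join("(" + ", ".join(t) + ")" for t in triples)
    evidence_lines = "\n".join(evidence)
    return (
        "Triples:\n" + triple_lines + "\n\n"
        "Evidence:\n" + evidence_lines + "\n\n"
        "Question: " + question + "\nAnswer:"
    )


if __name__ == "__main__":
    question = "Where is the Eiffel Tower located?"
    # Hand-written triple; in KS-LLM an LLM would generate this from the question.
    triples = [("Eiffel Tower", "located in", "Paris")]
    document = (
        "The Eiffel Tower is located in Paris. "
        "It was completed in 1889. "
        "Bananas are rich in potassium."
    )
    evidence = select_evidence(triples, document, k=1)
    print(build_prompt(question, triples, evidence))
```

Swapping `bow_vector` and `cosine` for a real sentence encoder and a nearest-neighbor index would leave the rest of the flow unchanged. The parameter `k` plays the role of the number of retrieved sentences studied in the ablation.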

Paper Structure

This paper contains 18 sections, 4 equations, 3 figures, and 3 tables.

Figures (3)

  • Figure 1: The large language model generates the incorrect answer with the given evidence document, while obtaining the correct answer with evidence sentences selected from the evidence document.
  • Figure 2: The proposed KS-LLM framework consists of three components: (1) triple construction, (2) evidence sentence selection, and (3) answer generation. The triple construction and answer generation steps are implemented by large language models, while the evidence sentence selection step is implemented by the vector database. The dashed line indicates the input of each step and the solid line indicates the output of each step. Given a question and its corresponding evidence document as input, our method can effectively extract valuable knowledge from the evidence document to acquire the correct answer.
  • Figure 3: Impact of parameter $k$ on TriviaQA-verified and WebQ datasets with Vicuna-13B. We report the exact match (EM) score in the table.
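
The exact match (EM) score mentioned in the Figure 3 caption can be sketched as follows. This is a minimal illustration that assumes the common SQuAD-style normalization (lowercasing, removing punctuation and English articles, collapsing whitespace) and strict string equality against any gold answer. The paper's actual evaluation script may differ, and the function names here are hypothetical.

```python
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def em_score(predictions, references):
    """Percentage of predictions that exactly match one of their gold answers.

    `references` holds one list of acceptable gold answers per prediction.
    """
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    preds = ["The Paris.", "London"]
    golds = [["paris"], ["Berlin"]]
    print(em_score(preds, golds))
```

Because generative models often produce full sentences, some evaluations relax strict equality to a containment check on the gold answer. The strict version is shown here only as one plausible reading of "exact match".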