Large Language Models as Source Planner for Personalized Knowledge-grounded Dialogue

Hongru Wang, Minda Hu, Yang Deng, Rui Wang, Fei Mi, Weichao Wang, Yasheng Wang, Wai-Chung Kwan, Irwin King, Kam-Fai Wong

TL;DR

The paper tackles the challenge of open-domain, personalized dialogue by modeling dependencies across multiple knowledge sources. It introduces SAFARI, a planning-retrieval-assembling pipeline that uses LLMs to decide when and which sources to invoke and in what order, while decoupling grounding from response generation. To study source interactions, the authors construct KBP, a dataset capturing persona-knowledge dependencies and grounding labels. Empirical results show that SAFARI, especially in supervised settings, improves persona-consistency and knowledge grounding, with the analysis highlighting the importance of accurate planning and dependency handling. The approach offers a scalable framework for integrating diverse knowledge sources in dialogue systems and points to future work on mitigating error propagation and expanding source coverage.

Abstract

Open-domain dialogue systems usually require different sources of knowledge to generate more informative and evidential responses. However, existing knowledge-grounded dialogue systems either focus on a single knowledge source or overlook the dependency between multiple sources of knowledge, which may result in inconsistent or even paradoxical responses. To incorporate multiple knowledge sources and the dependencies between them, we propose SAFARI, a novel framework that leverages the exceptional capabilities of large language models (LLMs) in planning, understanding, and incorporating knowledge under both supervised and unsupervised settings. Specifically, SAFARI decouples knowledge grounding over multiple sources from response generation, which allows easy extension to various knowledge sources, including the possibility of not using any source. To study the problem, we construct a personalized knowledge-grounded dialogue dataset, Knowledge Behind Persona (KBP), which is the first to consider the dependency between persona and implicit knowledge. Experimental results on the KBP dataset demonstrate that the SAFARI framework can effectively produce persona-consistent and knowledge-enhanced responses.

Paper Structure (34 sections, 7 equations, 2 figures, 12 tables)

Figures (2)

  • Figure 1: (a) An example of the dependency between two sources involved in a persona-consistent dialogue system (PERSONA and DOCUMENTS); (b) our proposed SAFARI framework to plan, retrieve, and incorporate multiple sources of knowledge: PERSONA, DOCUMENTS, and so on. The Planning, Retrieval, and Assembling steps are divided by dashed lines; (c) a sample from the KBP dataset. There are three response situations in our dataset: 1) a response that needs no source (NULL), 2) a response using only the persona description (from the PERSONA source), and 3) a response using both persona and knowledge (from the PERSONA and DOCUMENTS sources). The example here presents the first and third situations. We highlight the response and the knowledge it uses in the same color.
  • Figure 2: The supervised framework of SAFARI for personalized knowledge-grounded dialogues. We use different colors to indicate different steps. The black arrow denotes the flow of data without the involvement of the LLM.
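
As a rough illustration of the plan, retrieve, assemble flow named in the Figure 1 caption, the sketch below uses a keyword rule in place of the LLM planner and word overlap in place of a real retriever. All function names, the toy stores, and the choice to let the DOCUMENTS lookup also match against retrieved persona text are assumptions made for illustration; this is not the authors' implementation.

```python
from typing import Dict, List

# Source names follow the Figure 1 caption; NULL means "use no source".
NULL, PERSONA, DOCUMENTS = "NULL", "PERSONA", "DOCUMENTS"

# Toy knowledge stores (hypothetical data, not from the KBP dataset).
PERSONA_STORE = ["I like hiking in the Alps."]
DOCUMENT_STORE = ["The Alps are a mountain range in Europe."]


def _words(text: str) -> set:
    """Lowercased words longer than three characters, punctuation stripped."""
    cleaned = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in cleaned if len(w) > 3}


def toy_plan(utterance: str) -> List[str]:
    """Hypothetical stand-in for the LLM planner: returns an ordered source list."""
    text = utterance.lower()
    if "where" in text or "what is" in text:
        return [PERSONA, DOCUMENTS]
    if "your" in text:
        return [PERSONA]
    return [NULL]


def retrieve(plan: List[str], utterance: str) -> Dict[str, List[str]]:
    """Visit sources in planned order; later sources may match earlier results."""
    stores = {PERSONA: PERSONA_STORE, DOCUMENTS: DOCUMENT_STORE}
    evidence: Dict[str, List[str]] = {}
    queries = [utterance]
    for source in plan:
        if source == NULL:
            break  # the planner decided that no source is needed
        hits = [item for item in stores[source]
                if any(_words(q) & _words(item) for q in queries)]
        evidence[source] = hits
        queries = queries + hits  # models the dependency between sources
    return evidence


def assemble(utterance: str, evidence: Dict[str, List[str]]) -> str:
    """Build the text that a response generator would be conditioned on."""
    lines = [f"{src}: {item}" for src, items in evidence.items() for item in items]
    lines.append(f"USER: {utterance}")
    return "\n".join(lines)


if __name__ == "__main__":
    for turn in ["Hello!", "Where do you go hiking?"]:
        print(assemble(turn, retrieve(toy_plan(turn), turn)))
        print("---")
```

In this toy run the document about the Alps shares no word with the user's question and is found only through the retrieved persona sentence, which is the kind of source dependency the caption describes. The two turns correspond to the first (NULL) and third (PERSONA plus DOCUMENTS) response situations.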