Large Language Models as Source Planner for Personalized Knowledge-grounded Dialogue
Hongru Wang, Minda Hu, Yang Deng, Rui Wang, Fei Mi, Weichao Wang, Yasheng Wang, Wai-Chung Kwan, Irwin King, Kam-Fai Wong
TL;DR
The paper tackles the challenge of open-domain, personalized dialogue by modeling dependencies across multiple knowledge sources. It introduces SAFARI, a planning-retrieval-assembling pipeline that uses LLMs to decide when and which sources to invoke and in what order, while decoupling grounding from response generation. To study source interactions, the authors construct KBP, a dataset capturing persona-knowledge dependencies and grounding labels. Empirical results show that SAFARI, especially in supervised settings, improves persona-consistency and knowledge grounding, with the analysis highlighting the importance of accurate planning and dependency handling. The approach offers a scalable framework for integrating diverse knowledge sources in dialogue systems and points to future work on mitigating error propagation and expanding source coverage.
Abstract
Open-domain dialogue systems usually require different sources of knowledge to generate more informative and evidential responses. However, existing knowledge-grounded dialogue systems either focus on a single knowledge source or overlook the dependencies between multiple sources of knowledge, which may result in inconsistent or even paradoxical responses. To incorporate multiple knowledge sources and the dependencies between them, we propose SAFARI, a novel framework that leverages the capabilities of large language models (LLMs) in planning, understanding, and incorporating knowledge under both supervised and unsupervised settings. Specifically, SAFARI decouples knowledge grounding over multiple sources from response generation, which allows easy extension to various knowledge sources, including the possibility of not using any source. To study the problem, we construct a personalized knowledge-grounded dialogue dataset, Knowledge Behind Persona (KBP), which is the first to consider the dependency between persona and implicit knowledge. Experimental results on the KBP dataset demonstrate that the SAFARI framework can effectively produce persona-consistent and knowledge-enhanced responses.
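To make the planning, retrieval, and assembling stages concrete, the following is a minimal Python sketch of how such a pipeline could be wired together. It is not the authors' implementation. The function names, the source names "persona" and "knowledge", the rule-based planner (a placeholder for an LLM call), the token-overlap retriever, and the template generator are all assumptions introduced for illustration. Reusing earlier retrieved text as part of the query for later sources is likewise only one possible way to model the dependency between sources.

```python
import re
from typing import Callable, Dict, List, Tuple


def tokens(text: str) -> List[str]:
    """Lowercase alphabetic tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z]+", text.lower())


def toy_planner(context: str) -> List[str]:
    """Placeholder for the LLM planner (assumption): returns an ordered
    list of source names, or an empty list when no source is needed."""
    if {"you", "your"} & set(tokens(context)):
        return ["persona", "knowledge"]  # persona first, then dependent knowledge
    return []


def overlap_retrieve(query: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Stand-in retriever (assumption): rank documents by token overlap."""
    q = set(tokens(query))
    ranked = sorted(documents, key=lambda d: len(q & set(tokens(d))), reverse=True)
    return ranked[:top_k]


def toy_generator(context: str, grounding: List[Tuple[str, str]]) -> str:
    """Placeholder for the LLM that assembles the final response (assumption)."""
    if not grounding:
        return f"(ungrounded reply to: {context})"
    facts = "; ".join(f"{name}: {text}" for name, text in grounding)
    return f"(reply to: {context} | grounded on {facts})"


def run_pipeline(
    context: str,
    sources: Dict[str, List[str]],
    planner: Callable[[str], List[str]] = toy_planner,
    generator: Callable[[str, List[Tuple[str, str]]], str] = toy_generator,
) -> Dict[str, object]:
    plan = planner(context)                    # 1. planning
    grounding: List[Tuple[str, str]] = []
    query = context
    for name in plan:                          # 2. retrieval, in planned order
        retrieved = overlap_retrieve(query, sources[name])
        grounding.extend((name, doc) for doc in retrieved)
        # Later sources see what earlier sources returned (dependency handling).
        query = " ".join([context] + [doc for _, doc in grounding])
    response = generator(context, grounding)   # 3. assembling
    return {"plan": plan, "grounding": grounding, "response": response}


if __name__ == "__main__":
    demo_sources = {
        "persona": ["I like hiking in the mountains", "I have a pet cat"],
        "knowledge": [
            "Hiking in the mountains requires sturdy boots",
            "Cats sleep most of the day",
        ],
    }
    print(run_pipeline("What do you like doing outdoors, hiking?", demo_sources))
    print(run_pipeline("Hello there", demo_sources))
```

Because grounding is collected before the generator is called, a new source can be added in this sketch by registering another entry in the sources dictionary and letting the planner emit its name; an empty plan covers the case where no source is used.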
