Persona-Aware Alignment Framework for Personalized Dialogue Generation
Guanrong Li, Xinyu Liu, Zhen Wu, Xinyu Dai
TL;DR
This paper tackles persona-consistent dialogue generation, where token-level training alone fails to capture user personas. It introduces the Persona-Aware Alignment Framework (PAL), a two-stage training scheme consisting of Persona-aware Learning and Persona Alignment, complemented by a Select-then-Generate inference strategy that improves persona alignment at the semantic level. The model first jointly learns which persona is relevant and how to generate persona-aware responses; it is then optimized directly for alignment with the given personas using Direct Preference Optimization (DPO) on constructed golden/generated response pairs. Across English and Chinese datasets and multiple foundation models, PAL yields significant gains over state-of-the-art baselines and even outperforms several closed-source LLMs, demonstrating strong generalizability and practical value for personalized dialogue systems.
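As a rough illustration of the Persona Alignment stage, the standard DPO objective for a single preference pair can be sketched as below. This is a minimal sketch and not the authors' implementation: the function name `dpo_loss`, the default `beta=0.1`, and the choice to treat the golden response as preferred and the model-generated response as dispreferred are all assumptions made for illustration.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is a summed sequence log-probability, log p(y | x).
    Assumption for illustration: "chosen" is the golden response and
    "rejected" is a model-generated response for the same context.
    """
    # How far the policy has moved from the frozen reference on each response.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)

    # -log(sigmoid(logits)) == softplus(-logits), computed stably.
    z = -logits
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))
```

When the policy and the reference model assign identical log-probabilities, the loss equals log 2. It falls below that value as the policy raises the likelihood of the chosen response relative to the rejected one.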
Abstract
Personalized dialogue generation aims to leverage persona profiles and dialogue history to generate persona-relevant and consistent responses. Mainstream models typically rely on token-level language model training on persona dialogue data, such as Next Token Prediction, to achieve personalization implicitly. As a result, these methods tend to neglect the given personas and generate generic responses. To address this issue, we propose a novel Persona-Aware Alignment Framework (PAL), which directly treats persona alignment as the training objective of dialogue generation. Specifically, PAL employs a two-stage training method comprising Persona-aware Learning and Persona Alignment, together with an easy-to-use inference strategy, Select-then-Generate, to improve persona sensitivity and generate more persona-relevant responses at the semantic level. Through extensive experiments, we demonstrate that our framework outperforms many state-of-the-art personalized dialogue methods and large language models.
