Beyond Whole Dialogue Modeling: Contextual Disentanglement for Conversational Recommendation
Guojia An, Jie Zou, Jiwei Wei, Chaoning Zhang, Fuming Sun, Yang Yang
TL;DR
This work tackles the challenge of effectively modeling complex dialogue context in conversational recommender systems by disentangling focus information from background information in an unsupervised manner. It introduces DisenCRS, a dual contextual disentanglement framework that combines contrastive and counterfactual inference strategies with an adaptive prompt learning module to dynamically select prompts based on dialogue context. The approach yields state-of-the-art performance on both item recommendation and response generation across two public datasets (ReDial and INSPIRED), demonstrating that explicit disentanglement reduces noise and helps align recommendations and responses with user intent. The combination of knowledge graph and semantic representations, along with prompt-based downstream adaptation, offers practical benefits for real-world CRS deployments and motivates further exploration of large language models for enhanced disentanglement in dialogue contexts.
Abstract
Conversational recommender systems aim to provide personalized recommendations by analyzing and utilizing contextual information related to dialogue. However, existing methods typically model the dialogue context as a whole, neglecting the inherent complexity and entanglement within the dialogue. Specifically, a dialogue comprises both focus information and background information, which mutually influence each other. Current methods tend to model these two types of information together without separating them, which leads to misinterpretation of users' actual needs and lowers recommendation accuracy. To address this issue, this paper proposes DisenCRS, a novel model that introduces contextual disentanglement to improve conversational recommender systems. DisenCRS employs a dual disentanglement framework, comprising self-supervised contrastive disentanglement and counterfactual inference disentanglement, to effectively distinguish focus information from background information in the dialogue context in an unsupervised manner. Moreover, we design an adaptive prompt learning module that automatically selects the most suitable prompt for the specific dialogue context, fully leveraging the power of large language models. Experimental results on two widely used public datasets demonstrate that DisenCRS significantly outperforms existing conversational recommendation models on both item recommendation and response generation tasks.
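To make the idea of context-dependent prompt selection concrete, the following is a minimal sketch and not the DisenCRS implementation. It assumes that the dialogue context is already encoded as a vector and that each candidate prompt has an associated key vector; the names `select_prompt`, `prompt_keys`, and `temperature` are hypothetical and do not come from the paper. The sketch scores each prompt by cosine similarity to the context, turns the scores into weights with a softmax, and picks the highest-weighted prompt.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_prompt(context_vec, prompt_keys, temperature=0.1):
    """Return (index of best prompt, selection weights).

    Assumption: one key vector per candidate prompt, in the same space
    as the context vector. This is an illustrative scoring rule, not
    the selection mechanism described in the paper.
    """
    sims = [cosine(context_vec, key) for key in prompt_keys]
    weights = softmax([s / temperature for s in sims])
    best = max(range(len(weights)), key=weights.__getitem__)
    return best, weights


# Toy example: a context vector that lies closest to the first prompt key.
context = [0.9, 0.1, 0.0]
prompt_keys = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
best, weights = select_prompt(context, prompt_keys)
```

A lower `temperature` makes the selection closer to a hard argmax, while a higher value spreads weight across prompts; in a trainable setting the weights could instead be used to mix prompts softly, which is again an assumption rather than a detail stated in the abstract.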
