Diagnostic-Guided Dynamic Profile Optimization for LLM-based User Simulators in Sequential Recommendation
Hongyang Liu, Zhu Sun, Tianjun Wei, Yan Wang, Jiajie Zhu, Xinghua Qu
TL;DR
This work tackles the fidelity gap in LLM-based user simulators for sequential recommender systems by introducing Diagnostic-Guided Dynamic Profile Optimization (DGDPO). DGDPO uses a specialized diagnostic module, trained via domain-adaptive pre-training and defect-specific fine-tuning, to detect profile defects, and a generalized treatment module to generate targeted refinements, iteratively traversing user history to dynamically optimize profiles. By integrating with sequential recommenders, DGDPO enables realistic multi-round interactions where user profiles and recommendation strategies co-evolve. Experiments on three real-world datasets show substantial gains in simulation fidelity, with ablation and parameter studies underscoring the importance of iterative optimization, specialized diagnostics, and carefully tuned training settings for the diagnostic model.
Abstract
Recent advances in large language models (LLMs) have enabled realistic user simulators for developing and evaluating recommender systems (RSs). However, existing LLM-based simulators for RSs face two major limitations: (1) static, single-step prompt-based inference that leads to inaccurate and incomplete user profiles; and (2) an unrealistic, single-round recommendation-feedback interaction pattern that fails to capture real-world scenarios. To address these limitations, we propose DGDPO (Diagnostic-Guided Dynamic Profile Optimization), a novel framework that constructs user profiles through a dynamic and iterative optimization process to enhance simulation fidelity. Specifically, DGDPO incorporates two core modules within each optimization loop. First, a specialized LLM-based diagnostic module, calibrated through our novel training strategy, accurately identifies specific defects in the user profile. Then, a generalized LLM-based treatment module analyzes the diagnosed defect and generates targeted suggestions to refine the profile. Furthermore, unlike existing LLM-based user simulators that are limited to single-round interactions, we are the first to integrate DGDPO with sequential recommenders, enabling a bidirectional evolution in which user profiles and recommendation strategies adapt to each other over multi-round interactions. Extensive experiments conducted on three real-world datasets demonstrate the effectiveness of our proposed framework.
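The diagnose-then-treat loop described above can be illustrated with a minimal Python sketch of the control flow only. Everything named here is an assumption made for illustration: the function `optimize_profile`, the `window` and `max_rounds` parameters, the string-based profile, and the toy `toy_diagnose` and `toy_treat` stand-ins are hypothetical and are not taken from the paper, where both roles are played by LLMs.

```python
from typing import Callable, Optional, Sequence

# Hypothetical signatures: in the paper both roles are LLM-based modules;
# here they are plain callables so the control flow can be run and tested.
Diagnose = Callable[[str, Sequence[str]], Optional[str]]
Treat = Callable[[str, str, Sequence[str]], str]


def optimize_profile(
    history: Sequence[str],
    diagnose: Diagnose,
    treat: Treat,
    initial_profile: str = "",
    window: int = 2,      # assumed: number of history items examined per step
    max_rounds: int = 3,  # assumed: cap on refinements per window
) -> str:
    """Traverse the history in windows, refining the profile while defects are reported."""
    profile = initial_profile
    for start in range(0, len(history), window):
        chunk = history[start:start + window]
        for _ in range(max_rounds):
            defect = diagnose(profile, chunk)
            if defect is None:
                break  # no defect found for this window; move on
            profile = treat(profile, defect, chunk)
    return profile


# Toy stand-ins (assumptions, not the paper's models).
def toy_diagnose(profile: str, chunk: Sequence[str]) -> Optional[str]:
    # Report the first interacted item that the profile does not mention.
    missing = [item for item in chunk if item not in profile]
    return f"incomplete:{missing[0]}" if missing else None


def toy_treat(profile: str, defect: str, chunk: Sequence[str]) -> str:
    # "Refine" the profile by appending the item named in the defect label.
    item = defect.split(":", 1)[1]
    return (profile + " " + item).strip()


history = ["sci-fi", "thriller", "sci-fi", "documentary"]
final_profile = optimize_profile(history, toy_diagnose, toy_treat)
print(final_profile)
```

Passing the two modules as callables keeps the loop independent of any particular model, so a real diagnostic or treatment model could be swapped in without changing the traversal logic.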
