Prompt Tuning as User Inherent Profile Inference Machine
Yusheng Lu, Zhaocheng Du, Xiangyang Li, Pengyue Jia, Yejing Wang, Weiwen Liu, Yichao Wang, Huifeng Guo, Ruiming Tang, Zhenhua Dong, Yongrui Duan, Xiangyu Zhao
TL;DR
The paper tackles the problem of inferring latent user profiles for recommender systems using large language models, addressing twisted causality, textual noise, and modality gaps. It introduces UserIP-Tuning, which uses soft prompts and EM-guided latent-profile inference, followed by a quantization module that maps embeddings to lightweight collaborative IDs stored in a feature bank. The approach demonstrates superior performance over strong baselines, transfers across models, and delivers practical benefits in industrial-scale deployments, including online A/B validation. The work also emphasizes explainability of inferred profiles and shows robust improvements in both accuracy and efficiency, making it suitable for real-world recommender systems.
Abstract
Large Language Models (LLMs) have exhibited significant promise in recommender systems by enriching user profiles with their extensive world knowledge and superior reasoning capabilities. However, LLMs face challenges such as unstable instruction compliance, modality gaps, and high inference latency, which introduce textual noise and limit their effectiveness in recommender systems. To address these challenges, we propose UserIP-Tuning, which uses prompt tuning to infer user profiles. It integrates the causal relationship between user profiles and behavior sequences into LLMs' prompts and employs Expectation Maximization (EM) to infer the embedded latent profile, minimizing textual noise by fixing the prompt template. Furthermore, a profile quantization codebook bridges the modality gap by categorizing profile embeddings into collaborative IDs that are pre-stored for online deployment, which improves time efficiency and reduces memory usage. Experiments show that UserIP-Tuning outperforms state-of-the-art recommendation algorithms. An industry application confirms its effectiveness, robustness, and transferability. The presented solution has been deployed on Huawei AppGallery's Explore page since May 2025, where it serves 2 million daily active users and delivers significant improvements in real-world recommendation scenarios. The code is publicly available for replication at https://github.com/Applied-Machine-Learning-Lab/UserIP-Tuning.
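To make the quantization step concrete, the following is a minimal sketch of how profile embeddings might be mapped to collaborative IDs and pre-stored in a feature bank. It assumes a fixed codebook and nearest-neighbor (Euclidean) assignment; the abstract does not specify how the codebook is learned or how assignment is performed, and the names `quantize`, `build_feature_bank`, and the toy data are hypothetical, not taken from the released code.

```python
from math import dist

def quantize(embedding, codebook):
    """Return the index of the codebook vector closest to `embedding`.

    Assumption: Euclidean nearest-neighbor assignment; the paper's
    actual quantization rule may differ.
    """
    return min(range(len(codebook)), key=lambda k: dist(embedding, codebook[k]))

def build_feature_bank(profile_embeddings, codebook):
    """Map each user id to a collaborative ID, computed once offline."""
    return {user: quantize(vec, codebook) for user, vec in profile_embeddings.items()}

# Hypothetical toy data: 2-D profile embeddings and a 3-entry codebook.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 2.0)]
profile_embeddings = {
    "user_a": (0.1, -0.2),
    "user_b": (0.9, 1.2),
    "user_c": (-0.8, 1.7),
}

# Offline step: store one small integer per user instead of a dense vector.
feature_bank = build_feature_bank(profile_embeddings, codebook)

# Online step: a dictionary lookup replaces any LLM call at serving time.
collaborative_id = feature_bank["user_b"]
```

Under these assumptions, the serving path only reads an integer ID from the feature bank, which is the source of the latency and memory savings the abstract describes.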
