Reinforced Prompt Personalization for Recommendation with Large Language Models
Wenyu Mao, Jiancan Wu, Weijian Chen, Chongming Gao, Xiang Wang, Xiangnan He
TL;DR
The paper addresses the limitations of task-wise prompts in LLM-based recommendation by introducing Reinforced Prompt Personalization (RPP) and its enhanced version, RPP+. RPP frames prompt personalization as a multi-agent reinforcement learning (MARL) problem under Centralized Training with Decentralized Execution (CTDE): it personalizes four sentence-level prompt patterns (role-playing, history records, reasoning guidance, and output format) for each user and concatenates them to guide a frozen LLM recommender. RPP+ adds a step that dynamically refines the selected actions with an LLM. The prompts are optimized to maximize a ranking reward such as $r_t = \mathrm{NDCG@M}$ with $M=10$. The method reports improvements over traditional recommender models, few-shot methods, and other prompt-based methods on MovieLens-1M, Games, and Lastfm, and generalizes across LLaMa2-7B-chat, ChatGPT, and Alpaca. Ablations, sensitivity analyses, case studies, and timing analyses support the effectiveness and practicality of instance-wise prompting, which tailors prompts to diverse users while keeping computational cost manageable. Overall, the work advances prompt engineering by decomposing prompts into meaningful patterns and optimizing them via MARL to produce personalized recommendations.
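To make the reward $r_t = \mathrm{NDCG@M}$ concrete, the sketch below computes NDCG@M for a single ranked list. It is an illustration only: it assumes binary relevance and a ranked list without duplicate items, and the function name `ndcg_at_m` and its arguments are hypothetical rather than taken from the authors' code.

```python
import math
from typing import Sequence, Set


def ndcg_at_m(ranked_items: Sequence[str], relevant: Set[str], m: int = 10) -> float:
    """Binary-relevance NDCG@M for one ranked list (assumes no duplicate items)."""
    # DCG: each relevant item in the top-m contributes 1 / log2(rank + 1),
    # where rank is 1-based (pos is 0-based, hence pos + 2).
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked_items[:m])
        if item in relevant
    )
    # Ideal DCG: all relevant items (up to m) placed at the top of the list.
    ideal_hits = min(len(relevant), m)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0
```

With one relevant item, the reward is 1 when that item is ranked first, decays as it moves down the list, and is 0 when it falls outside the top $M$.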
Abstract
Designing effective prompts can empower LLMs to understand user preferences and provide recommendations with intent comprehension and knowledge utilization capabilities. Nevertheless, recent studies predominantly concentrate on task-wise prompting, developing fixed prompt templates shared across all users in a given recommendation task (e.g., rating or ranking). Although convenient, task-wise prompting overlooks individual user differences, leading to inaccurate analysis of user interests. In this work, we introduce the concept of instance-wise prompting, which aims to personalize discrete prompts for individual users. To this end, we propose Reinforced Prompt Personalization (RPP) to realize it automatically. To improve efficiency and quality, RPP personalizes prompts at the sentence level rather than searching the vast vocabulary word by word. Specifically, RPP breaks the prompt down into four patterns, tailors each pattern with multi-agent reinforcement learning, and combines them. The personalized prompts then interact with the LLM (the environment) iteratively to boost its recommendation performance (the reward). In addition to RPP, to improve the scalability of the action space, we propose RPP+, which dynamically refines the selected actions with LLMs throughout the iterative process. Extensive experiments on various datasets demonstrate the superiority of RPP/RPP+ over traditional recommender models, few-shot methods, and other prompt-based methods, underscoring the significance of instance-wise prompting in LLMs for recommendation. Our code is available at https://github.com/maowenyu-11/RPP.
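The sentence-level composition described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the four pattern names follow the TL;DR (role-playing, history records, reasoning guidance, output format), but the candidate sentences in `ACTION_SETS`, the `compose_prompt` helper, and the `{history}` placeholder are all hypothetical. In RPP the per-pattern choices would come from trained agents; here they are passed in as plain indices.

```python
from typing import Dict, List, Mapping

# The four prompt patterns, in the order they are concatenated.
PATTERNS = ("role_playing", "history_records", "reasoning_guidance", "output_format")

# Hypothetical per-pattern action sets: each action is one candidate sentence.
ACTION_SETS: Dict[str, List[str]] = {
    "role_playing": [
        "You are a movie expert.",
        "You are a recommender system.",
    ],
    "history_records": [
        "The user recently watched: {history}.",
        "The user's full watch history is: {history}.",
    ],
    "reasoning_guidance": [
        "Rank the candidates directly.",
        "First infer the user's preferences, then rank the candidates.",
    ],
    "output_format": [
        "Return the ranked titles as a numbered list.",
        "Return the ranked titles, one per line.",
    ],
}


def compose_prompt(choices: Mapping[str, int], history: str) -> str:
    """Pick one sentence per pattern and concatenate them into a single prompt."""
    sentences = [
        ACTION_SETS[pattern][choices[pattern]].format(history=history)
        for pattern in PATTERNS
    ]
    return " ".join(sentences)
```

Under these assumptions, two users with different choices receive different prompts from the same small action sets, which is the sense in which the search is sentence-level rather than word-by-word.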
