ChainRec: An Agentic Recommender Learning to Route Tool Chains for Diverse and Evolving Interests
Fuchun Li, Qian Li, Xingyu Gao, Bocheng Pan, Yang Wu, Jun Zhang, Huan Yu, Jie Jiang, Jinsheng Xiao, Hailong Shi
TL;DR
ChainRec addresses the brittleness of fixed recommendation pipelines by introducing an agentic recommender that dynamically routes among a standardized library of evidence-gathering tools. A two-stage learning process (supervised fine-tuning (SFT) to train the planner, followed by Direct Preference Optimization (DPO) to align tool routing) enables instance-specific, budget-bounded evidence collection. Constructing tools from expert chain-of-thought (CoT) traces yields a modular Tool Agent Library that decouples capability from policy, enabling robust planning across domains. Experiments on AgentRecBench across Amazon, Goodreads, and Yelp show consistent Avg HR@1/3/5 improvements, especially in cold-start and evolving-interest scenarios, demonstrating strong adaptability and practical impact for interactive recommendations.
Abstract
Large language models (LLMs) are increasingly integrated into recommender systems, motivating recent interest in agentic and reasoning-based recommendation. However, most existing approaches still rely on fixed workflows, applying the same reasoning procedure across diverse recommendation scenarios. In practice, user contexts vary substantially, for example in cold-start settings or during interest shifts, so an agent should adaptively decide what evidence to gather next rather than follow a scripted process. To address this, we propose ChainRec, an agentic recommender whose planner dynamically selects reasoning tools. ChainRec first builds a standardized Tool Agent Library from expert trajectories. It then trains the planner with supervised fine-tuning and preference optimization to choose which tools to call, decide their order, and determine when to stop. Experiments on AgentRecBench across Amazon, Yelp, and Goodreads show that ChainRec consistently improves Avg HR@{1,3,5} over strong baselines, with especially notable gains in cold-start and evolving-interest scenarios. Ablation studies further validate the importance of tool standardization and preference-optimized planning.
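As a rough illustration of what "select tools, decide their order, and determine when to stop" under a budget could look like, the following minimal Python sketch runs a planner-driven routing loop. It is not the authors' implementation: the names `State`, `route_tool_chain`, `toy_planner`, and `toy_tools`, the `STOP` token, and the hand-written planner heuristic are all hypothetical stand-ins, and the learned planner (SFT plus preference optimization) is replaced here by a fixed rule.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

STOP = "STOP"  # hypothetical token the planner emits to end evidence gathering


@dataclass
class State:
    """Evidence collected so far for one recommendation request (hypothetical)."""
    user_id: str
    evidence: List[str] = field(default_factory=list)
    used_tools: List[str] = field(default_factory=list)


Tool = Callable[[State], str]                 # a tool returns a piece of evidence
Planner = Callable[[State, List[str]], str]   # a planner returns a tool name or STOP


def route_tool_chain(state: State, tools: Dict[str, Tool],
                     planner: Planner, budget: int) -> State:
    """Call tools chosen by the planner until it stops or the budget runs out."""
    for _ in range(budget):
        choice = planner(state, sorted(tools))
        if choice == STOP or choice not in tools:
            break  # the planner decided to stop, or named an unknown tool
        state.evidence.append(tools[choice](state))
        state.used_tools.append(choice)
    return state


def toy_planner(state: State, available: List[str]) -> str:
    """Stand-in for a learned planner: popularity first, then history, then stop."""
    for name in ("popularity", "history"):
        if name in available and name not in state.used_tools:
            return name
    return STOP


# Toy tools standing in for entries of a standardized tool library.
toy_tools: Dict[str, Tool] = {
    "history": lambda s: f"history summary for {s.user_id}",
    "popularity": lambda s: "popular items in the candidate set",
    "reviews": lambda s: "digest of candidate reviews",
}

result = route_tool_chain(State("u1"), toy_tools, toy_planner, budget=5)
print(result.used_tools)
```

In this sketch the tool library and the routing policy are separate objects, so swapping `toy_planner` for a trained model would not require changing any tool; the `budget` argument caps the number of tool calls regardless of what the planner requests.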
