Agentic Feedback Loop Modeling Improves Recommendation and User Simulation

Shihao Cai, Jizhi Zhang, Keqin Bao, Chongming Gao, Qifan Wang, Fuli Feng, Xiangnan He

TL;DR

Prior work optimizes either the recommender or the user simulator in isolation. Agentic Feedback Loop (AFL) addresses this gap by coupling two memory-enabled agents through a text-based feedback loop. A memory-augmented Recommendation Agent and a memory-augmented User Agent exchange items, recommendation reasons, and user feedback, iteratively improving both item recommendation and user simulation. Across LastFM, Steam, and MovieLens, AFL yields average gains of 11.52% over a single recommendation agent and 21.12% over a single user agent, without exacerbating the popularity and position biases that real-world feedback loops typically amplify. The work offers a simple, generic protocol for cooperative multi-agent recommendation and user simulation, with open-source code, enabling practical integration and further research into mutually beneficial agent collaboration.

Abstract

Large language model-based agents are increasingly applied in the recommendation field due to their extensive knowledge and strong planning capabilities. While prior research has primarily focused on enhancing either the recommendation agent or the user agent individually, the collaborative interaction between the two has often been overlooked. To address this research gap, we propose a novel framework that emphasizes the feedback loop process to facilitate the collaboration between the recommendation agent and the user agent. Specifically, the recommendation agent refines its understanding of user preferences by analyzing the feedback from the user agent on the item recommendation. Conversely, the user agent further identifies potential user interests based on the items and recommendation reasons provided by the recommendation agent. This iterative process enhances the ability of both agents to infer user behaviors, enabling more effective item recommendations and more accurate user simulations. Extensive experiments on three datasets demonstrate the effectiveness of the agentic feedback loop: it yields an average improvement of 11.52% over the single recommendation agent and 21.12% over the single user agent. Furthermore, the results show that the agentic feedback loop does not exacerbate popularity or position bias, which are typically amplified by the real-world feedback loop, highlighting its robustness. The source code is available at https://github.com/Lanyu0303/AFL.
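
As a rough illustration of the loop described above, the following Python sketch wires two toy agents together. It is not the authors' implementation: the class names, the tag-matching rule that stands in for LLM reasoning, the `max_rounds` cap, and the sample items are all assumptions made for this example. It only mirrors the protocol stated in the abstract and in the Figure 2 overview: recommend an item with a reason, stop if the user agent accepts, otherwise store the feedback in memory and try again.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# Hypothetical item metadata; real agents would reason over item text with an LLM.
ITEM_TAGS: Dict[str, Set[str]] = {
    "H1": {"jazz"}, "A": {"rock"}, "B": {"jazz"}, "C": {"pop"},
}


@dataclass
class RecAgent:
    """Toy recommendation agent whose memory holds (item, feedback) pairs."""
    memory: List[Tuple[str, str]] = field(default_factory=list)

    def recommend(self, history: List[str], candidates: List[str]) -> Tuple[str, str]:
        # Assumed policy: pick the first candidate not yet rejected in memory.
        rejected = {item for item, _ in self.memory}
        for item in candidates:
            if item not in rejected:
                return item, f"{item} may fit a user who interacted with {history}"
        return candidates[-1], "every candidate was rejected; repeating the last one"


@dataclass
class UserAgent:
    """Toy user agent whose memory holds (item, reason) pairs it has seen."""
    liked_tags: Set[str]
    memory: List[Tuple[str, str]] = field(default_factory=list)

    @classmethod
    def from_history(cls, history: List[str]) -> "UserAgent":
        # The user-item history initializes the simulated preferences.
        liked: Set[str] = set()
        for item in history:
            liked |= ITEM_TAGS[item]
        return cls(liked_tags=liked)

    def evaluate(self, item: str, reason: str) -> Tuple[bool, str]:
        self.memory.append((item, reason))
        if ITEM_TAGS[item] & self.liked_tags:
            return True, f"{item} matches my interests"
        return False, f"{item} does not match my interests"


def feedback_loop(rec: RecAgent, user: UserAgent, history: List[str],
                  candidates: List[str], max_rounds: int = 4) -> Tuple[str, int]:
    """Run the recommend/feedback exchange; return (final item, rounds used)."""
    item = candidates[0]
    for round_no in range(1, max_rounds + 1):
        item, reason = rec.recommend(history, candidates)
        accepted, feedback = user.evaluate(item, reason)
        if accepted:
            return item, round_no           # user agent approves: loop ends
        rec.memory.append((item, feedback))  # otherwise feedback refines the next try
    return item, max_rounds


history = ["H1"]
result = feedback_loop(RecAgent(), UserAgent.from_history(history), history, ["A", "B", "C"])
print(result)  # ('B', 2)
```

In this toy run the first suggestion is rejected, the rejection is written to the recommendation agent's memory, and the second suggestion is accepted. The cap on rounds is a design choice of the sketch so that the loop always terminates even when no candidate is ever accepted.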

Paper Structure

This paper contains 27 sections, 6 figures, 7 tables, 1 algorithm.

Figures (6)

  • Figure 1: Comparison of the recommendation agent, the user agent, and AFL. The recommendation agent typically recommends items based on user-item history, whereas the user agent generally simulates user behavior towards these items. AFL develops a recommendation agent and a user agent concurrently, emphasizing the interaction and collaboration between the two agents.
  • Figure 2: Overview of the proposed AFL method. The user-item history initializes the user agent and serves as input for the recommendation agent. The recommendation agent then suggests an item with a reason. If the user agent approves, the process ends. Otherwise, the user agent provides feedback to help refine future recommendations.
  • Figure 3: An example of the interaction between the recommendation agent and the user agent. Given the user-item interaction history, the recommendation agent needs to identify the correct item from the list of candidate items.
  • Figure 4: (a) Recommendation performance as the number of iterations increases. (b) User simulation performance as the number of iterations increases. $1:k$ is set to $1:1$.
  • Figure 5: The distribution of the three popularity categories in the LastFM dataset under different settings.
  • ...and 1 more figure