Customized Retrieval-Augmented Generation with LLM for Debiasing Recommendation Unlearning

Haichao Zhang, Chong Zhang, Peiyu Hu, Shi Qiu, Jia Wang

TL;DR

This work addresses the right-to-be-forgotten challenge in recommender systems by proposing CRAGRU, a Retrieval-Augmented Generation framework that performs user-level unlearning without retraining the backbone. By decoupling retrieval, augmentation, and generation, and by applying three targeted retrieval strategies, CRAGRU precisely isolates forgotten user signals and regenerates recommendations via an LLM guided by filtered data and user context. Experiments on three public datasets show CRAGRU achieves near-retraining utility with significantly reduced unlearning time and mitigates bias propagation to non-target users, outperforming state-of-the-art unlearning baselines. The approach demonstrates the viability of RAG-based architectures for privacy-preserving, scalable recommendations in large-scale systems.

Abstract

Modern recommender systems face a critical challenge in complying with privacy regulations like the 'right to be forgotten': removing a user's data without disrupting recommendations for others. Traditional unlearning methods address this through partial model updates, but they introduce propagation bias, where unlearning one user's data distorts recommendations for behaviorally similar users and degrades system accuracy. While retraining eliminates this bias, it is computationally prohibitive for large-scale systems. To address this challenge, we propose CRAGRU, a novel framework leveraging Retrieval-Augmented Generation (RAG) for efficient, user-specific unlearning that mitigates bias while preserving recommendation quality. CRAGRU decouples unlearning into distinct retrieval and generation stages. In retrieval, we employ three tailored strategies designed to precisely isolate the target user's data influence, minimizing collateral impact on unrelated users and enhancing unlearning efficiency. Subsequently, the generation stage uses an LLM, augmented with user profiles integrated into prompts, to reconstruct accurate and personalized recommendations without retraining the entire base model. Experiments on three public datasets demonstrate that CRAGRU effectively unlearns targeted user data, significantly mitigating unlearning bias by preventing adverse impacts on non-target users, while maintaining recommendation performance comparable to fully trained original models. Our work highlights the promise of RAG-based architectures for building robust and privacy-preserving recommender systems. The source code is available at: https://github.com/zhanghaichao520/LLM_rec_unlearning.

Paper Structure

This paper contains 29 sections, 11 equations, 4 figures, 3 tables, and 1 algorithm.

Figures (4)

  • Figure 1: Traditional methods use a single shared recommendation model for all users, where unlearning one user's data alters global parameters, potentially degrading recommendations for others. In contrast, our method leverages Retrieval-Augmented Generation (RAG) with LLMs to perform efficient and precise user-level unlearning without affecting unrelated users.
  • Figure 2: The framework of CRAGRU. When the user submits an unlearning request, the retriever uses the request to fetch relevant information from the dataset and filters out the information that needs to be forgotten. The remaining information is then used to create an unlearning prompt. Finally, the LLM generates recommendation results based on the user's needs.
  • Figure 3: Comparison of performance on the forget set and the remain set for the ML-1M and Netflix datasets.
  • Figure 4: Comparison of the performance of different retrieval strategies for the CRAGRU model on the ML-1M and Netflix datasets.
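
The flow described in the Figure 2 caption (retrieve, filter out what must be forgotten, build an unlearning prompt) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Interaction` type, the `retrieve_and_filter` and `build_prompt` helpers, the prompt wording, and the representation of forget requests as (user, item) pairs are all assumptions made for illustration, and the paper's three retrieval strategies are not reproduced here. The LLM call itself is omitted.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class Interaction:
    """Hypothetical record of one user-item interaction."""
    user_id: str
    item: str
    rating: int


def retrieve_and_filter(
    dataset: Sequence[Interaction],
    user_id: str,
    forget_set: FrozenSet[Tuple[str, str]],
) -> List[Interaction]:
    """Fetch one user's interactions and drop those marked for forgetting.

    Assumption: forget requests are stored as (user_id, item) pairs.
    """
    return [
        x for x in dataset
        if x.user_id == user_id and (x.user_id, x.item) not in forget_set
    ]


def build_prompt(retained: Sequence[Interaction], profile: str, k: int = 5) -> str:
    """Assemble a prompt from the retained history and a user profile.

    The wording is illustrative only; it is not the paper's prompt template.
    """
    lines = [f"- {x.item} (rated {x.rating}/5)" for x in retained]
    history = "\n".join(lines) if lines else "- (no retained history)"
    return (
        f"User profile: {profile}\n"
        f"Interaction history:\n{history}\n"
        f"Recommend {k} items for this user."
    )


# Toy data: user u1 asks the system to forget the interaction with "Alien".
dataset = [
    Interaction("u1", "Heat", 5),
    Interaction("u1", "Alien", 4),
    Interaction("u2", "Casablanca", 3),
]
forget_set = frozenset({("u1", "Alien")})

retained = retrieve_and_filter(dataset, "u1", forget_set)
prompt = build_prompt(retained, profile="enjoys crime thrillers", k=3)
print(prompt)  # this string would be passed to an LLM of your choice
```

In this sketch the forgotten interaction never reaches the prompt, and the records of other users (here `u2`) are neither read into the prompt nor modified, which mirrors the claim that unlearning one user should not affect unrelated users.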