Agentic Unlearning: When LLM Agent Meets Machine Unlearning

Bin Wang, Fan Wang, Pingping Wang, Jinyu Cong, Yang Yu, Yilong Yin, Zhongyi Han, Benzheng Wei

TL;DR

Synchronized Backflow Unlearning (SBU) is a framework that unlearns jointly across the parameter and memory pathways of an LLM agent, reducing traces of targeted private information in both pathways with limited degradation on retained data.

Abstract

In this paper, we introduce agentic unlearning, which removes specified information from both model parameters and persistent memory in agents with closed-loop interaction. Existing unlearning methods target parameters alone, leaving two critical gaps: (i) parameter-memory backflow, where retrieval reactivates parametric remnants or memory artifacts reintroduce sensitive content, and (ii) the absence of a unified strategy that covers both parameter and memory pathways. We present Synchronized Backflow Unlearning (SBU), a framework that unlearns jointly across parameter and memory pathways. The memory pathway performs dependency closure-based unlearning that prunes isolated entities while logically invalidating shared artifacts. The parameter pathway employs stochastic reference alignment to guide model outputs toward a high-entropy prior. These pathways are integrated via a synchronized dual-update protocol, forming a closed-loop mechanism where memory unlearning and parametric suppression reinforce each other to prevent cross-pathway recontamination. Experiments on medical QA benchmarks show that SBU reduces traces of targeted private information across both pathways with limited degradation on retained data.
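
To make the memory-pathway idea concrete, here is a minimal sketch of one possible reading of "dependency closure-based unlearning". It is not the authors' implementation. The abstract does not specify the data structures, so the dependency map `deps`, the function name `closure_unlearn`, and the rule used to separate "isolated" from "shared" items are all assumptions made for illustration. Under these assumptions, an item derived only from forgotten content is pruned, while an item that also depends on retained content is kept but marked invalid.

```python
def closure_unlearn(deps: dict[str, set[str]], forget: set[str]) -> tuple[set[str], set[str]]:
    """Hypothetical closure-based memory unlearning (illustrative only).

    deps maps each memory item to the items it was derived from
    (an empty set marks a raw record). forget is the set of raw items to erase.
    Returns (pruned, invalidated).
    """
    pruned = set(forget)          # items to delete outright
    invalidated: set[str] = set() # shared items kept but marked unusable
    changed = True
    while changed:                # iterate until the closure stops growing
        changed = False
        for item, parents in deps.items():
            if item in pruned or not parents:
                continue
            if not parents & (pruned | invalidated):
                continue          # item does not depend on forgotten content
            if parents <= pruned:
                # Derived only from pruned items: treat as isolated and prune.
                pruned.add(item)
                invalidated.discard(item)
                changed = True
            elif item not in invalidated:
                # Also depends on retained content: invalidate instead of deleting.
                invalidated.add(item)
                changed = True
    return pruned, invalidated


# Toy memory store: two patient records and artifacts derived from them.
deps = {
    "rec_alice": set(),
    "rec_bob": set(),
    "summary_alice": {"rec_alice"},
    "cohort_note": {"rec_alice", "rec_bob"},
    "followup": {"summary_alice"},
    "digest": {"cohort_note"},
}
pruned, invalidated = closure_unlearn(deps, {"rec_alice"})
print(sorted(pruned), sorted(invalidated))
```

In this toy store, forgetting `rec_alice` prunes everything derived solely from it, while `cohort_note` and its downstream `digest` are only invalidated because they also draw on `rec_bob`, which is retained.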

Paper Structure (16 sections, 4 equations, 6 figures, 6 tables, 1 algorithm)

This paper contains 16 sections, 4 equations, 6 figures, 6 tables, 1 algorithm.

Figures (6)

  • Figure 1: Traditional unlearning (left) targets model parameters ($\theta$) only. Agentic unlearning (right) must address both parameters and memory to prevent backflow recontamination (red arrows). Synchronized bidirectional forgetting is required.
  • Figure 2: Overview of the proposed Synchronized Backflow Unlearning (SBU) framework. The framework adopts a dual-pathway design integrating the Memory Unlearning pathway (retrieval-storage) with the Parameter Unlearning pathway (parameters).
  • Figure 3: Computational efficiency. (a) Runtime vs. forget set size for QF100 and QF1000. (b) GPU memory usage during training. Red diamonds indicate the mean; dashed line marks device capacity.
  • Figure 4: Privacy and memory analysis. (a) Memory embeddings before and after unlearning. (b) Privacy metric $|\Delta(\text{MIA\_score})|$ ($\times 10^{-5}$); lower is better.
  • Figure 5: Visualization of hyperparameter sensitivity, showing the interaction between $\lambda_F$ and $T$.
  • ...and 1 more figure