Don't Break the Boundary: Continual Unlearning for OOD Detection Based on Free Energy Repulsion

Ningkang Peng; Kun Shao; Jingyang Mao; Linjing Qian; Xiaoqian Peng; Xichen Yang; Yanhui Gu

TL;DR

This work addresses boundary-preserving unlearning for OOD detection, where traditional classification-focused unlearning distorts the ID manifold and degrades anomaly discrimination. It reframes forgetting as transforming the target class into an OOD-like state and introduces TFER, a Push-Pull framework that combines a Total Free Energy Repulsion objective (the push) with a pull mechanism that anchors retained prototypes, implemented with LoRA adapters for efficiency. Theoretical analysis argues for gradient stability via convex geometry: updates stay within the convex hull of retained gradients, which avoids abrupt manifold collapse. Empirical results on CIFAR-100 show that TFER achieves strong forgetting efficacy, high utility preservation, and robust OOD generalization, along with substantial efficiency gains and a scalable continual unlearning strategy based on modular orthogonality. Overall, the approach offers a practical path to privacy-compliant, correctable open-world systems that maintain reliable OOD detection.

Abstract

Deploying trustworthy AI in open-world environments faces a dual challenge: the necessity for robust Out-of-Distribution (OOD) detection to ensure system safety, and the demand for flexible machine unlearning to satisfy privacy compliance and model rectification. However, this objective encounters a fundamental geometric contradiction: current OOD detectors rely on a static and compact data manifold, whereas traditional classification-oriented unlearning methods disrupt this delicate structure, leading to a catastrophic loss of the model's capability to discriminate anomalies while erasing target classes. To resolve this dilemma, we first define the problem of boundary-preserving class unlearning and propose a pivotal conceptual shift: in the context of OOD detection, effective unlearning is mathematically equivalent to transforming the target class into OOD samples. Based on this, we propose the TFER (Total Free Energy Repulsion) framework. Inspired by the free energy principle, TFER constructs a novel Push-Pull game mechanism: it anchors retained classes within a low-energy ID manifold through a pull mechanism, while actively expelling forgotten classes to high-energy OOD regions using a free energy repulsion force. This approach is implemented via parameter-efficient fine-tuning, circumventing the prohibitive cost of full retraining. Extensive experiments demonstrate that TFER achieves precise unlearning while maximally preserving the model's discriminative performance on remaining classes and external OOD data. More importantly, our study reveals that the unique Push-Pull equilibrium of TFER endows the model with inherent structural stability, allowing it to effectively resist catastrophic forgetting without complex additional constraints, thereby demonstrating exceptional potential in continual unlearning tasks.
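
The push and pull forces described above can be illustrated with a small numerical sketch. It is not the paper's TFER objective, whose exact form is given in the paper's equations and is not reproduced on this page. It assumes the standard energy score E(z) = -T log sum_k exp(s_k / T) over cosine similarities s_k to the retained class prototypes, and a cross-entropy pull term. The function names, the temperature value, and the toy prototypes are all hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Project vectors onto the unit hypersphere."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def prototype_logits(z, prototypes, temperature):
    """Temperature-scaled cosine similarity of each embedding to each prototype."""
    return l2_normalize(z) @ l2_normalize(prototypes).T / temperature

def log_sum_exp(logits):
    """Numerically stable log-sum-exp over the prototype axis."""
    m = logits.max(axis=1, keepdims=True)
    return m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))

def free_energy(z, prototypes, temperature=0.1):
    """Assumed energy score: low near retained prototypes, high far from all of them."""
    return -temperature * log_sum_exp(prototype_logits(z, prototypes, temperature))

def push_loss(z_forget, prototypes, temperature=0.1):
    """Minimizing this raises the free energy of forget samples (repulsion)."""
    return -free_energy(z_forget, prototypes, temperature).mean()

def pull_loss(z_retain, labels, prototypes, temperature=0.1):
    """Cross-entropy over prototype similarities: anchors retain samples to their class."""
    logits = prototype_logits(z_retain, prototypes, temperature)
    log_probs = logits - log_sum_exp(logits)[:, None]
    return -log_probs[np.arange(len(labels)), labels].mean()

# Toy setup (hypothetical): three retained prototypes along the coordinate axes.
prototypes = np.eye(3)
z_near = np.array([[1.0, 0.0, 0.0]])     # sits on prototype 0 (ID-like)
z_far = np.array([[-1.0, -1.0, -1.0]])   # far from every prototype (OOD-like)

energy_near = free_energy(z_near, prototypes)[0]
energy_far = free_energy(z_far, prototypes)[0]
pull_correct = pull_loss(z_near, np.array([0]), prototypes)
pull_wrong = pull_loss(z_near, np.array([1]), prototypes)
```

In this toy setup the embedding far from all prototypes receives a higher energy than the one sitting on a prototype, so a detector that thresholds energy would treat it as OOD. A combined objective could weight the two terms, for example `pull + lam * push` with a hypothetical forget coefficient `lam`.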

Paper Structure (31 sections, 10 equations, 3 figures, 6 tables)

Introduction
Related Work
OOD Detection
Machine Unlearning
Methodology
Preliminaries: Prototype-based OOD Detection
The Push Mechanism: Energy Barrier Construction
Theoretical Analysis: Gradient Stability via Convex Geometry
Boundedness via Convex Hull Constraint.
Orthogonality and Entropy Maximization.
The Pull Mechanism: Manifold Anchoring
Optimization Strategy: Rank-Constrained Adaptation
Experiments
Experimental Setup
Datasets.
...and 16 more sections

Figures (3)

Figure 1: Visualizing Boundary-Preserving Unlearning: UMAP Projection of the Feature Manifold. This figure compares the structural changes in the feature space after class unlearning, showcasing the Retained ID classes (Blue), External OOD samples (Purple), and the Target Forgotten class (Red).
Figure 2: The framework decomposes the class unlearning problem into an adversarial process of Push Force and Pull Force in a high-dimensional hyperspherical embedding space, implemented via parameter-efficient LoRA modules. The Push Force ($\mathcal{L}_{\text{unlearn}}$), represented by the TFER loss, pushes Forget Samples away from all Retain Prototypes. The Pull Force ($\mathcal{L}_{\text{protect}}$), the protection loss, pulls Retain Samples towards their corresponding prototypes and optimizes the prototype structure via the prototype contrastive loss ($\mathcal{L}_{\text{proto-contra}}$) to ensure clear separation of retain classes.
Figure 3: Sensitivity analysis of TFER performance across different hyperparameters. (a) Impact of training epochs. (b) Impact of forget coefficient.
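
The Figure 2 caption states that both forces are implemented via parameter-efficient LoRA modules. As a rough illustration of what that means, the sketch below shows a generic low-rank adapter on a frozen linear layer in NumPy. It follows the usual LoRA recipe (frozen weight W plus a scaled product B A, with B initialized to zero) and is not taken from the paper. The class name, rank, scaling factor, and layer sizes are hypothetical.

```python
import numpy as np

class LoRALinear:
    """Frozen linear layer with a trainable low-rank update: W + (alpha / r) * B @ A."""

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                               # frozen pretrained weight
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))  # trainable, small random init
        self.B = np.zeros((out_dim, rank))                 # trainable, zero init
        self.scale = alpha / rank

    def delta(self):
        """Low-rank weight update contributed by the adapter."""
        return self.scale * (self.B @ self.A)

    def num_trainable(self):
        """Only A and B would be updated during unlearning."""
        return self.A.size + self.B.size

    def __call__(self, x):
        return x @ (self.weight + self.delta()).T

rng = np.random.default_rng(1)
W = rng.normal(size=(6, 8))          # hypothetical 8 -> 6 layer
layer = LoRALinear(W, rank=2)
x = rng.normal(size=(3, 8))

out_before = layer(x)                # B is zero, so the adapter is a no-op at init
layer.B = rng.normal(size=(6, 2))    # stand-in for a trained adapter
out_after = layer(x)
update_rank = np.linalg.matrix_rank(layer.delta())
```

Because B starts at zero, the adapted layer initially reproduces the frozen model exactly, and any later change is confined to a rank-2 update with 28 trainable values instead of the 48 in the full weight matrix. Keeping one adapter per unlearning request is one way such modules could be kept separate in a continual setting, though the paper's modular orthogonality scheme is not described on this page.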
