Distribution-Level Feature Distancing for Machine Unlearning: Towards a Better Trade-off Between Model Utility and Forgetting

Dasol Choi, Dongbin Na

TL;DR

This work tackles privacy-preserving machine unlearning and identifies correlation collapse, in which forgetting data degrades task-related feature correlations, as a key issue. It introduces Distribution-Level Feature Distancing (DLFD), a three-component framework that uses optimal transport to push the retain and forget data distributions apart in feature space, preserves task-relevant information via a dynamic classification loss, and adapts the forgetting process with a dynamic strategy. Across facial recognition benchmarks, DLFD achieves stronger forgetting performance than state-of-the-art methods while maintaining competitive task accuracy, and ablation analyses support the contribution of each component. The approach offers a practical pathway to meeting right-to-be-forgotten requirements in real-world vision systems with minimal utility loss and reduced leakage risk.

Abstract

With the explosive growth of deep learning applications and increasing privacy concerns, the right to be forgotten has become a critical requirement in various AI industries. For example, given a facial recognition system, some individuals may wish to remove their personal data that might have been used in the training phase. Unfortunately, deep neural networks sometimes unexpectedly leak personal identities, making this removal challenging. While recent machine unlearning algorithms aim to enable models to forget specific data, we identify an unintended utility drop, termed correlation collapse, in which the essential correlations between image features and true labels weaken during the forgetting process. To address this challenge, we propose Distribution-Level Feature Distancing (DLFD), a novel method that efficiently forgets instances while preserving task-relevant feature correlations. Our method synthesizes data samples by optimizing the feature distribution to be distinctly different from that of forget samples, achieving effective results within a single training epoch. Through extensive experiments on facial recognition datasets, we demonstrate that our approach significantly outperforms state-of-the-art machine unlearning methods in both forgetting performance and model utility preservation.
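
To make the idea of a distribution-level distance in feature space concrete, here is a minimal sketch of an entropic-regularized (Sinkhorn) optimal transport cost between two batches of feature vectors. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the exact transport formulation, and the function name `sinkhorn_cost`, the squared Euclidean ground cost, the uniform sample weights, and the values of `eps` and `n_iters` are all hypothetical choices.

```python
import numpy as np


def sinkhorn_cost(x, y, eps=0.1, n_iters=200):
    """Approximate OT cost between two feature batches with uniform weights.

    x: (n, d) array, e.g. features of retain samples (assumed input format).
    y: (m, d) array, e.g. features of forget samples (assumed input format).
    """
    # Pairwise squared Euclidean ground cost, shape (n, m).
    diff = x[:, None, :] - y[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)

    # Rescale the cost to [0, 1] so that exp(-cost / eps) does not underflow.
    scale = cost.max()
    cost_n = cost / scale if scale > 0 else cost

    a = np.full(len(x), 1.0 / len(x))  # uniform mass on the first batch
    b = np.full(len(y), 1.0 / len(y))  # uniform mass on the second batch
    kernel = np.exp(-cost_n / eps)

    # Sinkhorn iterations: alternately rescale to match both marginals.
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)

    plan = u[:, None] * kernel * v[None, :]  # approximate transport plan
    return float(np.sum(plan * cost))        # cost in the original units


rng = np.random.default_rng(0)
retain_feats = rng.normal(size=(64, 8))
forget_near = rng.normal(size=(64, 8))         # overlaps the retain features
forget_far = rng.normal(size=(64, 8)) + 5.0    # shifted away in feature space

print(sinkhorn_cost(retain_feats, forget_near))
print(sinkhorn_cost(retain_feats, forget_far))
```

The second value is larger than the first because the shifted batch sits farther from the retain features. A method that distances distributions would try to increase a quantity of this kind for synthesized samples, whereas this sketch only measures it.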

Paper Structure

This paper contains 21 sections, 7 equations, 7 figures, 3 tables, and 1 algorithm.

Figures (7)

  • Figure 1: The concept of correlation collapse. If the model follows a misguided forgetting direction, the correlation between task-related useful features and labels can weaken.
  • Figure 2: Feature representations from an age classification model. (a) shows clear class distinctions, with age groups well separated in feature space. (b), derived using the Negative Gradient method, shows clustered features with less distinction, illustrating correlation collapse.
  • Figure 3: The core method of DLFD: feature distribution optimization through optimal transport. This component generates a synthesized dataset by maximizing the distance between the retain and forget data distributions in feature space. When combined with the other components (detailed in Algorithm 1), DLFD achieves a balance between model utility and forgetting performance.
  • Figure 4: Feature representations from the age classification model trained with DLFD. The model preserves class separation similar to the original model (Figure 2(a)), retaining task-relevant features while mitigating correlation collapse.
  • Figure 5: The loss distributions for two baselines and ours. The orange region represents the loss distribution for unseen data, and the green region represents the loss distribution for forget data. (a) shows the loss distributions for the Original model. (b) shows the loss distributions for the Retrained model. (c) shows the loss distributions for $\theta_{unlearned}$ fine-tuned on DLFD-optimized images.
  • ...and 2 more figures