Unlearning Information Bottleneck: Machine Unlearning of Systematic Patterns and Biases

Ling Han, Hao Huang, Dustin Scheinost, Mary-Anne Hartley, María Rodríguez Martínez

TL;DR

Unlearning Information Bottleneck (UIB) extends the Information Bottleneck to machine unlearning by formulating a parameter-space objective that removes the information in $\Delta D$ while preserving predictive power, expressed as $UIB_{\beta}(D\setminus\Delta D; \theta^{(L)}) = -I(Y_{D\setminus\Delta D}; \theta^{(L)}) + \beta I(X_{D\setminus\Delta D}; \theta^{(L)})$. It introduces a hierarchical, locally dependent representation $\{\theta^{(l)}\}$ and variational bounds to make optimization tractable, culminating in the UIB-IF algorithm and a general pathway to integrate other unlearning methods via the regularization term $\mathcal{R}^{(l)}$. The framework is validated on MNIST, MNIST-C, CIFAR-10, and CIFAR-100 with ResNet-18, showing superior unlearning efficacy (reduced bias correlations and improved MIA-Efficacy) while maintaining high test F1 scores at a modest unlearning time cost. These results demonstrate that leveraging systematic patterns and adaptive priors can robustly mitigate biases and outdated information under non-random data removals, enabling privacy-preserving, bias-aware continual learning in dynamic environments.
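
To make the shape of the objective concrete, the following is a minimal sketch of an Information-Bottleneck-style Lagrangian, $-I(Y;T) + \beta I(X;T)$, on small discrete distributions. It is not the paper's method: the paper places the bottleneck on the parameters $\theta^{(L)}$ and the retained data $D\setminus\Delta D$ and optimizes variational bounds, whereas here a toy discrete variable `T` stands in for the representation so that both mutual-information terms can be computed exactly. The function names `mutual_information` and `ib_objective` and the toy distributions are illustrative assumptions.

```python
import numpy as np

def mutual_information(joint):
    """Return I(A;B) in nats from a joint probability table p(a, b)."""
    joint = np.asarray(joint, dtype=float)
    pa = joint.sum(axis=1, keepdims=True)   # marginal p(a), shape (A, 1)
    pb = joint.sum(axis=0, keepdims=True)   # marginal p(b), shape (1, B)
    mask = joint > 0                        # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / (pa @ pb)[mask])))

def ib_objective(p_xy, p_t_given_x, beta):
    """Toy IB Lagrangian -I(Y;T) + beta * I(X;T).

    p_xy:        joint table p(x, y), shape (X, Y)
    p_t_given_x: encoder p(t | x), shape (X, T), rows sum to 1
    T is a hypothetical discrete stand-in for the representation.
    """
    p_xy = np.asarray(p_xy, dtype=float)
    p_t_given_x = np.asarray(p_t_given_x, dtype=float)
    p_x = p_xy.sum(axis=1)
    p_xt = p_x[:, None] * p_t_given_x       # p(x, t) = p(x) p(t | x)
    p_yt = p_xy.T @ p_t_given_x             # p(y, t) = sum_x p(x, y) p(t | x)
    return -mutual_information(p_yt) + beta * mutual_information(p_xt)

# Toy data: X is a fair coin and Y = X.
p_xy = np.array([[0.5, 0.0], [0.0, 0.5]])
identity_encoder = np.eye(2)                         # T copies X
constant_encoder = np.array([[1.0, 0.0], [1.0, 0.0]])  # T ignores X

informative = ib_objective(p_xy, identity_encoder, beta=0.1)
uninformative = ib_objective(p_xy, constant_encoder, beta=0.1)
```

With a small $\beta$, the encoder that keeps the label-relevant bit attains the lower objective value, while the constant encoder scores zero on both terms; increasing $\beta$ shifts the balance toward compression.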

Abstract

Effective adaptation to distribution shifts in training data is pivotal for sustaining robustness in neural networks, especially when removing specific biases or outdated information, a process known as machine unlearning. Traditional approaches typically assume that data variations are random, which makes it difficult to adjust the model parameters accurately to remove patterns and characteristics from unlearned data. In this work, we present Unlearning Information Bottleneck (UIB), a novel information-theoretic framework designed to enhance the process of machine unlearning that effectively leverages the influence of systematic patterns and biases for parameter adjustment. By proposing a variational upper bound, we recalibrate the model parameters through a dynamic prior that integrates changes in data distribution with an affordable computational cost, allowing efficient and accurate removal of outdated or unwanted data patterns and biases. Our experiments across various datasets, models, and unlearning methods demonstrate that our approach effectively removes systematic patterns and biases while maintaining the performance of models post-unlearning.

Paper Structure

This paper contains 32 sections, 25 equations, 6 figures, 3 tables, 1 algorithm.

Figures (6)

  • Figure 1: The Unlearning Information Bottleneck (UIB) aims to optimize the parameter-space representation $\theta$. This process involves removing sufficient information $\Delta D$ that represents systematic patterns and biases from the original training dataset $D$, while simultaneously preserving the model's prediction capabilities for the target $Y$. The UIB framework ensures that $\theta$ avoids containing irrelevant information that could lead to overfitting, privacy breaches, or sensitivity to changes in model hyperparameters. Mutual information is denoted by $I(\cdot;\cdot)$, guiding the optimization to balance informativeness against redundancy.
  • Figure 2: Experiment Settings. In the training set, each digit has a fixed background color. The unlearning request is to forget all the biases (colors) the model learned from the training set. Both the correlation between bias and prediction and the model's performance are evaluated on a test set in which colors and digits are matched uniformly at random.
  • Figure 3: Post-unlearning Accuracy of UIB. We compare the test F1 scores of Exact Unlearning, Approximate Unlearning, and UIB in the Systematic Patterns and Random Data Points scenarios, respectively.
  • Figure 4: MIA-Efficacy in Systematic Patterns Unlearning. We compare the FT, SR, and IF methods in the standard, sparsity, and UIB settings, respectively. The red line marks the baseline performance; lower is better.
  • Figure 5: Correlation of Unlearned Biases and Prediction. After each iteration of bias unlearning, the model classifies the images in the test set, and the correlation between image color and numeric labels is computed.
  • ...and 1 more figures
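
The bias measurement described in the captions for Figures 2 and 5 can be sketched as follows. The captions do not say which correlation statistic is used, so the Pearson coefficient below is an assumption, as is the encoding in which color index `i` denotes the background color paired with digit `i` during training. The function name `bias_prediction_correlation` is hypothetical. For purely categorical colors, an association measure such as Cramér's V may be a more appropriate choice.

```python
import numpy as np

def bias_prediction_correlation(color_ids, predicted_labels):
    """Pearson correlation between color indices and predicted digits.

    Assumption: color index i is the background color that was paired
    with digit i in the biased training set, so a value near 1 suggests
    the model still predicts from color, and a value near 0 suggests
    the color bias has been forgotten.
    """
    color_ids = np.asarray(color_ids, dtype=float)
    predicted_labels = np.asarray(predicted_labels, dtype=float)
    # Pearson correlation is undefined for a constant input; report 0.
    if color_ids.std() == 0 or predicted_labels.std() == 0:
        return 0.0
    return float(np.corrcoef(color_ids, predicted_labels)[0, 1])

# Test images whose colors were assigned independently of the digits.
colors = np.array([0, 1, 2, 3, 4, 5])

# A fully biased model predicts the digit implied by the color.
biased_predictions = colors.copy()
corr_biased = bias_prediction_correlation(colors, biased_predictions)

# A degenerate model that always predicts the same digit.
constant_predictions = np.full_like(colors, 7)
corr_constant = bias_prediction_correlation(colors, constant_predictions)
```

Tracking this value after each unlearning iteration, on a test set with randomly matched colors and digits, gives a curve of the kind Figure 5 reports.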