Scalability of memorization-based machine unlearning

Kairan Zhao, Peter Triantafillou

TL;DR

This work analyzes the profiles of various memorization-score proxies and evaluates the performance of state-of-the-art memorization-based machine unlearning (MUL) algorithms in terms of both accuracy and privacy preservation. It shows that these proxies can achieve accuracy on par with full memorization-based unlearning while dramatically improving scalability.

Abstract

Machine unlearning (MUL) focuses on removing the influence of specific subsets of data (such as noisy, poisoned, or privacy-sensitive data) from pretrained models. MUL methods typically rely on specialized forms of fine-tuning. Recent research has shown that data memorization is a key characteristic defining the difficulty of MUL. As a result, novel memorization-based unlearning methods have been developed, demonstrating exceptional unlearning quality while maintaining high model utility. Alas, these methods depend on knowing the memorization scores of data points, and computing these scores is a notoriously time-consuming process. This, in turn, severely limits the scalability of these solutions and their practical impact in real-world applications. In this work, we tackle the scalability challenges of state-of-the-art memorization-based MUL algorithms using a series of memorization-score proxies. We first analyze the profiles of various proxies and then evaluate the performance of state-of-the-art (memorization-based) MUL algorithms in terms of both accuracy and privacy preservation. Our empirical results show that these proxies can achieve accuracy on par with full memorization-based unlearning while dramatically improving scalability. We view this work as an important step toward scalable and efficient machine unlearning.
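To make the idea of a memorization-score proxy concrete, the following is a minimal sketch of two cheap per-example quantities that can be read off a single trained model: the softmax confidence on the true label and a binary correct/incorrect indicator. The function names, the toy logits, and the exact definitions are assumptions made for illustration only; the paper's own proxy definitions may differ (for example, in whether values are averaged over training epochs).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_proxy(logits, label):
    """Hypothetical proxy: softmax probability assigned to the true label.

    Assumption: low confidence is treated as a hint of high memorization.
    """
    return softmax(logits)[label]


def binary_accuracy_proxy(logits, label):
    """Hypothetical proxy: 1.0 if the top-scoring class is the true label, else 0.0."""
    predicted = max(range(len(logits)), key=lambda k: logits[k])
    return 1.0 if predicted == label else 0.0


# Toy logits for three examples (illustrative values, not from the paper).
examples = [
    ([4.0, 0.0, 0.0], 0),  # confidently correct
    ([0.0, 0.1, 0.0], 1),  # barely correct
    ([2.0, 0.0, 3.0], 0),  # misclassified
]

confidences = [confidence_proxy(logits, y) for logits, y in examples]
correct = [binary_accuracy_proxy(logits, y) for logits, y in examples]

# Rank examples from least to most confident; under the assumption above,
# the front of this list holds the candidates for "highly memorized" data.
ranking = sorted(range(len(examples)), key=lambda i: confidences[i])
```

Both quantities need only one forward pass per example, whereas an exact memorization score compares models trained with and without each example, which is where the scalability gap described in the abstract comes from.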

Paper Structure

This paper contains 25 sections, 3 equations, 15 figures, 7 tables.

Figures (15)

  • Figure 1: Overview of RUM.
  • Figure 2: Uncovering the impact of three proxies (confidence, binary accuracy, holdout retraining) and memorization on unlearning performance in RUM$^\mathcal{F}$, evaluated using ToW (Figures (a),(b),(c)) and ToW-MIA (Figures (d),(e),(f)) across three datasets and model architectures. Higher ToW/ToW-MIA values indicate better performance.
  • Figure 3: Cumulative performance changes over 5-step sequential unlearning in RUM$^\mathcal{F}$ and vanilla using NegGrad+ as the baseline, evaluated by ToW (Figures (a), (c)) and ToW-MIA (Figures (b), (d)) across two datasets and model architectures. Higher ToW/ToW-MIA values indicate better performance.
  • Figure 7: Distribution of proxy values before and after each unlearning step, using holdout retraining as the proxy and NegGrad+ as the unlearning baseline with the vanilla approach, evaluated on Tiny-ImageNet with VGG-16 model architecture.
  • Figure : CIFAR-10 with ResNet-18
  • ...and 10 more figures