DUCK: Distance-based Unlearning via Centroid Kinematics

Marco Cotogni, Jacopo Bonato, Luigi Sabetta, Francesco Pelosin, Alessandro Nicolosi

TL;DR

DUCK (Distance-based Unlearning via Centroid Kinematics) is a metric-learning-driven approach that moves forget-sample embeddings toward the closest incorrect-class centroid while preserving retain-set performance. It computes per-class centroids on the retain data and optimizes a two-term objective that combines a forget-focused loss $\mathcal{L}_{FGT}$ with a retain-preserving loss $\mathcal{L}_{RET}$, achieving effective unlearning in both the class-removal (CR) and homogeneous-removal (HR) scenarios. The work also introduces the Adaptive Unlearning Score (AUS), which quantifies the trade-off between forgetting efficacy and test accuracy, and analyzes the structure of the feature space and model interpretability via SHAP. On CIFAR-10, CIFAR-100, and TinyImageNet, DUCK is reported to outperform state-of-the-art baselines in both CR and HR, with favorable AUS and substantially lower unlearning time, while maintaining competitive or superior test performance. The findings present DUCK as a practical, model-agnostic solution for privacy-preserving unlearning with interpretable, mechanism-driven behavior in high-dimensional embedding spaces.

Abstract

Machine Unlearning is rising as a new field, driven by the pressing necessity of ensuring privacy in modern artificial intelligence models. This technique primarily aims to eradicate any residual influence of a specific subset of data from the knowledge acquired by a neural model during its training. This work introduces a novel unlearning algorithm, denoted as Distance-based Unlearning via Centroid Kinematics (DUCK), which employs metric learning to guide the removal of samples by matching them to the nearest incorrect centroid in the embedding space. The algorithm is evaluated on several benchmark datasets in two distinct scenarios, class removal and homogeneous sampling removal, obtaining state-of-the-art performance. We also introduce a novel metric, called Adaptive Unlearning Score (AUS), which captures not only the efficacy of the unlearning process in forgetting target data but also the performance loss relative to the original model. Additionally, we conduct a thorough investigation of the unlearning mechanism in DUCK, examining its impact on the organization of the feature space and employing explainable AI techniques for deeper insights.

Paper Structure (24 sections, 9 equations, 9 figures, 9 tables)

Figures (9)

  • Figure 1: DUCK unlearning scheme. A) DUCK architecture $\Phi_{\theta} = \Gamma_{\theta} \circ \Psi_{\theta}$ during the unlearning phase. The weights of $\Gamma_{\theta}$ are updated through gradient descent using the $\mathcal{L}_{RET}$ loss on $\mathcal{D}_r$. Importantly, the weights of $\Psi_{\theta}$ are optimized using both the gradient obtained from $\Gamma_{\theta}$ and the gradient from the closest-centroid matching applied on $\mathcal{D}_f$. B) Representation of the closest-centroid matching procedure. For each sample $x_i \in \mathcal{D}_f$, its distance from the centroids of the incorrect classes is computed and the closest centroid is selected. C) The distance between $x_i$ and the selected closest centroid is minimized through gradient descent.
  • Figure 2: Contour plot illustrating the Adaptive Unlearning Score (AUS) as a function of $\Delta$ and the difference between the original and unlearned model's test accuracies ($A^{Or}_t - A_t$).
  • Figure 3: DUCK accuracy on $\mathcal{D}_r^t$ of CIFAR-100 (orange) compared to the model finetuned on $\mathcal{D}_r$ (purple) as a function of the number of classes removed. The accuracy and standard deviation reported are computed over 10 different shuffles of the classes.
  • Figure 4: Average test accuracy over $\mathcal{D}^t_r$ and AUS as a function of the unlearning time in the CR scenario for CIFAR-100 and TinyImageNet. Accuracies and AUS are reported as the mean across 10 runs, each with a different target class.
  • Figure 5: Analysis of the structure of the feature space. A-B) Visualization of the embeddings of $\mathcal{D}_r$, $\mathcal{D}_f$, $\mathcal{D}_r^t$ and $\mathcal{D}_f^t$ samples of the CIFAR-10 dataset, obtained using t-SNE (van der Maaten and Hinton, 2008) for the CR scenario. Embeddings are shown for the original model (Or. model; A) and the unlearned model (Un. model; B). The first row shows the train samples and the second row the test samples. Black dots represent the forget class that has to be removed (class 0 - plane, in this case). The colored dots represent samples belonging to different retain classes. Ideally, forget samples are scattered between all the remaining classes. C) Comparative analysis of feature cluster densities across different model states. Computation is performed across 10 different runs, each one with a different forget class. Median values are shown in red, the boxes show the quartiles of the dataset, and the whiskers extend to show the rest of the distribution. Forget (Retain) clusters are represented in light green (white) for the Original, DUCK, and Retrained models. The y-axis represents the density on a logarithmic scale. * $p < 0.05$, ** $p < 0.001$, and *** $p < 0.0001$ (two-sided Wilcoxon signed-rank test).
  • ...and 4 more figures
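
The closest-centroid matching described in the Figure 1 caption (panel B) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `class_centroids` and `closest_incorrect_centroid` are hypothetical, embeddings are plain Python lists, and Euclidean distance is an assumption made for illustration (the text above does not state which distance the method uses). The gradient step of panel C, which minimizes the selected distance, is not shown.

```python
import math
from collections import defaultdict

def class_centroids(embeddings, labels):
    """Mean embedding per class, computed on retain-set features."""
    sums = {}
    counts = defaultdict(int)
    for vec, y in zip(embeddings, labels):
        if y not in sums:
            sums[y] = [0.0] * len(vec)
        sums[y] = [s + v for s, v in zip(sums[y], vec)]
        counts[y] += 1
    return {y: [s / counts[y] for s in total] for y, total in sums.items()}

def closest_incorrect_centroid(vec, true_label, centroids):
    """Return (label, distance) of the nearest centroid whose class
    differs from the sample's true label.
    Euclidean distance is an assumption of this sketch."""
    candidates = [
        (y, math.dist(vec, c))
        for y, c in centroids.items()
        if y != true_label
    ]
    return min(candidates, key=lambda pair: pair[1])

# Toy retain-set embeddings in 2-D with three classes.
retain_emb = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0], [10.0, 10.0]]
retain_lbl = [1, 1, 2, 2, 3]
centroids = class_centroids(retain_emb, retain_lbl)

# A forget sample whose true class is 1: class 1 is excluded from matching.
target, dist = closest_incorrect_centroid([1.0, 1.0], 1, centroids)
print(target, dist)
```

In the toy example the forget sample lies closest to its own class centroid, but that class is excluded, so the sample is matched to the next-nearest centroid. In the CR scenario the forget class has no centroid among the retain data, so the exclusion has no effect and the overall nearest retain centroid is selected.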