Stake the Points: Structure-Faithful Instance Unlearning

Kiseong Hong, JungKyoo Shin, Eunwoo Kim

Abstract

Machine unlearning (MU) addresses privacy risks in pretrained models. The main goal of MU is to remove the influence of designated data while preserving the utility of retained knowledge. Achieving this goal requires preserving semantic relations among retained instances, which existing studies often overlook. We observe that without such preservation, models suffer from progressive structural collapse, undermining the deletion-retention balance. In this work, we propose a novel structure-faithful framework that introduces stakes, i.e., semantic anchors that serve as reference points to maintain the knowledge structure. By leveraging these anchors, our framework captures and stabilizes the semantic organization of knowledge. Specifically, we instantiate the anchors from language-driven attribute descriptions encoded by a semantic encoder (e.g., CLIP). We enforce preservation of the knowledge structure via structure-aware alignment and regularization: the former aligns the organization of retained knowledge before and after unlearning around anchors, while the latter regulates updates to structure-critical parameters. Results on image classification, retrieval, and face recognition show average performance gains of 32.9%, 22.5%, and 19.3%, respectively, balancing the deletion-retention trade-off and enhancing generalization.

Paper Structure (17 sections, 4 equations, 9 figures, 10 tables)

Figures (9)

  • Figure 1: Conceptual illustration of structural collapse and our structure-faithful remedy in MU. Existing works delete the designated instance but ignore semantic relations between retained instances. During unlearning, model updates induce oscillations in the representation space. Without relational awareness, these oscillations distort the instance-level semantic relations (e.g., a monkey embedding drifting toward grape while moving away from banana), collapsing the original knowledge organization. We introduce semantic anchors (i.e., stakes) that preserve key relational constraints by keeping the relative positions between anchors and retained instances (i.e., anchor-to-instance relations).
  • Figure 2: Structural collapse and its impact on the deletion–retention balance on CIFAR-100 (256 designated instances). (a) We observe structural collapse during unlearning in prior MU studies (Neg [golatkar2020eternal], Adv [cha2024learning], and L2UL [cha2024learning]). We quantify it as the change in affinities between retained-instance embeddings and anchors before and during unlearning. Larger values indicate greater semantic shift. (b) We compare average structural collapse (mean over unlearning steps) with the trade-off accuracy, defined as the retention–deletion accuracy gap. Each dot denotes a random trial; Neg lies along the x-axis, with accuracy below 5%.
  • Figure 3: An illustration of the proposed structure-faithful unlearning framework. The left side illustrates the unlearning process, where we aim to preserve the original structure, $S^{\text{ori}}$, defined by the affinities between visual embeddings $V$ and semantic anchors $A$, by ensuring that these affinities remain consistent in the unlearned structure $S^{\text{unl}}$. The right side shows the procedure for collecting class-wise anchors: a large language model generates attribute descriptions, which are embedded into anchor vectors via a frozen semantic encoder $T$.
  • Figure 4: Grad-CAM visualizations on Lacuna-10. Red boxes denote forget instances, and green boxes denote retention instances.
  • Figure 5: Qualitative and quantitative results of the image-to-image retrieval task on CIFAR-10. The left side shows top-5 retrieval examples given query images. Green-bordered queries correspond to retained samples, while red-bordered queries denote forgotten ones. For each query, retrieved images with correct matches are marked with check marks, and incorrect ones with crosses. The right bar plots report the quantitative retrieval performance in terms of R@1, R@5, R@10, and mAP, comparing our method with L2UL.
  • ...and 4 more figures
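
The captions of Figures 2 and 3 describe the knowledge structure as the affinities between visual embeddings $V$ and semantic anchors $A$, and structural collapse as the change in those affinities during unlearning. The sketch below illustrates that idea under stated assumptions: the captions do not give the exact affinity function or aggregation, so cosine similarity and a mean absolute difference are used here purely for illustration, and the function names `affinity` and `structural_shift` are hypothetical. Random vectors stand in for real embeddings and anchors.

```python
import numpy as np


def affinity(V, A):
    """Cosine affinities between embeddings V (n x d) and anchors A (k x d).

    Assumption: cosine similarity is an illustrative choice of affinity.
    """
    Vn = V / np.linalg.norm(V, axis=1, keepdims=True)
    An = A / np.linalg.norm(A, axis=1, keepdims=True)
    return Vn @ An.T  # shape (n, k): one row of anchor affinities per instance


def structural_shift(V_ori, V_unl, A):
    """Mean absolute change in anchor affinities for retained instances.

    Assumption: a simple stand-in for a collapse score; larger values mean
    the retained embeddings moved more relative to the anchors.
    """
    S_ori = affinity(V_ori, A)  # structure before unlearning
    S_unl = affinity(V_unl, A)  # structure during or after unlearning
    return float(np.mean(np.abs(S_unl - S_ori)))


rng = np.random.default_rng(0)
A = rng.normal(size=(5, 16))      # placeholder anchors (e.g., encoded descriptions)
V_ori = rng.normal(size=(8, 16))  # placeholder retained-instance embeddings

V_same = 2.0 * V_ori                              # rescaling keeps every cosine affinity
V_drift = V_ori + rng.normal(size=V_ori.shape)    # perturbed embeddings

print(structural_shift(V_ori, V_same, A))   # structure unchanged
print(structural_shift(V_ori, V_drift, A))  # structure shifted
```

In this toy setup, uniformly rescaling the embeddings leaves the score at zero because the relative positions to the anchors are unchanged, while perturbing the embeddings yields a positive score. An alignment term in the spirit of the abstract would penalize such a shift on retained instances, though the exact loss is not given in this excerpt.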