Split Unlearning

Guangsheng Yu, Yanna Jiang, Qin Wang, Xu Wang, Baihe Ma, Caijun Sun, Wei Ni, Ren Ping Liu

TL;DR

This work presents SplitWiper, a SISA-aligned framework for Split Learning that decouples client/server propagation with a one-way-one-off scheme, enabling independent unlearning even when clients are unavailable. It further introduces SplitWiper+, which uses label expansion and DP-based masking to protect client labels during learning and unlearning. Across diverse datasets and modalities, SplitWiper achieves complete forgetting of unlearned labels with no degradation to retained labels and dramatic reductions in overhead, while SplitWiper+ preserves high label privacy against server-side inferences. The approach addresses regulatory "right to be forgotten" constraints in distributed SL and offers practical, privacy-preserving unlearning supported by empirical results.

Abstract

We introduce Split Unlearning, a novel machine unlearning technology designed for Split Learning (SL), enabling the first-ever implementation of Sharded, Isolated, Sliced, and Aggregated (SISA) unlearning in SL frameworks. In existing SL frameworks, the tight coupling between clients and the server results in frequent bidirectional data flows and iterative training across all clients, which violates the "Isolated" principle and makes it difficult to implement SISA for independent and efficient unlearning. To address this, we propose SplitWiper with a new one-way-one-off propagation scheme, which leverages the inherently "Sharded" structure of SL and decouples neural signal propagation between clients and the server, enabling effective SISA unlearning even in scenarios with absent clients. We further design SplitWiper+ to enhance client label privacy; it integrates differential privacy and a label expansion strategy to protect client labels against the server and other potential adversaries. Experiments across diverse data distributions and tasks demonstrate that SplitWiper achieves 0% accuracy for unlearned labels and 8% better accuracy for retained labels than non-SISA unlearning in SL. Moreover, the one-way-one-off propagation maintains constant overhead, reducing computational and communication costs by 99%. SplitWiper+ preserves 90% of label privacy when sharing masked labels with the server.
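
The one-way-one-off idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Client` and `Server` classes, the max-abs scaling used as a stand-in for a local model, and the nearest-centroid server model are all hypothetical simplifications chosen so the example runs with the standard library only. It assumes that an unlearning request is handled by the requesting client alone, which retrains on its retained data and re-sends its outputs once, while the server retrains from its cache without contacting any other client.

```python
from collections import defaultdict


class Client:
    """Hypothetical client holding one SISA shard (its own local dataset)."""

    def __init__(self, cid, data):
        self.cid = cid
        self.data = list(data)  # list of (feature_vector, label)
        self.scale = 1.0        # stand-in for local model weights
        self.sends = 0          # how many times outputs were sent

    def train_local(self):
        # Stand-in for local training: learn a max-abs scaling factor.
        self.scale = max(abs(v) for x, _ in self.data for v in x) or 1.0

    def forward_once(self):
        # One-off: frozen outputs and labels are sent a single time.
        self.sends += 1
        return [([v / self.scale for v in x], y) for x, y in self.data]

    def unlearn(self, label):
        # Only the requesting client retrains and re-sends (assumption).
        self.data = [(x, y) for x, y in self.data if y != label]
        self.train_local()
        return self.forward_once()


class Server:
    """Hypothetical server that trains only from cached client outputs."""

    def __init__(self):
        self.cache = {}      # client id -> cached (output, label) pairs
        self.centroids = {}  # toy server model: one centroid per label

    def receive(self, cid, outputs):
        self.cache[cid] = outputs  # replaces any earlier entry for cid

    def train(self):
        # One-way: nothing is propagated back to the clients.
        groups = defaultdict(list)
        for outputs in self.cache.values():
            for feat, y in outputs:
                groups[y].append(feat)
        self.centroids = {
            y: [sum(col) / len(feats) for col in zip(*feats)]
            for y, feats in groups.items()
        }

    def predict(self, feat):
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(feat, c))
        return min(self.centroids, key=lambda y: dist(self.centroids[y]))


client_a = Client("A", [([0.0, 0.0], 0), ([1.0, 0.0], 0),
                        ([10.0, 10.0], 1), ([9.0, 10.0], 1)])
client_b = Client("B", [([0.0, 1.0], 0), ([10.0, 9.0], 1),
                        ([-10.0, 10.0], 2), ([-9.0, 10.0], 2)])
server = Server()
for c in (client_a, client_b):
    c.train_local()
    server.receive(c.cid, c.forward_once())
server.train()
before = server.predict([-1.0, 1.0])

# Client B asks to forget label 2; client A is never contacted.
server.receive("B", client_b.unlearn(2))
server.train()
after = server.predict([-1.0, 1.0])
```

In this toy setting the server can no longer output label 2 after the request, and client A sends its outputs only once, which mirrors the constant per-client overhead the abstract describes.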

Paper Structure

This paper contains 30 sections, 1 theorem, 5 equations, 8 figures, 11 tables, 4 algorithms.

Key Result

Theorem 1

Consider that $\mathbb{V}_{exp} = \{{V_{1}^*}, \cdots, {V_{E}^*}\}$ is expanded by $\mathbb{V} = \{{V}_{1}, \cdots, {V}_{Q}\}$ using a given method $\mathcal{G}_V$, where $\forall {V}_{q} \in \mathbb{V}$, $\gamma_q$ copies with $\varepsilon$-DP-based noise are generated as $\{ {V_{a_1}^*} , \cdots,

Figures (8)

  • Figure 1: Why do we need SplitWiper/SplitWiper+? The "Isolated" principle of SISA invalidates existing SLs, hindering efficient and effective split unlearning. Our solution, which features a novel one-way-one-off design and a DP-based label expansion strategy, enables SISA unlearning and fulfills the fundamental requirements of SLs.
  • Figure 2: Architectural Design: In SplitWiper (Sec. \ref{sec:sisa_design}), each client $k$ trains its local model $\mathcal{F}_o^k$ on its dataset $D_o^k$; the client datasets are treated as SISA shards without additional partitioning. After training, clients freeze the weights and send outputs with labels to the server for caching (one-off), while the server continues training $\mathcal{F}_o^s$ using stored values without returning to the clients (one-way). SplitWiper+ (Sec. \ref{sec:label-protection}) enhances privacy by using a label expansion strategy to convert real labels into masked ones, which clients then share with the server for label-protected training.
  • Figure 3: Label Expansion: Our proposed strategy preserves label quantity and semantics by expanding, shuffling, and anonymizing them, with intermediate values expanded via a DP mechanism.
  • Figure 4: Client-side time consumption across different SL frameworks.
  • Figure 5: Impacts of the expansion factor $\gamma$ and privacy budget $\varepsilon$ on SplitWiper and SplitWiper+ in terms of client-side training and transmission time consumption and test accuracy.
  • ...and 3 more figures
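
The label expansion described for Figure 3 (expand, shuffle, anonymize, with intermediate values expanded via a DP mechanism) can be sketched roughly as follows. This is an illustrative reading only, not the paper's algorithm: the function `expand_labels`, the choice of the Laplace mechanism, the `sensitivity` parameter (assumed here to be the L1 sensitivity of an intermediate vector), and the rule that every sample yields one noisy copy per masked label are all assumptions made for this example.

```python
import random


def laplace(rng, scale):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def expand_labels(samples, gamma, epsilon, sensitivity=1.0, seed=0):
    """Hypothetical sketch: map each real label to `gamma` masked labels.

    samples: list of (intermediate_value_vector, real_label)
    Returns (expanded_samples, masked_to_real); the mapping would stay
    on the client and is never shared with the server.
    """
    rng = random.Random(seed)
    real_labels = sorted({y for _, y in samples})

    # Anonymize: shuffle opaque IDs so their order reveals no grouping.
    masked_ids = list(range(len(real_labels) * gamma))
    rng.shuffle(masked_ids)
    pools = {
        y: masked_ids[i * gamma:(i + 1) * gamma]
        for i, y in enumerate(real_labels)
    }
    masked_to_real = {m: y for y, pool in pools.items() for m in pool}

    # Expand: one Laplace-noised copy of the value per masked label.
    scale = sensitivity / epsilon
    expanded = []
    for value, y in samples:
        for m in pools[y]:
            noisy = [v + laplace(rng, scale) for v in value]
            expanded.append((noisy, m))

    rng.shuffle(expanded)  # hide which copies came from the same sample
    return expanded, masked_to_real


samples = [([0.2, 0.4], 0), ([0.1, 0.5], 0), ([0.9, 0.7], 1)]
expanded, masked_to_real = expand_labels(samples, gamma=3, epsilon=1.0)
```

With two real labels and `gamma=3`, the server would see six masked labels and nine noisy intermediate values; a smaller `epsilon` gives a larger noise scale, which trades accuracy for privacy.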

Theorems & Definitions (2)

  • Theorem 1
  • Proof: Theorem \ref{theorem_DP-based label anonymization}