ERASER: Machine Unlearning in MLaaS via an Inference Serving-Aware Approach

Yuke Hu, Jian Lou, Jiaqi Liu, Wangze Ni, Feng Lin, Zhan Qin, Kui Ren

TL;DR

This work addresses the privacy and latency challenges of machine unlearning in MLaaS under "right to be forgotten" (RTBF) constraints. It introduces ERASER, an inference-serving-aware framework that combines shard-and-aggregate training (in the spirit of SISA) with a certified inference consistency mechanism, so that inference requests can be served without executing pending unlearning requests first whenever doing so is safe. ERASER comes in seven practical variants that trade off inference latency, computation overhead, and privacy risk through three design options: contexts, unlearning timing, and uncertified-inference handling. Experiments across four datasets and models show inference latency reductions of up to 99% in some settings and computation overhead savings of up to 31% over an inference-oblivious baseline. The approach supports RTBF-compliant, low-latency MLaaS with configurable privacy-performance tradeoffs.

Abstract

In recent years, Machine Learning-as-a-Service (MLaaS) has seen surging demand for supporting ML-driven services that improve user experience across diverse application areas. MLaaS provides low-latency inference service based on an ML model trained on a dataset collected from numerous individual data owners. Recently, for the sake of data owners' privacy and to comply with the "right to be forgotten" (RTBF) enacted by data protection legislation, many machine unlearning methods have been proposed to remove data owners' data from trained models upon their unlearning requests. However, despite their promising efficiency, almost all existing machine unlearning methods handle unlearning requests independently of inference requests, which introduces a new security issue, inference service obsolescence, and a new privacy vulnerability, undesirable exposure, for machine unlearning in MLaaS. In this paper, we propose the ERASER framework for machinE unleaRning in MLaAS via an inferencE seRving-aware approach. ERASER strategically chooses appropriate unlearning execution timing to address the inference service obsolescence issue. A novel inference consistency certification mechanism is proposed to avoid the violation of the RTBF principle caused by postponed unlearning executions, thereby mitigating the undesirable exposure vulnerability. ERASER offers three groups of design choices that allow for tailor-made variants suited to the specific environments and preferences of various MLaaS systems. Extensive empirical evaluations across various settings confirm ERASER's effectiveness, e.g., it can save up to 99% of inference latency and 31% of computation overhead over the inference-oblivious baseline.

Paper Structure

This paper contains 34 sections, 4 theorems, 9 equations, 13 figures, 4 tables, 2 algorithms.

Key Result

Theorem 1

Suppose that the most recent unlearning-updated models are trained on $\boldsymbol{\mathscr{D}}^{\mathtt{O}}$, the pending unlearning requests are $\mathtt{U}^t$, and the shards with pending unlearning requests are $\{k\in[K] \mid \boldsymbol{\mathscr{S}}_k^t \neq \boldsymbol{\mathscr{S}}_k^{\mathtt{O}}\}$. Then ERASER has certified inference consistency for the inference request on $\mathbf{z}$ if ...
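
The theorem statement is cut off here, so the exact condition is not reproduced. As a rough illustration of the idea only, the sketch below uses a standard margin argument for majority-vote aggregation over shard models: if the lead of the winning label exceeds twice the number of shards with pending unlearning requests, retraining those shards cannot change the aggregated prediction. The function name `certified_majority_vote`, its arguments, and the use of plain majority voting are assumptions made for this example, not the paper's definitions.

```python
from collections import Counter
from typing import Hashable, Sequence, Set, Tuple


def certified_majority_vote(
    shard_votes: Sequence[Hashable],
    pending_shards: Set[int],
) -> Tuple[Hashable, bool]:
    """Return (majority label, certified flag) for one inference request.

    shard_votes[k] is the label predicted by the current (not yet unlearned)
    model of shard k. pending_shards holds the indices of shards that have
    pending unlearning requests. Both names are hypothetical.
    """
    ranked = Counter(shard_votes).most_common()
    top_label, top_count = ranked[0]
    runner_up_count = ranked[1][1] if len(ranked) > 1 else 0

    # No pending requests: no model will change, so the answer is stable.
    if not pending_shards:
        return top_label, True

    # Worst case: each pending shard currently votes for the top label and
    # switches to the same competitor after retraining. Every such switch
    # shrinks the margin by 2, so a margin strictly above 2 * n is safe.
    margin = top_count - runner_up_count
    certified = margin > 2 * len(pending_shards)
    return top_label, certified


if __name__ == "__main__":
    votes = ["cat"] * 7 + ["dog"] * 3
    print(certified_majority_vote(votes, {0}))     # margin 4 > 2: certified
    print(certified_majority_vote(votes, {0, 1}))  # margin 4 > 4 fails
```

Under this assumed condition, a certified request could be answered immediately, while an uncertified one would fall back to whatever uncertified-inference handling the chosen variant prescribes.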

Figures (13)

  • Figure 1: Scheme of MLaaS
  • Figure 2: Comparison between ERASER and naive strategies. Green represents inference without privacy risk, red indicates inference with privacy risk, and yellow signifies unlearning execution. IR: Inference Request; UR: Unlearning Request.
  • Figure 3: Illustration of three representative cases for the certified inference consistency mechanism.
  • Figure 4: Illustration of variants with double contexts of ERASER.
  • Figure 5: Cumulative Percentage of Waiting Time.
  • ...and 8 more figures

Theorems & Definitions (5)

  • Definition 1: Inference Consistency
  • Theorem 1: Certified Inference Consistency
  • Theorem 2: Waiting Time of SISA
  • Theorem 3: Waiting Time of DIMP
  • Theorem 4