Data Poisoning against Differentially-Private Learners: Attacks and Defenses

Yuzhe Ma; Xiaojin Zhu; Justin Hsu

TL;DR

This work studies data poisoning against differentially private learners. It shows that such learners are provably resistant when an attacker can modify only a small number $k$ of training items, and that they become increasingly vulnerable as $k$ grows. It introduces two attack paradigms, DPV (SGD-based on DP victims) and SV (surrogate-victim), and instantiates them for objective- and output-perturbed logistic and ridge learners, deriving explicit gradient expressions via KKT conditions. Theoretical results give lower bounds on the attack cost, and hence limits on attack effectiveness, under $\epsilon$-DP (and $(\epsilon,\delta)$-DP). Extensive experiments complement these results, showing that attacks become stronger with larger $k$ and weaker privacy, with deep-DPV often the most effective. The findings emphasize that while DP affords provable resistance, practical poisoning remains feasible under sufficient data modification, motivating tighter bounds and stronger defense strategies.

Abstract

Data poisoning attacks aim to manipulate the model produced by a learning algorithm by adversarially modifying the training set. We consider differential privacy as a defensive measure against this type of attack. We show that such learners are resistant to data poisoning attacks when the adversary is only able to poison a small number of items. However, this protection degrades as the adversary poisons more data. To illustrate, we design attack algorithms targeting objective and output perturbation learners, two standard approaches to differentially-private machine learning. Experiments show that our methods are effective when the attacker is allowed to poison sufficiently many training items.
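The resistance claim, and the way it degrades, can be illustrated with the standard group-privacy property of $\epsilon$-DP: changing $k$ training items changes the probability of any output event by at most a factor $e^{k\epsilon}$, so for a nonnegative attack cost the expected cost after poisoning is at least $e^{-k\epsilon}$ times the expected cost on clean data. The sketch below is only an illustration under those assumptions (pure $\epsilon$-DP, nonnegative cost). The function name and the sample values are hypothetical and are not taken from the paper.

```python
import math


def attack_cost_lower_bound(clean_cost: float, k: int, epsilon: float) -> float:
    """Group-privacy lower bound on the expected attack cost after poisoning.

    Assumes a pure epsilon-DP learner and a nonnegative attack cost
    (lower cost means a more successful attack). Poisoning k items can
    shrink the expected cost by at most a factor exp(-k * epsilon).
    """
    if clean_cost < 0:
        raise ValueError("the bound assumes a nonnegative cost")
    if k < 0 or epsilon < 0:
        raise ValueError("k and epsilon must be nonnegative")
    return math.exp(-k * epsilon) * clean_cost


if __name__ == "__main__":
    # Illustrative values only: clean expected cost 1.0, epsilon = 0.1.
    for k in (1, 10, 100):
        bound = attack_cost_lower_bound(1.0, k, 0.1)
        print(f"k={k:3d}  expected cost >= {bound:.6f}")
```

The bound stays close to the clean cost when $k\epsilon$ is small and decays exponentially as either $k$ or $\epsilon$ grows, which matches the qualitative picture above: the guarantee is meaningful for a few poisoned items and becomes vacuous for many.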
