SoK: Unintended Interactions among Machine Learning Defenses and Risks

Vasisht Duddu, Sebastian Szyller, N. Asokan

TL;DR

The paper tackles the problem of unintended interactions between ML defenses and security, privacy, and fairness risks. It introduces a unified framework centered on overfitting and memorization as the core mediators, and maps how defenses and risks relate through controllable factors. It surveys existing literature, provides a guideline for conjecturing new interactions, and empirically validates two unexplored interactions, demonstrating practical implications for defense design and deployment. The work offers a structured pathway to anticipate cross-risk effects and minimize trade-offs in real-world ML systems.

Abstract

Machine learning (ML) models cannot neglect risks to security, privacy, and fairness. Several defenses have been proposed to mitigate such risks. When a defense is effective in mitigating one risk, it may correspond to increased or decreased susceptibility to other risks. Existing research lacks an effective framework to recognize and explain these unintended interactions. We present such a framework, based on the conjecture that overfitting and memorization underlie unintended interactions. We survey existing literature on unintended interactions, accommodating them within our framework. We use our framework to conjecture on two previously unexplored interactions, and empirically validate our conjectures.

Paper Structure (26 sections, 4 figures, 9 tables)

Figures (4)

  • Figure 1: Relationship between overfitting and memorization: Average $\mathtt{mem}$ across all data records in $\mathcal{D}_{tr}$ and overfitting ($\mathcal{G}_{err}$) are at the bottom. Circles indicate $\mathcal{D}_{tr}$ data records, and crosses indicate $\mathcal{D}_{te}$ data records.
  • Figure 2: Relationship between overfitting and memorization: Average $\mathtt{mem}$ across all data records in $\mathcal{D}_{tr}$ and overfitting ($\mathcal{G}_{err}$) are at the bottom. Circles indicate $\mathcal{D}_{tr}$ data records, and crosses indicate $\mathcal{D}_{te}$ data records.
  • Figure 3: Distinguishability across subgroups decreases for $f^{fair}_{\theta}$.
  • Figure 4: Nature of Interaction. Accuracy to differentiate explanations from model trained on $\mathcal{D}_{tr}$ with $\alpha_1 = 0.5$ and $\alpha_2 \in \{0.1\text{--}0.9\}$.