SoK: Unintended Interactions among Machine Learning Defenses and Risks
Vasisht Duddu, Sebastian Szyller, N. Asokan
TL;DR
The paper tackles the problem of unintended interactions between ML defenses and security, privacy, and fairness risks. It introduces a unified framework centered on overfitting and memorization as the core mediators, and maps how defenses and risks relate through controllable factors. It surveys existing literature, provides a guideline for conjecturing new interactions, and empirically validates two unexplored interactions, demonstrating practical implications for defense design and deployment. The work offers a structured pathway to anticipate cross-risk effects and minimize trade-offs in real-world ML systems.
Abstract
Machine learning (ML) models are susceptible to risks to security, privacy, and fairness, and these risks cannot be neglected. Several defenses have been proposed to mitigate such risks. When a defense is effective in mitigating one risk, it may correspond to increased or decreased susceptibility to other risks. Existing research lacks an effective framework to recognize and explain these unintended interactions. We present such a framework, based on the conjecture that overfitting and memorization underlie unintended interactions. We survey existing literature on unintended interactions and accommodate them within our framework. We then use the framework to conjecture about two previously unexplored interactions, and we empirically validate these conjectures.
