Amulet: a Python Library for Assessing Interactions Among ML Defenses and Risks

Asim Waheed, Vasisht Duddu, Rui Zhang, Sebastian Szyller

TL;DR

Amulet is a risk-centric, modular Python library for systematically evaluating both intended and unintended interactions among ML defenses and risks across security, privacy, and fairness. It provides a comprehensive, consistent, extensible, and applicable framework of attacks, defenses, and metrics, which enables end-to-end evaluation and cross-risk analysis. Empirical results on CelebA show that Amulet reproduces baseline results for known attacks and reveals new unintended interactions, such as how adversarial training affects attribute inference and model extraction across datasets. The work lays a foundation for unified, scalable assessment of defense interactions and outlines future enhancements to cover multi-defense deployments and larger models.

Abstract

Machine learning (ML) models are susceptible to various risks to security, privacy, and fairness. Most defenses are designed to protect against each risk individually (intended interactions) but can inadvertently affect susceptibility to other unrelated risks (unintended interactions). We introduce Amulet, the first Python library for evaluating both intended and unintended interactions among ML defenses and risks. Amulet is comprehensive by including representative attacks, defenses, and metrics; extensible to new modules due to its modular design; consistent with a user-friendly API template for inputs and outputs; and applicable for evaluating novel interactions. By satisfying all four properties, Amulet offers a unified foundation for studying how defenses interact, enabling the first systematic evaluation of unintended interactions across multiple risks.

Paper Structure

This paper contains 16 sections, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Overview of Amulet's modular design: Top level modules are categorized into different properties (in blue): security, privacy, and fairness. For each property, we include different risks which violate them (in orange). Under each risk, we include attacks which exploit the risk (in red), defenses to protect against it (in green), and metrics to evaluate the susceptibility to it (in yellow).
  • Figure 2: Evasion, adversarial training, and model extraction using Amulet. Each attack or defense is instantiated as an object; attacks are run with attack(), while defenses use train_robust(), train_fair(), or train_private() depending on type.
  • Figure 3: Comparison of $Acc_{te}$ of $\mathcal{M}_{def}$ and $Fid$ between $\mathcal{M}_{def}$ and $\mathcal{M}_{stol}$ for the given percentage of outliers removed from $\mathcal{M}_{def}$. $Acc_{te}$ shows a clear downward trend, while fidelity is not affected. The colors differentiate the dataset, while the dotted lines represent $Acc_{te}$ and the solid lines represent $Fid$.
  • Figure 4: Comparison of $Acc_{te}$ and $Fid_{corr}$ for the given percentage of outliers removed from $\mathcal{M}_{def}$. Both measures follow a similar downward trend. The colors differentiate the dataset, while the dotted lines represent $Acc_{te}$ and the solid lines represent $Fid_{corr}$. Error bars omitted for readability.
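
The two quantities plotted in Figure 3 can be illustrated with a minimal sketch. It assumes that $Acc_{te}$ is the fraction of test inputs a model labels correctly and that $Fid$ is the fraction of inputs on which $\mathcal{M}_{def}$ and $\mathcal{M}_{stol}$ predict the same label, which is the usual definition in the model extraction literature. The function names `accuracy` and `fidelity` and the toy predictions are hypothetical and are not taken from Amulet's API.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one example is required")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def fidelity(preds_a: Sequence[int], preds_b: Sequence[int]) -> float:
    """Fraction of inputs on which two models predict the same label.

    Assumed definition: label agreement, independent of the ground truth.
    """
    if len(preds_a) != len(preds_b):
        raise ValueError("both prediction lists must have the same length")
    if len(preds_a) == 0:
        raise ValueError("at least one example is required")
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)


# Toy data (made up for illustration only).
labels = [0, 1, 1, 0, 1]
defended_preds = [0, 1, 0, 0, 1]  # hypothetical outputs of the defended model
stolen_preds = [0, 1, 0, 1, 1]    # hypothetical outputs of the stolen model

acc_te = accuracy(defended_preds, labels)      # 4 of 5 correct
fid = fidelity(defended_preds, stolen_preds)   # 4 of 5 predictions agree
```

Because fidelity compares the two models with each other and not with the ground truth, it can stay high while test accuracy falls, which is consistent with the trend the Figure 3 caption describes.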