Adversarial Robustness of VAEs across Intersectional Subgroups

Chethan Krishnamurthy Ramanaik, Arjun Roy, Eirini Ntoutsi

TL;DR

This study evaluates the robustness of VAEs against non-targeted adversarial attacks by optimizing minimal sample-specific perturbations to cause maximal damage across diverse demographic subgroups (combinations of age and gender).

Abstract

Despite advancements in Autoencoders (AEs) for tasks like dimensionality reduction, representation learning and data generation, they remain vulnerable to adversarial attacks. Variational Autoencoders (VAEs), with their probabilistic approach to disentangling latent spaces, show stronger resistance to such perturbations compared to deterministic AEs; however, their resilience against adversarial inputs is still a concern. This study evaluates the robustness of VAEs against non-targeted adversarial attacks by optimizing minimal sample-specific perturbations to cause maximal damage across diverse demographic subgroups (combinations of age and gender). We investigate two questions: whether there are robustness disparities among subgroups, and what factors contribute to these disparities, such as data scarcity and representation entanglement. Our findings reveal that robustness disparities exist but are not always correlated with the size of the subgroup. By using downstream gender and age classifiers and examining latent embeddings, we highlight the vulnerability of subgroups like older women, who are prone to misclassification due to adversarial perturbations pushing their representations toward those of other subgroups.
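
To make the attack setting concrete, the following is a minimal sketch of a non-targeted, sample-specific attack. It is not the authors' implementation. Several things here are assumptions not stated in the abstract: the encoder and decoder are toy linear maps (`W_enc`, `W_dec`) standing in for the mean path of a trained VAE, "damage" is measured as the squared L2 distance between the perturbed and clean reconstructions, and "minimal" is approximated by a fixed L-infinity budget `eps`. All function names are hypothetical.

```python
import random


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def reconstruct(x, W_enc, W_dec):
    """Toy stand-in for a VAE mean path: decode(encode(x)) with linear maps."""
    return matvec(W_dec, matvec(W_enc, x))


def adversarial_deviation(x, delta, W_enc, W_dec):
    """Squared L2 distance between reconstructions of x + delta and x (assumed metric)."""
    clean = reconstruct(x, W_enc, W_dec)
    pert = reconstruct([a + d for a, d in zip(x, delta)], W_enc, W_dec)
    return sum((p - c) ** 2 for p, c in zip(pert, clean))


def sign(g):
    return (g > 0) - (g < 0)


def untargeted_attack(x, W_enc, W_dec, eps=0.1, step=0.02, iters=50, seed=0):
    """Signed gradient ascent on the deviation, projected onto the L-inf ball of radius eps.

    Returns the best perturbation found and its deviation.
    """
    rng = random.Random(seed)
    # Random start: the gradient is exactly zero at delta = 0.
    delta = [rng.uniform(-eps, eps) for _ in x]
    best_delta = list(delta)
    best_dev = adversarial_deviation(x, delta, W_enc, W_dec)
    for _ in range(iters):
        # For the linear map A = W_dec @ W_enc the deviation is ||A delta||^2,
        # whose gradient is 2 A^T A delta. Only the sign is used, so the factor 2 is dropped.
        # A real VAE would need automatic differentiation here.
        a_delta = reconstruct(delta, W_enc, W_dec)
        grad = matvec(transpose(W_enc), matvec(transpose(W_dec), a_delta))
        # Ascent step, then clamp each coordinate back into [-eps, eps].
        delta = [max(-eps, min(eps, d + step * sign(g))) for d, g in zip(delta, grad)]
        dev = adversarial_deviation(x, delta, W_enc, W_dec)
        if dev > best_dev:
            best_delta, best_dev = list(delta), dev
    return best_delta, best_dev


# Illustrative values only; these weights and this sample are made up.
W_enc = [[0.5, -0.2, 0.1, 0.3], [0.1, 0.4, -0.3, 0.2]]
W_dec = [[0.6, 0.1], [-0.2, 0.5], [0.3, -0.4], [0.1, 0.2]]
x = [0.2, 0.5, 0.1, 0.9]
delta, dev = untargeted_attack(x, W_enc, W_dec, eps=0.1)
print("perturbation:", delta)
print("adversarial deviation:", dev)
```

Under these assumptions, running the attack once per sample and grouping the resulting deviations by age and gender subgroup would give the kind of per-subgroup robustness comparison the abstract describes, with lower deviation read as higher robustness.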

Paper Structure (24 sections, 6 equations, 9 figures, 2 tables)

Figures (9)

  • Figure 1: An overview of the approach
  • Figure 2: Adversarial deviation vs. original (unperturbed) reconstruction loss for individual subgroup instances (subgroup denoted by symbol) and different $\beta$-VAEs (denoted by color). The lower the adversarial deviation, the higher the robustness.
  • Figure 3: Distribution of adversarial deviations for Gender and Age along with the group cardinalities.
  • Figure 4: Distributions of adversarial robustness of vanilla VAE ($\beta=1$), and $\beta$-VAE on subgroups defined by age and gender.
  • Figure 5: Inputs and reconstructions for normal ($x$) and perturbed ($x+\delta$) samples with the highest adversarial deviation, drawn from the groups young men (columns 1 & 2), young women (columns 3 & 4), old men (columns 5 & 6), and old women (columns 7 & 8).
  • ...and 4 more figures