Robust VAEs via Generating Process of Noise Augmented Data

Hiroo Irobe; Wataru Aoki; Kimihiro Yamazaki; Yuhui Zhang; Takumi Nakagawa; Hiroki Waida; Yuichiro Wada; Takafumi Kanamori

TL;DR

This paper addresses the vulnerability of variational auto-encoders (VAEs) to adversarial inputs and shows that naive noise augmentation can fail to improve robustness. It introduces the Robust Augmented Variational Auto-ENcoder (RAVEN), a method that regularizes the latent-space divergence between original and noise-augmented data via a latent generating process and a novel variational lower bound. The proposed bound is closed-form and generalizes the standard VAE bound, enabling stable optimization and improved adversarial resilience on MNIST and Fashion-MNIST without sacrificing reconstruction quality. Overall, RAVEN offers a principled, efficient defense for VAEs with practical impact on robust representation learning.

Abstract

Advancing defensive mechanisms against adversarial attacks in generative models is a critical research topic in machine learning. Our study focuses on a specific type of generative model: Variational Auto-Encoders (VAEs). Contrary to common belief and existing literature, which suggest that injecting noise into training data makes models more robust, our preliminary experiments revealed that naive use of noise augmentation did not substantially improve VAE robustness. In fact, it even degraded the quality of learned representations, making VAEs more susceptible to adversarial perturbations. This paper introduces a novel framework that enhances robustness by regularizing the latent-space divergence between original and noise-augmented data. By incorporating a paired probabilistic prior into the standard variational lower bound, our method significantly boosts defense against adversarial attacks. Our empirical evaluations demonstrate that this approach, termed Robust Augmented Variational Auto-ENcoder (RAVEN), yields superior performance in resisting adversarial inputs on widely recognized benchmark datasets.
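To make "latent-space divergence between original and noise-augmented data" concrete, the following is a minimal NumPy sketch, not the paper's RAVEN objective. It assumes additive Gaussian noise (as in the Figure 1 caption, $\bm{x}_i = \tilde{\bm{x}}_i + \bm{\epsilon}_i$) and diagonal-Gaussian encoder posteriors, for which the KL divergence has a closed form. The names `add_gaussian_noise`, `kl_diag_gaussians`, and `toy_encoder`, the noise level, and the linear encoder are all hypothetical choices for illustration.

```python
import numpy as np


def add_gaussian_noise(x_clean, sigma, rng):
    """Return x = x_clean + eps with eps ~ N(0, sigma^2 I) (assumed noise model)."""
    eps = rng.normal(0.0, sigma, size=x_clean.shape)
    return x_clean + eps


def kl_diag_gaussians(mu_p, logvar_p, mu_q, logvar_q):
    """Closed-form KL(p || q) for diagonal Gaussians, one value per row."""
    var_p, var_q = np.exp(logvar_p), np.exp(logvar_q)
    per_dim = logvar_q - logvar_p + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    return 0.5 * np.sum(per_dim, axis=-1)


def toy_encoder(x, w_mu, w_logvar):
    """Hypothetical linear encoder returning the posterior mean and log-variance."""
    return x @ w_mu, x @ w_logvar


rng = np.random.default_rng(0)
x_clean = rng.uniform(0.0, 1.0, size=(4, 784))      # stand-in for flattened images
x_noisy = add_gaussian_noise(x_clean, sigma=0.1, rng=rng)

w_mu = rng.normal(0.0, 0.01, size=(784, 8))
w_logvar = rng.normal(0.0, 0.01, size=(784, 8))

mu_c, lv_c = toy_encoder(x_clean, w_mu, w_logvar)   # posterior for original data
mu_n, lv_n = toy_encoder(x_noisy, w_mu, w_logvar)   # posterior for noisy data

# Average divergence between the paired posteriors; a regularizer in this spirit
# would penalize large values so that noisy and clean inputs map close together.
penalty = kl_diag_gaussians(mu_n, lv_n, mu_c, lv_c).mean()
print(f"mean latent KL between noisy and clean posteriors: {penalty:.6f}")
```

In this sketch the penalty is zero only when the two posteriors coincide. How RAVEN actually combines such a term with the reconstruction loss is determined by its paired prior and variational bound, which this example does not reproduce.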

Paper Structure (22 sections, 9 theorems, 36 equations, 2 figures, 2 tables)

Introduction
Related Work
Variational Auto-Encoder
Existing Methods for Building Robust VAE
Smooth Encoder
Proposed Method
Proposed Variational Lower-Bound
Proof Sketch for the Theorem on the Proposed Variational Bound
Numerical Experiments
Experiment Setup
Adversarial attacks
Experiment protocols
Results and Discussions
Conclusion and Future Work
Lemmas and Propositions
...and 7 more sections

Key Result

Proposition 1

The prior $p(\bar{\bm{z}})$ in the definition of the latent generating process has the form [equation missing in source], where the first and second terms are defined by the equation for the multivariate normal distribution.

Figures (2)

Figure 1: t-SNE visualization of latent variables by trained VAE on test MNIST: (A) trained on original MNIST training data $\{\tilde{\bm{x}}_i\}$, (B) trained on both original and noisy data $\{\tilde{\bm{x}}_i\} \cup \{\bm{x}_i\}$, where $\bm{x}_i = \tilde{\bm{x}}_i + \bm{\epsilon}_i$. Colors denote class labels on the top-right.
Figure 2: Attack budget $\delta$ (horizontal axis) versus classification accuracy (vertical axis) on MNIST (a, b) and Fashion-MNIST (c, d) datasets. Our proposed method (blue) significantly outperforms baseline methods under severe adversarial attacks, and is on par with them as the adversarial signal vanishes. Shaded areas indicate standard deviations over 5 runs.

Theorems & Definitions (18)

Definition 1
Proposition 1
Theorem 1
Remark 1
Remark 2
Proposition 2
Lemma 1
Lemma 2
Lemma 3
proof
...and 8 more
