Support is All You Need for Certified VAE Training
Changming Xu, Debangshu Banerjee, Deepak Vasisht, Gagandeep Singh
TL;DR
CIVET introduces a principled framework for certifiably robust training of Variational Autoencoders. It reduces the problem of bounding the worst-case loss over a distributional latent space to that of bounding a deterministic decoder over a support set in the latent space. The method selects a latent subset from the encoder outputs that captures the bulk of the probability mass, as specified by a target probability, and then computes a differentiable upper bound on the decoder’s worst-case loss over that subset. This enables gradient-based optimization without imposing Lipschitz or fixed-variance constraints. By combining bounds over multiple such supports with a probabilistic weighting scheme, CIVET improves certified robustness across wireless and vision tasks while maintaining competitive standard accuracy, outperforming Lipschitz-constrained VAEs and several other baselines. The approach opens a path toward certified robustness for other stochastic neural networks and more complex generative models, with practical implications for deployment in safety-critical systems.
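The support-set idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a diagonal Gaussian latent, a box-shaped support set, a decoder consisting of a single affine layer, and plain interval arithmetic as the bounding technique. The function names `support_box`, `affine_bounds`, and `worst_case_sq_error` are hypothetical, and the encoder output ranges are taken as given (for example, from bounding the encoder over an input perturbation region).

```python
from statistics import NormalDist


def support_box(mu_lo, mu_hi, sigma_hi, delta):
    """Latent box holding at least 1 - delta of the mass of N(mu, diag(sigma^2))
    for every mu in [mu_lo, mu_hi] and every sigma <= sigma_hi (per dimension).
    Assumption: independent Gaussian latent dimensions."""
    d = len(mu_lo)
    per_dim = (1.0 - delta) ** (1.0 / d)  # mass required in each dimension
    k = NormalDist().inv_cdf((1.0 + per_dim) / 2.0)  # two-sided Gaussian quantile
    lo = [m - k * s for m, s in zip(mu_lo, sigma_hi)]
    hi = [m + k * s for m, s in zip(mu_hi, sigma_hi)]
    return lo, hi


def affine_bounds(W, b, lo, hi):
    """Interval bounds on W z + b for all z in the box [lo, hi]."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        # A positive weight attains its minimum at lo, a negative weight at hi.
        out_lo.append(bias + sum(w * (l if w >= 0 else h)
                                 for w, l, h in zip(row, lo, hi)))
        out_hi.append(bias + sum(w * (h if w >= 0 else l)
                                 for w, l, h in zip(row, lo, hi)))
    return out_lo, out_hi


def worst_case_sq_error(out_lo, out_hi, target):
    """Upper bound on the squared error over the output box.
    The error is convex in each output, so its maximum is at an endpoint."""
    return sum(max((l - t) ** 2, (h - t) ** 2)
               for l, h, t in zip(out_lo, out_hi, target))


# Hypothetical encoder output ranges for one perturbed input (illustrative values).
z_lo, z_hi = support_box(mu_lo=[0.0, 1.0], mu_hi=[0.1, 1.1],
                         sigma_hi=[0.5, 0.2], delta=0.05)

# Hypothetical single-layer decoder weights and reconstruction target.
W = [[1.0, -1.0], [0.5, 2.0]]
b = [0.0, 0.1]
y_lo, y_hi = affine_bounds(W, b, z_lo, z_hi)
bound = worst_case_sq_error(y_lo, y_hi, target=[-1.0, 2.2])
print(bound)
```

Under these assumptions, the decoder's squared error is at most `bound` with probability at least 0.95 over the latent sample. In a training setting, the same operations would be written in an autodiff framework so that the bound could be minimized by gradient descent.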
Abstract
Variational Autoencoders (VAEs) have become increasingly popular and are now deployed in safety-critical applications. In such applications, we want to give certified probabilistic guarantees on performance under adversarial attacks. We propose a novel method, CIVET, for certified training of VAEs. CIVET rests on the key insight that we can bound the worst-case VAE error by bounding the error on carefully chosen support sets at the latent layer. We prove this result mathematically and present a novel training algorithm that exploits it. In an extensive evaluation across different datasets (in both the wireless and vision application areas), architectures, and perturbation magnitudes, we show that our method outperforms state-of-the-art (SOTA) methods, achieving good standard performance with strong robustness guarantees.
