Statistical Guarantees of Group-Invariant GANs
Ziyu Chen, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu
TL;DR
This work provides the first statistical guarantees for group-invariant GANs, in which $\Sigma$-invariant generators and discriminators are obtained by symmetrization over the group $\Sigma$. It proves that learning a $\Sigma$-invariant target distribution $\mu$ on a compact domain in $\mathbb{R}^d$ achieves an expected Wasserstein-1 error of order $\left(|\Sigma|n\right)^{-1/d}$ from $n$ samples, and of order $\left(|\Sigma|n\right)^{-1/d^*}$ when $\mu$ is supported on a $d^*$-dimensional manifold, as if $|\Sigma|n$ i.i.d. samples were available. The analysis decomposes the error into an invariant-discriminator approximation term, an invariant-generator approximation term, and a statistical term, and shows that symmetry reduces both the sample complexity and the lower bound on the discriminator approximation error, a gain that data augmentation alone cannot deliver. Numerical experiments on $C_4$-symmetric Gaussian mixtures corroborate the theory: invariant GANs outperform both non-invariant models and models trained with augmentation alone. The results suggest extensions to continuous groups and to other symmetry-preserving generative models, with practical relevance for data-efficient learning in symmetry-rich domains.
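To make the scale of the gain concrete, here is a small worked instance of the stated rate. It ignores constants and any logarithmic factors, and it takes $n^{-1/d}$ as the comparison rate without symmetry; both are assumptions made for illustration. For the cyclic group $C_4$ we have $|\Sigma| = 4$, and in dimension $d = 2$:

$$\left(|\Sigma|\,n\right)^{-1/d} = (4n)^{-1/2} = \tfrac{1}{2}\,n^{-1/2}.$$

Under these assumptions the bound is half of the $n^{-1/2}$ value at the same $n$; equivalently, matching it without symmetry would take $4n$ samples. In general the error ratio is $|\Sigma|^{-1/d}$, so the equivalent sample count always grows by the factor $|\Sigma|$, while the improvement in the error itself is smaller in higher dimension.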
Abstract
This work presents the first statistical performance guarantees for group-invariant generative models. Many real-world data, such as images and molecules, are invariant to certain group symmetries, and this invariance can be exploited to learn more efficiently, as we rigorously demonstrate in this work. We specifically study generative adversarial networks (GANs) and quantify the gains obtained by incorporating symmetries into the model. Group-invariant GANs are a type of GAN in which the generators and discriminators are hardwired with group symmetries. Empirical studies have shown that these networks can learn group-invariant distributions with significantly improved data efficiency. We rigorously quantify this improvement by analyzing the reduction in sample complexity and in the discriminator approximation error for group-invariant GANs. Our findings indicate that, when learning group-invariant distributions, the number of samples required by group-invariant GANs decreases by a factor of the group size, and the discriminator approximation error has a reduced lower bound. Importantly, the overall error reduction cannot be achieved merely through data augmentation on the training data. Numerical results substantiate our theory and highlight the stark contrast between learning with group-invariant GANs and using data augmentation. This work also sheds light on the study of other generative models with group symmetries, such as score-based generative models.
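As a rough illustration of what "hardwiring" a symmetry into a discriminator can mean, the sketch below averages an arbitrary function over the four rotations of $C_4$ acting on $\mathbb{R}^2$, which makes the result invariant under the group. This is a generic group-averaging construction, not necessarily the architecture used in the paper; the names `C4`, `symmetrize`, and `toy_discriminator` are hypothetical and introduced only for this example.

```python
import numpy as np

# The cyclic group C_4 acting on R^2: rotations by 0, 90, 180, 270 degrees.
R90 = np.array([[0.0, -1.0], [1.0, 0.0]])
C4 = [np.linalg.matrix_power(R90, k) for k in range(4)]


def symmetrize(f, group):
    """Return x -> (1/|G|) * sum_{g in G} f(g x), which is G-invariant."""
    def f_inv(x):
        # x has shape (n, 2); x @ g.T applies g to every row of x.
        return np.mean([f(x @ g.T) for g in group], axis=0)
    return f_inv


def toy_discriminator(x):
    # Hypothetical stand-in for a discriminator; it is NOT rotation invariant.
    return np.tanh(x[:, 0] + 0.5 * x[:, 1] ** 2)


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))      # five sample points in R^2
gx = x @ C4[1].T                 # the same points rotated by 90 degrees

d_inv = symmetrize(toy_discriminator, C4)

# The averaged function gives the same value on x and on its rotated copy.
print(np.allclose(d_inv(x), d_inv(gx)))
```

The averaging works because composing every group element with a fixed rotation only permutes the terms of the sum. For a finite group this costs $|\Sigma|$ evaluations of the underlying function per input.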
