BI-DCGAN: A Theoretically Grounded Bayesian Framework for Efficient and Diverse GANs
Mahsa Valizadeh, Rui Tuo, James Caverlee
TL;DR
This work tackles mode collapse and the lack of uncertainty modeling in GANs by introducing BI-DCGAN, a Bayesian extension of DCGAN that learns a distribution over weights using Bayes by Backprop and mean-field variational inference (MFVI). It provides a covariance-based theoretical proof that weight uncertainty expands sample diversity, and validates this claim with experiments on MNIST, CIFAR-10, Fashion-MNIST, and SVHN, showing more diverse outputs while preserving training efficiency. Empirically, the gain in diversity appears as larger covariance eigenvalues, and generalization improves when the synthetic samples are used for data augmentation. The approach offers a scalable, uncertainty-aware alternative to diffusion models, suited to resource-constrained applications that require both diversity and reliability in generated samples.
Abstract
Generative Adversarial Networks (GANs) are proficient at generating synthetic data but continue to suffer from mode collapse, where the generator produces a narrow range of outputs that fool the discriminator but fail to capture the full data distribution. This limitation is particularly problematic because generative models are increasingly deployed in real-world applications that demand both diversity and uncertainty awareness. In response, we introduce BI-DCGAN, a Bayesian extension of DCGAN that incorporates model uncertainty into the generative process while maintaining computational efficiency. BI-DCGAN integrates Bayes by Backprop to learn a distribution over network weights and employs mean-field variational inference to efficiently approximate the posterior distribution during GAN training. We establish the first theoretical proof, based on covariance matrix analysis, that Bayesian modeling enhances sample diversity in GANs. We validate this theoretical result through extensive experiments on standard generative benchmarks, demonstrating that BI-DCGAN produces more diverse and robust outputs than conventional DCGANs while maintaining training efficiency. These findings position BI-DCGAN as a scalable and timely solution for applications where both diversity and uncertainty are critical, and where modern alternatives like diffusion models remain too resource-intensive.
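To make the link between weight uncertainty and sample diversity concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy single-layer linear "generator" in place of DCGAN, a mean-field Gaussian over each weight parameterized as in Bayes by Backprop (mean `mu`, standard deviation `softplus(rho)`), and fixed rather than learned variational parameters. All names (`sample_weights`, `total_variance`, `compare`) are hypothetical. It compares the trace of the output covariance, which equals the sum of the covariance eigenvalues, with and without weight sampling.

```python
import math
import random


def softplus(x):
    """Map an unconstrained rho to a positive standard deviation."""
    return math.log1p(math.exp(x))


def sample_weights(mu, rho, rng):
    """Reparameterized draw w = mu + softplus(rho) * eps, with eps ~ N(0, 1).

    Mean-field assumption: every weight has its own independent Gaussian.
    """
    return [
        [m + softplus(r) * rng.gauss(0.0, 1.0) for m, r in zip(mu_row, rho_row)]
        for mu_row, rho_row in zip(mu, rho)
    ]


def linear(weights, z):
    """Toy generator: a single linear map from latent z to an output vector."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in weights]


def total_variance(samples):
    """Trace of the sample covariance, i.e. the sum of its eigenvalues."""
    n = len(samples)
    dim = len(samples[0])
    total = 0.0
    for j in range(dim):
        column = [s[j] for s in samples]
        mean = sum(column) / n
        total += sum((c - mean) ** 2 for c in column) / (n - 1)
    return total


def compare(n_samples=5000, seed=0):
    """Return (deterministic, Bayesian) total output variance on shared latents."""
    rng = random.Random(seed)
    latent_dim, out_dim = 4, 3
    # Assumed variational parameters; in training these would be learned.
    mu = [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)] for _ in range(out_dim)]
    rho = [[-1.0] * latent_dim for _ in range(out_dim)]
    deterministic, bayesian = [], []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        deterministic.append(linear(mu, z))  # weights fixed at their means
        bayesian.append(linear(sample_weights(mu, rho, rng), z))  # fresh draw
    return total_variance(deterministic), total_variance(bayesian)


if __name__ == "__main__":
    det, bayes = compare()
    print(f"deterministic total variance: {det:.3f}")
    print(f"bayesian total variance:      {bayes:.3f}")
```

In this toy setting the sampled weights add a zero-mean perturbation that is independent of the latent code, so the output covariance gains a positive semi-definite term. The paper's proof concerns the full BI-DCGAN generator; this sketch only illustrates the direction of the effect under the stated assumptions.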
