Multi-Agent Diverse Generative Adversarial Networks
Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri, Philip H. S. Torr, Puneet K. Dokania
TL;DR
MAD-GAN introduces a multi-generator, single-discriminator framework to mitigate mode collapse by making the discriminator identify which generator produced each fake sample, effectively enforcing diversity. The approach yields a mixture-model interpretation with a proven optimality condition and demonstrates superior mode coverage and sample diversity across synthetic benchmarks, image-to-image translation, diverse-class data, and unsupervised representation learning. Extensions such as MAD-GAN-Sim further promote diversity via similarity-based objectives, broadening applicability to high-dimensional generation tasks. Overall, the work provides both theoretical insights and strong empirical evidence for using multiple generators with a generator-identification discriminator to capture rich, multimodal data distributions.
Abstract
We propose MAD-GAN, an intuitive generalization of Generative Adversarial Networks (GANs) and their conditional variants that addresses the well-known problem of mode collapse. First, MAD-GAN is a multi-agent GAN architecture incorporating multiple generators and one discriminator. Second, to enforce that different generators capture diverse high-probability modes, the discriminator of MAD-GAN is designed such that, in addition to distinguishing real from fake samples, it must also identify the generator that produced a given fake sample. Intuitively, to succeed in this task, the discriminator must learn to push different generators towards different identifiable modes. We perform extensive experiments on synthetic and real datasets and compare MAD-GAN with different variants of GAN. We show high-quality, diverse sample generations for challenging tasks such as image-to-image translation and face generation. In addition, we show that MAD-GAN is able to disentangle different modalities when trained on a highly challenging diverse-class dataset (e.g., a dataset with images of forests, icebergs, and bedrooms). Finally, we show its efficacy on the unsupervised feature representation task. In the Appendix, we introduce a similarity-based competing objective (MAD-GAN-Sim), which encourages different generators to generate diverse samples based on a user-defined similarity metric. We show its performance on image-to-image translation and its effectiveness on the unsupervised feature representation task.
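The generator-identification idea described above can be illustrated with a minimal sketch. It assumes one plausible realization that the abstract itself does not spell out: with k generators, the discriminator emits k+1 softmax scores, where indices 0..k-1 stand for "fake, produced by generator i" and index k stands for "real". The function names, the label layout, and the clamping constant are all hypothetical choices made for illustration, and the per-sample losses below stand in for what would be batch averages in an actual training loop.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def discriminator_loss(logits, label):
    """Cross-entropy for a single sample.

    Assumed label layout (illustrative): 0..k-1 = index of the generator
    that produced the fake sample, k = real sample.
    """
    probs = softmax(logits)
    return -math.log(probs[label])

def generator_loss(logits, eps=1e-12):
    """Loss for a generator's own sample: log(1 - p_real).

    Minimizing this pushes the discriminator's 'real' score up.
    eps is a hypothetical clamp that avoids log(0).
    """
    p_real = softmax(logits)[-1]
    return math.log(max(1.0 - p_real, eps))

# Example with k = 3 generators, so the discriminator has 4 outputs.
logits_fake_from_g1 = [0.0, 5.0, 0.0, 0.0]
print(discriminator_loss(logits_fake_from_g1, label=1))  # small: correct generator identified
print(discriminator_loss(logits_fake_from_g1, label=0))  # large: wrong generator
```

Under this assumed setup, the discriminator is rewarded only when it can tell the generators apart, which is easiest when they produce samples from different regions of the data distribution; each generator, in turn, is trained only to look real, not to hide its identity.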
