AdaGAN: Boosting Generative Models
Ilya Tolstikhin, Sylvain Gelly, Olivier Bousquet, Carl-Johann Simon-Gabriel, Bernhard Schölkopf
TL;DR
AdaGAN introduces a boosting-style meta-algorithm that builds a strong mixture of generative models by iteratively reweighting data to focus on uncovered regions, addressing the missing modes problem in GANs. The paper develops a general f-divergence framework for additive mixtures, derives optimal update components, and proves exponential or finite convergence under various conditions, even with approximate learners. The approach is instantiated for GANs, including practical methods to compute the necessary weighting factors, and is validated with toy and MNIST experiments showing improved mode coverage and reduced variance compared to baselines. Overall, AdaGAN offers a theoretically grounded, practically effective strategy to construct diverse generative models via successive reweighted training of components.
Abstract
Generative Adversarial Networks (GANs) (Goodfellow et al., 2014) are an effective method for training generative models of complex data such as natural images. However, they are notoriously hard to train and can suffer from the problem of missing modes, where the model is not able to produce examples in certain regions of the space. We propose an iterative procedure, called AdaGAN, where at every step we add a new component into a mixture model by running a GAN algorithm on a reweighted sample. This is inspired by boosting algorithms, where many potentially weak individual predictors are greedily aggregated to form a strong composite predictor. We prove that such an incremental procedure leads to convergence to the true distribution in a finite number of steps if each step is optimal, and convergence at an exponential rate otherwise. We also illustrate experimentally that this procedure addresses the problem of missing modes.
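The incremental procedure described above can be illustrated with a small toy sketch. This is not the paper's algorithm: the GAN is replaced by a hypothetical mode-collapsing learner (`fit_component`, which places a fixed-width Gaussian at the weighted median of the data), the mixture weight `beta` is held constant, and the reweighting rule (weights inversely proportional to the current mixture density) is a simple heuristic stand-in for the weighting factors the paper derives. All function names, the constant `FIXED_STD`, and the toy data are assumptions made for illustration.

```python
import numpy as np

FIXED_STD = 1.0  # assumed width of every toy component (illustrative choice)


def two_mode_sample(seed=0):
    """Toy 1-D data: 700 points near -5 and 300 points near +5."""
    rng = np.random.default_rng(seed)
    return np.concatenate([rng.normal(-5.0, 0.3, 700), rng.normal(5.0, 0.3, 300)])


def fit_component(x, w):
    """Stand-in for a GAN that collapses to one mode: return the weighted median."""
    order = np.argsort(x)
    cum = np.cumsum(w[order])
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return x[order][idx]


def mixture_density(x, means, alphas):
    """Density of the current Gaussian mixture evaluated at each point of x."""
    dens = np.zeros_like(x)
    norm = FIXED_STD * np.sqrt(2.0 * np.pi)
    for m, a in zip(means, alphas):
        dens += a * np.exp(-0.5 * ((x - m) / FIXED_STD) ** 2) / norm
    return dens


def boosted_mixture(x, steps=2, beta=0.5):
    """Greedily add components, each trained on a reweighted sample."""
    w = np.full(len(x), 1.0 / len(x))  # start from uniform weights
    means, alphas = [], []
    for t in range(steps):
        new_mean = fit_component(x, w)
        if t == 0:
            alphas = [1.0]
        else:
            # Additive mixture update: shrink old weights, give beta to the new one.
            alphas = [(1.0 - beta) * a for a in alphas] + [beta]
        means.append(new_mean)
        # Heuristic reweighting (an assumption, not the paper's formula):
        # upweight points that the current mixture covers poorly.
        dens = mixture_density(x, means, alphas)
        w = 1.0 / (dens + 1e-12)
        w /= w.sum()
    return means, alphas


x = two_mode_sample(seed=0)
means, alphas = boosted_mixture(x, steps=2, beta=0.5)
print("component means:", means)
print("mixture weights:", alphas)
```

In this sketch the first component settles on the larger mode, the reweighting then shifts almost all weight to the uncovered points, and the second component is placed on the mode that was missed. A real instantiation would train a GAN at each step and use the paper's weighting scheme in place of the heuristic.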
