UCD: Unconditional Discriminator Promotes Nash Equilibrium in GANs
Mengfei Xia, Nan Xue, Jiapeng Zhu, Yujun Shen
TL;DR
The paper addresses instability and mode collapse in GANs through the lens of Nash equilibrium, showing that feeding the condition into the discriminator introduces shortcuts that impair its learning. It proposes an Unconditional Discriminator (UCD) in Config B and enhances robustness with a DINO-inspired loss in Config C. The theory guarantees that convergence yields $p_g(\mathbf x|c)=q(\mathbf x|c)$, and the method serves as a practical plug-in for improved synthesis. Empirically, UCD achieves substantial gains on ImageNet-64, including an FID of 1.47 that surpasses StyleGAN-XL, with improved precision and recall and minimal computational overhead. The approach offers a new perspective on discriminator design for stabilizing adversarial training and strengthening one-step generation, with potential extensions to text-conditioned and diffusion-based distillation settings.
Abstract
Adversarial training has proven key to one-step generation, particularly for Generative Adversarial Networks (GANs) and diffusion model distillation. In practice, however, GAN training rarely converges properly and suffers from mode collapse. In this work, we quantitatively analyze how closely GAN training approaches Nash equilibrium, and conclude that the redundant shortcuts created by feeding the condition into $D$ prevent it from extracting meaningful knowledge. We therefore propose an unconditional discriminator (UCD), in which $D$ receives no condition injection and is thus forced to extract more comprehensive and robust features. In this way, $D$ can leverage better knowledge to supervise $G$, which promotes Nash equilibrium in GAN training. A theoretical guarantee of compatibility with vanilla GAN theory indicates that UCD can be implemented in a plug-in manner. Extensive experiments confirm significant performance improvements with high efficiency. For instance, we achieve 1.47 FID on the ImageNet-64 dataset, surpassing StyleGAN-XL and several state-of-the-art one-step diffusion models. The code will be made publicly available.
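For readers unfamiliar with the "vanilla GAN theory" invoked above, the following is a minimal numerical sketch of the standard result it presumably refers to: for a fixed generator, the optimal discriminator is $D^*(\mathbf x) = q(\mathbf x) / (q(\mathbf x) + p_g(\mathbf x))$, and the value function at $D^*$ attains its minimum $-\log 4$ exactly when $p_g = q$. This is not the UCD method itself and does not model the condition $c$; the three-outcome distributions and the function names `optimal_discriminator` and `value_function` are hypothetical, chosen only for illustration.

```python
import math


def optimal_discriminator(q, p_g):
    """Return D*(x) = q(x) / (q(x) + p_g(x)) for each outcome x.

    Outcomes with zero mass under both distributions get 0.5 (arbitrary,
    since they contribute nothing to the value function).
    """
    return [qx / (qx + px) if (qx + px) > 0 else 0.5 for qx, px in zip(q, p_g)]


def value_function(q, p_g, d):
    """V(D, G) = E_q[log D(x)] + E_{p_g}[log(1 - D(x))] on a finite support."""
    total = 0.0
    for qx, px, dx in zip(q, p_g, d):
        if qx > 0:
            total += qx * math.log(dx)
        if px > 0:
            total += px * math.log(1.0 - dx)
    return total


# Hypothetical toy distributions over three outcomes (illustrative only).
q = [0.5, 0.3, 0.2]          # data distribution
collapsed = [0.9, 0.1, 0.0]  # generator that has dropped the third mode
matched = list(q)            # generator that matches the data exactly

v_collapsed = value_function(q, collapsed, optimal_discriminator(q, collapsed))
v_matched = value_function(q, matched, optimal_discriminator(q, matched))

# At equilibrium D* is 0.5 everywhere, so the value equals -log 4.
print("V at p_g = q:      ", v_matched)
print("V with mode dropped:", v_collapsed)
```

In this toy setting the mode-dropping generator leaves the value strictly above $-\log 4$, so an optimal discriminator can still tell the two distributions apart; only the matched generator reaches the equilibrium value.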
