SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer
Yuhta Takida, Masaaki Imaizumi, Takashi Shibuya, Chieh-Hsin Lai, Toshimitsu Uesaka, Naoki Murata, Yuki Mitsufuji
TL;DR
This work asks whether GAN optimization gives the generator gradients that actually reduce the distance to the target distribution. It develops a theoretical framework, built on the Functional Mean Divergence ($\mathrm{FM}^*$) and the max-Augmented Sliced Wasserstein distance (max-ASW), to characterize when a discriminator can metrize the distance between distributions, even without assuming an optimal discriminator. The authors derive metrizable conditions (direction optimality, separability, and injectivity) and introduce the Slicing Adversarial Network (SAN), which enforces these conditions via two minimal modifications to the discriminator and a new maximization objective. Empirically, SAN improves mode coverage and sample quality on synthetic mixture-of-Gaussians (MoG) and image-generation tasks; StyleSAN-XL achieves a state-of-the-art FID among GANs for class-conditional generation on ImageNet 256×256. The approach is simple to apply to a broad class of GANs, and the code is publicly available.
Abstract
Generative adversarial networks (GANs) learn a target probability distribution by optimizing a generator and a discriminator with minimax objectives. This paper addresses the question of whether such optimization actually provides the generator with gradients that make its distribution close to the target distribution. We derive metrizable conditions, that is, sufficient conditions for the discriminator to serve as the distance between the distributions, by connecting the GAN formulation with the concept of sliced optimal transport. Furthermore, by leveraging these theoretical results, we propose a novel GAN training scheme called slicing adversarial network (SAN). With only simple modifications, a broad class of existing GANs can be converted to SANs. Experiments on synthetic and image datasets support our theoretical results and the effectiveness of SAN compared with conventional GANs. We also apply SAN to StyleGAN-XL, which leads to a state-of-the-art FID score among GANs for class-conditional generation on ImageNet 256$\times$256. Our implementation is available at https://ytakida.github.io/san.
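
As a rough illustration of the sliced optimal transport idea that the abstract refers to, the sketch below projects two 2-D sample sets onto a unit-norm direction and compares the resulting 1-D samples with the 1-Wasserstein distance. It is not the SAN training procedure or the authors' code: the function names (`unit`, `project`, `wasserstein_1d`, `max_sliced_w1`) and the toy Gaussian data are assumptions made for this example, and taking the maximum over a small, finite set of candidate directions is a simplification of maximizing over all unit directions.

```python
import math
import random


def unit(v):
    """Scale a vector to unit Euclidean norm (a normalized slicing direction)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def project(points, direction):
    """Project each point onto the direction, giving 1-D samples."""
    return [sum(p_i * d_i for p_i, d_i in zip(p, direction)) for p in points]


def wasserstein_1d(a, b):
    """1-Wasserstein distance between two equal-size 1-D empirical samples."""
    assert len(a) == len(b)
    # In 1-D the optimal coupling matches sorted samples in order.
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def max_sliced_w1(xs, ys, directions):
    """Largest 1-D distance over a finite set of candidate directions."""
    return max(
        wasserstein_1d(project(xs, unit(d)), project(ys, unit(d)))
        for d in directions
    )


# Toy data (hypothetical): the "target" is shifted by 3 along the first axis.
rng = random.Random(0)
target = [(rng.gauss(3.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(500)]
generated = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(500)]

along_shift = wasserstein_1d(
    project(target, unit([1.0, 0.0])), project(generated, unit([1.0, 0.0]))
)
across_shift = wasserstein_1d(
    project(target, unit([0.0, 1.0])), project(generated, unit([0.0, 1.0]))
)
print(along_shift, across_shift)
```

In this toy setting, the direction aligned with the shift separates the two sample sets far more than the orthogonal one, which gives some intuition for why the choice of slicing direction matters when a projected quantity is meant to act as a distance between distributions.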
