Taming the Tail in Class-Conditional GANs: Knowledge Sharing via Unconditional Training at Lower Resolutions
Saeed Khorram, Mingqi Jiang, Mohamad Shahbazi, Mohamad H. Danesh, Li Fuxin
TL;DR
The paper tackles the challenge of generating high-quality images for tail classes in long-tailed, multi-class data with class-conditional GANs. It introduces UTLO, a two-path generative framework where the generator’s lower-resolution path is trained unconditionally to learn class-agnostic features, while higher-resolution layers remain class-conditioned to synthesize detailed, tail-specific outputs. Modifications to both generator and discriminator enable end-to-end training with a combined objective that balances conditional and unconditional signals, and the authors propose tailored evaluation metrics (FID-FS/KID-FS) for tail performance. Across six long-tail benchmarks and multiple GAN architectures, UTLO yields consistent gains in both fidelity and diversity for tail classes, addressing mode collapse and reducing reliance on early stopping, with results suggesting the approach is broadly applicable to other GAN designs.
Abstract
Despite extensive research on training generative adversarial networks (GANs) with limited training data, learning to generate images from long-tailed training distributions remains fairly unexplored. In the presence of imbalanced multi-class training data, GANs tend to favor classes with more samples, leading to the generation of low-quality and less diverse samples in tail classes. In this study, we aim to improve the training of class-conditional GANs with long-tailed data. We propose a straightforward yet effective method for knowledge sharing, allowing tail classes to borrow rich information from classes with more abundant training data. More concretely, we propose modifications to existing class-conditional GAN architectures to ensure that the lower-resolution layers of the generator are trained entirely unconditionally while reserving class-conditional generation for the higher-resolution layers. Experiments on several long-tail benchmarks and GAN architectures demonstrate a significant improvement over existing methods in both the diversity and fidelity of the generated images. The code is available at https://github.com/khorrams/utlo.
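
The split between unconditional low-resolution layers and class-conditional high-resolution layers can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`low_res_path`, `high_res_path`, `generate`, `combined_loss`), the scale-and-shift conditioning, and the weighting parameter `lam` are all hypothetical, chosen only to show the data flow under the assumption that the two training signals are summed with a scalar weight.

```python
# Toy illustration only: plain lists stand in for feature maps, and all
# names and the conditioning scheme are assumptions, not the paper's code.

def low_res_path(z, weights):
    """Unconditional stage: uses the latent z only, never the class label."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in weights]

def high_res_path(features, class_embedding, upsample=2):
    """Conditional stage: upsample shared features, then modulate by class."""
    upsampled = [f for f in features for _ in range(upsample)]
    scale, shift = class_embedding  # hypothetical per-class scale and shift
    return [scale * f + shift for f in upsampled]

def generate(z, label, weights, class_table):
    """Return both the low-res (class-agnostic) and high-res (class-specific) outputs."""
    shared = low_res_path(z, weights)
    image = high_res_path(shared, class_table[label])
    return shared, image

def combined_loss(loss_cond, loss_uncond, lam=1.0):
    """Assumed form of a combined objective: conditional term plus a weighted unconditional term."""
    return loss_cond + lam * loss_uncond

z = [1.0, -0.5, 0.25]
weights = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
class_table = {0: (1.0, 0.0), 1: (0.5, 2.0)}  # e.g. a head class and a tail class

shared_head, image_head = generate(z, 0, weights, class_table)
shared_tail, image_tail = generate(z, 1, weights, class_table)
```

For the same latent `z`, the low-resolution output is identical across labels, so whatever the low-resolution layers learn is shared by head and tail classes alike; only the high-resolution stage differs by class.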
