HingeRLC-GAN: Combating Mode Collapse with Hinge Loss and RLC Regularization

Osman Goni, Himadri Saha Arka, Mithun Halder, Mir Moynuddin Ahmed Shibly, Swakkhar Shatabda

TL;DR

The study targets mode collapse in GANs, especially on small datasets, by combining a ResNet-based architecture with a new pairing of loss and regularizer. The proposed HingeRLC-GAN merges Hinge Loss with Regularized Loss Control (RLC) to stabilize training and encourage coverage of additional data modes, supported by a theoretical framework that links gradient regularization to improved mode diversity. Empirically, the method achieves state-of-the-art Fréchet Inception Distance (FID) and Kernel Inception Distance (KID) on CIFAR, demonstrates a ~30% gain in mode capture over a strong baseline, and exhibits stable training dynamics with meaningful multi-class samples. This approach offers practical improvements for generating diverse, high-fidelity images, particularly in data-scarce regimes, and could be extended to other vision tasks.

Abstract

Recent advances in Generative Adversarial Networks (GANs) have demonstrated their capability for producing high-quality images. However, mode collapse remains a significant challenge: the generator produces only a limited number of data patterns, which do not reflect the diversity of the training dataset. This study addresses this issue by proposing a number of architectural changes aimed at increasing the diversity and stability of GAN models. We start by improving the loss function with Wasserstein loss and Gradient Penalty to better capture the full range of data variations. We also investigate various network architectures and conclude that ResNet significantly contributes to increased diversity. Building on these findings, we introduce HingeRLC-GAN, a novel approach that combines RLC Regularization and the Hinge loss function. With an FID score of 18 and a KID score of 0.001, our approach outperforms existing methods by effectively balancing training stability and increased diversity.
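
For readers unfamiliar with the hinge loss mentioned in the abstract, the following is a minimal sketch of the standard hinge formulation commonly used for GANs, not the authors' implementation. It assumes the discriminator outputs unbounded real-valued scores; the function names are hypothetical, and the RLC regularization term is omitted because this excerpt does not define it.

```python
# Standard GAN hinge loss on plain Python lists of discriminator scores.
# Assumption: scores are raw, unbounded outputs (no sigmoid applied).

def discriminator_hinge_loss(real_scores, fake_scores):
    """Penalize real scores below +1 and fake scores above -1."""
    real_term = sum(max(0.0, 1.0 - s) for s in real_scores) / len(real_scores)
    fake_term = sum(max(0.0, 1.0 + s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_hinge_loss(fake_scores):
    """The generator tries to raise the discriminator's score on its samples."""
    return -sum(fake_scores) / len(fake_scores)


if __name__ == "__main__":
    real = [2.0, 0.5]   # hypothetical scores on real images
    fake = [-2.0, 0.0]  # hypothetical scores on generated images
    print(discriminator_hinge_loss(real, fake))
    print(generator_hinge_loss(fake))
```

Samples that the discriminator already scores beyond the margin (real above +1, fake below -1) contribute zero to its loss, which is one reason the hinge form is often considered to give steadier training signals than an unbounded objective.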

Paper Structure

This paper contains 20 sections, 18 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Mode Coverage: DROPOUT-GAN (left), HingeRLC-GAN (right). The proposed method performs up to 30% better in mode capture.
  • Figure 2: Artifact produced by 'Conv2dTranspose' layers with checkerboard patterns.
  • Figure 3: HingeRLC-GAN Architecture: An illustrative example of the ResNetRLC GAN's internal generator and discriminator workings.
  • Figure 4: t-SNE visualization of the CIFAR-10 dataset images.
  • Figure 5: t-SNE Visualizations: (left) DROPOUT-GAN, (right) HingeRLC-GAN. Mode coverage is 30% better than DROPOUT-GAN.
  • ...and 2 more figures
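
The checkerboard artifact named in the Figure 2 caption is commonly attributed to uneven kernel overlap in transposed convolutions. The sketch below illustrates that general explanation in one dimension; it is an assumption about the cause, not an analysis taken from the paper, and the function name is hypothetical. It counts how many kernel taps land on each output position when there is no padding.

```python
# Count kernel contributions per output position of a 1D transposed
# convolution with no padding. Uneven counts in the interior are one
# commonly cited source of checkerboard patterns.

def transposed_conv_overlap(input_len, kernel_size, stride):
    out_len = (input_len - 1) * stride + kernel_size
    counts = [0] * out_len
    for i in range(input_len):          # each input element...
        for k in range(kernel_size):    # ...spreads over kernel_size outputs
            counts[i * stride + k] += 1
    return counts


if __name__ == "__main__":
    # Kernel size not divisible by stride: interior overlap alternates.
    print(transposed_conv_overlap(4, 3, 2))  # [1, 1, 2, 1, 2, 1, 2, 1, 1]
    # Kernel size divisible by stride: interior overlap is uniform.
    print(transposed_conv_overlap(4, 4, 2))  # [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]
```

When the kernel size is a multiple of the stride, every interior output position receives the same number of contributions, which is one frequently suggested way to reduce the pattern.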