
MS$^3$D: A RG Flow-Based Regularization for GAN Training with Limited Data

Jian Wang, Xin Lan, Yuxin Tian, Jiancheng Lv

TL;DR

This work tackles the problem of training GANs with limited data, where discriminator overfitting degrades the guidance the generator receives. It introduces MS$^3$D, a differentiable regularization grounded in renormalization group (RG) flow that constrains the discriminator gradient field $\nabla_x f(x;\phi)$ to keep a consistent pattern across scales. The regularizer measures multi-scale structural self-dissimilarity, is implemented with Kadanoff block-spin coarse-graining, and is added to the discriminator loss with weight $\lambda$. Across OxfordDog, FFHQ, MetFaces, and BreCaHAD (including FFHQ-2.5K), MS$^3$D improves FID, IS, and KID under limited data, often outperforming or complementing existing regularization and augmentation strategies, and it remains effective in transfer learning setups. The method yields more robust training dynamics (lower Fisher information, flatter loss landscapes) and can be combined with augmentation-based approaches for further gains, which points to a practical, augmentation-free path to better few-shot GAN performance. Overall, MS$^3$D offers a principled, RG-inspired lens on GAN gradient dynamics, with tangible improvements in stability and image quality for limited-data generation tasks.

Abstract

Generative adversarial networks (GANs) have made impressive advances in image generation, but they often require large-scale training data to avoid degradation caused by discriminator overfitting. To tackle this issue, we investigate the challenge of training GANs with limited data and propose a novel regularization method based on the idea of the renormalization group (RG) in physics. We observe that in the limited data setting, the gradient pattern that the generator obtains from the discriminator becomes more aggregated over time. In the RG context, this aggregated pattern exhibits a high discrepancy from its coarse-grained versions, which implies a high-capacity and sensitive system, prone to overfitting and collapse. To address this problem, we introduce a multi-scale structural self-dissimilarity (MS$^3$D) regularization, which constrains the gradient field to have a consistent pattern across different scales, thereby fostering a more redundant and robust system. We show that our method can effectively enhance the performance and stability of GANs under limited data scenarios, and even allow them to generate high-quality images from very little data.

Paper Structure

This paper contains 25 sections, 1 theorem, 15 equations, 19 figures, and 9 tables.

Key Result

Lemma 3.1

[amari2016information] Any standard $f$-divergence gives the same Riemannian metric $\mathcal{G}$, which is the Fisher information matrix (FIM).
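
The lemma can be checked numerically on a toy family. The sketch below is not from the paper: it assumes the usual normalization of a standard $f$-divergence ($f(1)=0$, $f'(1)=0$, $f''(1)=1$), uses a Bernoulli model as a hypothetical example, and relies on the second-order expansion $D_f(p_\theta \,\|\, p_{\theta+\varepsilon}) \approx \tfrac{1}{2}\,\mathcal{G}(\theta)\,\varepsilon^2$. All function names are illustrative.

```python
import numpy as np


def f_divergence(p, q, f):
    """D_f(p || q) = sum_i p_i * f(q_i / p_i) for discrete distributions."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    return float(np.sum(p * f(q / p)))


# Generators normalized so that f(1) = 0, f'(1) = 0, f''(1) = 1
# (the normalization assumed here for a "standard" f-divergence).
STANDARD_F = {
    "kl": lambda u: u - 1.0 - np.log(u),
    "reverse_kl": lambda u: u * np.log(u) - u + 1.0,
    "hellinger": lambda u: 2.0 * (1.0 - np.sqrt(u)) ** 2,
    "chi_square": lambda u: 0.5 * (u - 1.0) ** 2,
}


def bernoulli(theta):
    """Probability vector of a Bernoulli(theta) variable."""
    return np.array([theta, 1.0 - theta])


def fisher_information_bernoulli(theta):
    """Closed-form Fisher information of Bernoulli(theta)."""
    return 1.0 / (theta * (1.0 - theta))


def metric_estimates(theta=0.3, eps=1e-3):
    """Estimate the induced metric as 2 * D_f / eps^2 for each generator."""
    p, q = bernoulli(theta), bernoulli(theta + eps)
    return {
        name: 2.0 * f_divergence(p, q, f) / eps**2
        for name, f in STANDARD_F.items()
    }


if __name__ == "__main__":
    print("Fisher information:", fisher_information_bernoulli(0.3))
    for name, value in metric_estimates().items():
        print(f"{name:>10s}: {value:.4f}")
```

For a small step `eps`, all four estimates should agree with the closed-form Fisher information up to higher-order terms in `eps`, which is the scalar version of the statement that every standard $f$-divergence induces the same metric.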

Figures (19)

  • Figure 1: Illustrative examples of renormalization group (RG) flow. The aggregated pattern exhibits high multi-scale structural self-dissimilarity (MS$^3$D), denoted by $\mathcal{D}_\Gamma$.
  • Figure 2: (a) Evolution of discriminator outputs and FID values during training, illustrating the discriminator's overfitting and the resulting degradation of generated samples. (b) Consistent aggregation tendency of the gradient $\nabla_{x}f(x;\phi)$ across various GANs and datasets under limited data conditions.
  • Figure 3: Visualization of the gradient $\nabla_{x}f(x;\phi)$ at different training steps.
  • Figure 4: (a) Under limited data settings, Fisher information increases during training, which indicates a decline in system stability. (b) When data augmentation is applied or the data volume is increased, Fisher information remains low, suggesting enhanced system stability.
  • Figure 5: A diagram illustrating the RG transformation process and MS$^3$D computation. It involves iteratively downsampling $\nabla_{x}f(x;\phi)$ and computing differences before and after downsampling.
  • ...and 14 more figures
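
The procedure described in the Figure 5 caption (iteratively downsample the gradient field and compare it before and after downsampling) can be sketched as follows. This is a minimal illustration, not the authors' implementation: block averaging stands in for the block-spin rule, mean squared difference stands in for the dissimilarity measure, and the names `block_spin`, `self_dissimilarity`, `num_scales`, and `block` are hypothetical. The paper's exact definition of $\mathcal{D}_\Gamma$ may differ.

```python
import numpy as np


def block_spin(field, block=2):
    """Coarse-grain a 2D field by averaging non-overlapping block x block cells.

    Assumption: averaging is used as the block-spin rule for illustration.
    """
    h, w = field.shape
    h_crop, w_crop = (h // block) * block, (w // block) * block
    cells = field[:h_crop, :w_crop].reshape(
        h_crop // block, block, w_crop // block, block
    )
    return cells.mean(axis=(1, 3))


def self_dissimilarity(field, num_scales=3, block=2):
    """Sum, over scales, of the mean squared difference between a field
    and its coarse-grained version (upsampled back by block repetition)."""
    current = np.asarray(field, dtype=np.float64)
    total = 0.0
    for _ in range(num_scales):
        if min(current.shape) < block:
            break  # field is too small to coarse-grain further
        coarse = block_spin(current, block)
        # Repeat each coarse value over its block to restore the resolution.
        restored = np.kron(coarse, np.ones((block, block)))
        h, w = restored.shape
        total += float(np.mean((current[:h, :w] - restored) ** 2))
        current = coarse
    return total


if __name__ == "__main__":
    spread = np.ones((8, 8))        # gradient mass spread evenly
    aggregated = np.zeros((8, 8))   # same total mass in one pixel
    aggregated[0, 0] = 64.0
    print("spread:    ", self_dissimilarity(spread))
    print("aggregated:", self_dissimilarity(aggregated))
```

An evenly spread field is unchanged by coarse-graining and scores zero, while a field with the same total mass concentrated in one pixel scores high, matching the intuition that aggregated gradient patterns are strongly self-dissimilar across scales. In actual training the same computation would have to be written in an automatic-differentiation framework and applied to $\nabla_{x}f(x;\phi)$ so that it can be added to the discriminator loss with weight $\lambda$; NumPy is used here only to keep the sketch self-contained.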

Theorems & Definitions (1)

  • Lemma 3.1