A New Formulation of Lipschitz Constrained With Functional Gradient Learning for GANs

Chang Wan; Ke Fan; Xinwei Sun; Yanwei Fu; Minglu Li; Yunliang Jiang; Zhonglong Zheng

TL;DR

Li-CFG presents a Lipschitz-constrained Functional Gradient GAN framework that stabilizes GAN training on large-scale data by tying the discriminator gradient norm to a reduced latent neighborhood size. The key innovation is the $\boldsymbol\varepsilon$-centered gradient penalty, which enlarges the discriminator gradient norm to shrink the latent neighborhood and thereby increase sample diversity, with theoretical guarantees linking gradient penalties to diversity via latent N-size. A main theorem shows the ordering $r_{R_1} > r_{R_0} > r_{\varepsilon}$, implying the $\boldsymbol\varepsilon$-centered GP yields the smallest latent neighborhood and highest diversity, while remaining compatible with CFG dynamics. Empirical results across MNIST, CIFAR-10, LSUN, and ImageNet demonstrate improved stability and diversity over CFG and standard GAN baselines, and the approach generalizes to other GAN models. Overall, Li-CFG offers a theoretically grounded, practical mechanism to control diversity and stability in GAN training through gradient-penalty–driven manipulation of latent space structure.

Abstract

This paper introduces a promising alternative method for training Generative Adversarial Networks (GANs) on large-scale datasets with clear theoretical guarantees. GANs are typically learned through a minimax game between a generator and a discriminator, which is known to be empirically unstable. Previous learning paradigms have suffered from mode collapse without a theoretical solution. To address these challenges, we propose a novel Lipschitz-constrained Functional Gradient GAN learning (Li-CFG) method that stabilizes GAN training and provides a theoretical foundation for increasing the diversity of synthetic samples by reducing the neighborhood size of the latent vector. Specifically, we demonstrate that the neighborhood size of the latent vector can be reduced by increasing the norm of the discriminator gradient, which enhances the diversity of synthetic samples. To enlarge the norm of the discriminator gradient efficiently, we introduce a novel ε-centered gradient penalty that amplifies this norm through the hyper-parameter ε. Compared with other constraints, our method enlarges the discriminator gradient norm the most and thus obtains the smallest neighborhood size of the latent vector. Extensive experiments on benchmark datasets for image generation demonstrate the efficacy of the Li-CFG method and the ε-centered gradient penalty. The results show improved stability and increased diversity of synthetic samples.
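To make the role of the penalty center concrete, the following is a minimal sketch of a centered gradient penalty of the familiar form, the mean over samples of $(\|\nabla_{\boldsymbol x} D(\boldsymbol x)\| - c)^2$. It is not the paper's implementation: the exact definition of the ε-centered penalty is not given in this summary, so treating the center as a scalar `center` is an assumption, and `toy_D`, `grad_norm`, and `centered_gp` are hypothetical names chosen for illustration. The gradient is estimated with central finite differences so that the example needs only the standard library.

```python
import math


def grad_norm(D, x, h=1e-5):
    """Central-difference estimate of ||grad_x D(x)|| for a scalar-valued D."""
    sq = 0.0
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        g_i = (D(xp) - D(xm)) / (2.0 * h)
        sq += g_i * g_i
    return math.sqrt(sq)


def centered_gp(D, samples, center):
    """Mean of (||grad D(x)|| - center)^2 over the given samples.

    Assumption: a scalar center. center=0 and center=1 correspond to the
    usual 0-centered and 1-centered penalties; other values only illustrate
    how moving the center changes the gradient norm the penalty favors.
    """
    return sum((grad_norm(D, x) - center) ** 2 for x in samples) / len(samples)


def toy_D(x):
    # Hypothetical linear discriminator; its gradient is (3, 4) with norm 5.
    return 3.0 * x[0] + 4.0 * x[1]


samples = [[0.0, 0.0], [1.0, -2.0], [0.5, 0.5]]
gp_center_0 = centered_gp(toy_D, samples, 0.0)  # penalizes any nonzero gradient norm
gp_center_1 = centered_gp(toy_D, samples, 1.0)  # favors gradient norm 1
gp_center_5 = centered_gp(toy_D, samples, 5.0)  # favors gradient norm 5
print(gp_center_0, gp_center_1, gp_center_5)
```

For this toy discriminator the gradient norm is 5 everywhere, so the penalty is largest with the center at 0, smaller with the center at 1, and essentially zero with the center at 5. A penalty minimized at a larger gradient norm therefore pushes the discriminator toward larger gradients, which is the mechanism the abstract attributes to the hyper-parameter ε.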

Paper Structure (23 sections, 8 theorems, 58 equations, 33 figures, 10 tables)

Introduction
Preliminary
Methodology
Latent N-size with gradient penalty
$\boldsymbol\varepsilon$-centered GP
Latent N-size with different gradient penalties
Related Work
Experiment
Experimental Results
Conclusion
Declarations
Appendix
Overview
Analysis of the dynamic theory for the CFG
Theoretical analysis of our theory
...and 8 more sections

Key Result

Proposition 3.5

$\mathcal{N}_r\left(\boldsymbol{z}_1\right)$ can be defined with the discriminator gradient penalty as follows: , where $\|\nabla_{\boldsymbol x} D_{m}(\mathcal{Y}_2)\| = \|\nabla_{\boldsymbol x} D_{m}(G_{\theta_t}(\boldsymbol{z}_2))+R\|$ and $\|\nabla_{\boldsymbol x} D_{m}(\mathcal{Y})\| = \|\nabla_{\boldsymbol x} D_{m}(G_{\theta_t}(\boldsymbol{z}))+R\|$. $R$ stands for the discriminator gradient pe

Figures (33)

Figure 1: Highlighting Diversity. We underscore the significance of diversity in image synthesis. The left column (a) and right column (b) display horse label images generated using the CFG and Li-CFG methods trained on the CIFAR-10 dataset, respectively.
Figure 2: CFG method vs. Li-CFG. The left and right figures show the results of the FID and the norm of the discriminator gradient $\|\nabla_{\boldsymbol x} D(\boldsymbol x)\|$ for the CFG method and Li-CFG with different values of the hyper-parameter $\delta(\boldsymbol x)$, respectively. FID is a metric that measures the diversity between synthetic samples and real samples. A lower score is better. More information about FID is given in Section \ref{FID-explain}. The hyper-parameter $\delta(\boldsymbol x)$ is important because it controls the gradient magnitude for the CFG method; it is defined in Eq. (\ref{eq:delta_x}). Solid and dashed lines of the same color in both figures indicate the FID and $\|\nabla_{\boldsymbol x} D(\boldsymbol x)\|$ results of the CFG method and Li-CFG with the same $\delta(\boldsymbol x)$.
Figure 3: Main idea of the relationship between latent N-size and constraint.
Figure 4: Results for CIFAR10, MNIST: The leftmost four columns are the CFG method, the second four columns are Li-CFG with $\boldsymbol\varepsilon$-centered GP (ours), the third four columns are Li-CFG with $1$-centered GP, and the rightmost four columns are Li-CFG with $0$-centered GP.
Figure 5: Results for LSUN tower, Church, B, BR+LR, T+B, ImageNet: From top to bottom, two rows per method: the CFG method, Li-CFG with $\boldsymbol\varepsilon$-centered GP (ours), Li-CFG with $1$-centered GP, and Li-CFG with $0$-centered GP.
...and 28 more figures

Theorems & Definitions (23)

Example 1.1
Definition 3.1: Latent Neighborhood Size
Definition 3.2: Modes in Image Space
Definition 3.3: Modes Attracted
Definition 3.4: Modes Distracted
Proposition 3.5
Lemma 3.6
Theorem 3.7
Lemma 2.1
proof
...and 13 more
