Generalization error property of infoGAN for two-layer neural network

Mahmud Hasan; Mathias Muia

TL;DR

This work analyzes the generalization error of infoGAN for two-layer neural networks by bounding the difference between the empirical and population objective functions. It derives finite-sample generalization bounds using the Rademacher complexity of the discriminator class, the generator class, and their composition, for Lipschitz and non-decreasing activation functions, within a regularized objective that omits the latent code. The main contributions are explicit bounds for two-layer networks, proofs for both activation regimes, and corollaries that translate these into practical discrepancy bounds, showing that the gap between the empirical and population objectives shrinks as the sample sizes grow. The results highlight how network complexity and activation properties govern generalization in adversarial settings, and they point to future work on tightening the bounds and exploring lower bounds.

Abstract

The Information Maximizing Generative Adversarial Network (infoGAN) can be understood as a minimax problem involving two neural networks, a discriminator and a generator, together with a mutual information term. infoGAN incorporates several components, including latent variables, mutual information, and an objective function. This research explores the generalization error property of infoGAN as the sample sizes of the discriminator and generator approach infinity. To establish this property, the study considers the difference between the empirical and population versions of the objective function. The error bound is derived from the Rademacher complexity of the discriminator and generator function classes. Additionally, the bound is proven for a two-layer network in which both the discriminator and the generator use Lipschitz and non-decreasing activation functions.
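To make the role of Rademacher complexity concrete, the sketch below estimates the empirical Rademacher complexity of a simple function class by Monte Carlo and shows that it shrinks as the sample size grows. It is only an illustration: the class used here, norm-bounded linear functions $\{x \mapsto \langle w, x\rangle : \lVert w\rVert_2 \leq B\}$, is an assumed stand-in and not the two-layer discriminator or generator class analyzed in the paper, and the helper names (`empirical_rademacher_linear`, `jensen_upper_bound`, `sample`) are hypothetical.

```python
import math
import random


def empirical_rademacher_linear(xs, bound, n_draws=500, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity of
    {x -> <w, x> : ||w||_2 <= bound} on the fixed sample xs.

    For this class the supremum over w has a closed form:
    sup_w (1/n) sum_i sigma_i <w, x_i> = (bound / n) * ||sum_i sigma_i x_i||_2.
    """
    rng = random.Random(seed)
    n, d = len(xs), len(xs[0])
    total = 0.0
    for _ in range(n_draws):
        # Draw independent Rademacher signs, one per sample point.
        sigma = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        # Signed sum of the sample points, coordinate by coordinate.
        s = [sum(sigma[i] * xs[i][j] for i in range(n)) for j in range(d)]
        total += bound * math.sqrt(sum(v * v for v in s)) / n
    return total / n_draws


def jensen_upper_bound(xs, bound):
    """Upper bound (bound / n) * sqrt(sum_i ||x_i||^2), obtained from Jensen's inequality."""
    return bound * math.sqrt(sum(v * v for x in xs for v in x)) / len(xs)


def sample(n, d, seed):
    """Hypothetical data: n points in R^d with standard normal coordinates."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


if __name__ == "__main__":
    for n in (25, 100, 400):
        xs = sample(n, d=3, seed=n)
        est = empirical_rademacher_linear(xs, bound=1.0)
        print(n, round(est, 4), round(jensen_upper_bound(xs, 1.0), 4))
```

For this stand-in class the estimate decays at roughly the rate $1/\sqrt{n}$, which is the kind of sample-size dependence that makes a Rademacher-based generalization bound vanish as the sample sizes approach infinity.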

Paper Structure (14 sections, 10 theorems, 42 equations)

Introduction
Objective Function without Latent Code
Bound of objective function difference
Proof.
Application in a Two-Layer Network
Formation of Two-Layer Network
Bound for Lipschitz Activation Function
Proof.
Proof.
Proof.
Bounding for Non-Decreasing Activation Functions
Proof.
Proof.
Conclusion

Key Result

Theorem 3.1

Suppose the sets of discriminator functions $D$ and generator functions $G$ are symmetric with $\lVert f\rVert_{\infty}\leq{\mathbb Q}_{x}$ and $\lVert g\rVert_{\infty}\leq{\mathbb Q}_{Z}$. Then, for any $f\in D$, $g\in G$, with probability at least $1-2\delta$ over the random training sample, we have and

Theorems & Definitions (19)

Theorem 3.1
Remark 3.1
Lemma 4.1
Lemma 4.2
Remark 4.1
Lemma 4.3
Remark 4.2
Corollary 4.1
Remark 4.3
Corollary 4.2
...and 9 more
