On the Statistical Capacity of Deep Generative Models

Edric Tam; David B. Dunson

TL;DR

The paper analyzes the statistical capacity of deep generative models by deriving non-asymptotic, dimension-free concentration bounds for Lipschitz push-forwards of various latent distributions. With Gaussian latents, the generated samples concentrate with sub-Gaussian tails, so these models produce only light-tailed outputs and are not universal generators for heavy-tailed targets. Analogous results hold for log-concave latents, strongly log-concave latents, and latents on manifolds with positive Ricci curvature, and they extend to diffusion models via a reduction to a single Lipschitz transform. Simulations and financial data illustrate the practical limitation: such models underrepresent tail uncertainty, which matters for anomaly detection and risk-sensitive tasks. The findings motivate exploring richer latent priors or non-Lipschitz generative mechanisms to better capture heavy-tailed phenomena in real data.
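For orientation, the classical Gaussian concentration inequality for Lipschitz functions is the kind of statement behind the Gaussian-latent case. The form below is the standard textbook version for a scalar function; the symbols $Z$, $f$, $L$, $d$, and $t$ are notation assumed here, and the paper's own theorem may use a different formulation and different constants.

```latex
% Z ~ N(0, I_d) is the latent vector; f : R^d -> R is L-Lipschitz
% with respect to the Euclidean norm (for example, one output coordinate
% of a generator network).
\[
  \mathbb{P}\bigl( \lvert f(Z) - \mathbb{E} f(Z) \rvert \ge t \bigr)
  \;\le\; 2 \exp\!\left( - \frac{t^{2}}{2 L^{2}} \right),
  \qquad t \ge 0 .
\]
% The bound does not depend on the latent dimension d.
% By contrast, a standard Cauchy variable X has polynomial tails:
\[
  \mathbb{P}\bigl( \lvert X \rvert \ge t \bigr) \sim \frac{2}{\pi t}
  \quad \text{as } t \to \infty ,
\]
% so no L-Lipschitz push-forward of Z can match its tail decay.
```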

Abstract

Deep generative models are routinely used in generating samples from complex, high-dimensional distributions. Despite their apparent successes, their statistical properties are not well understood. A common assumption is that with enough training data and sufficiently large neural networks, deep generative model samples will have arbitrarily small errors in sampling from any continuous target distribution. We set up a unifying framework that debunks this belief. We demonstrate that broad classes of deep generative models, including variational autoencoders and generative adversarial networks, are not universal generators. Under the predominant case of Gaussian latent variables, these models can only generate concentrated samples that exhibit light tails. Using tools from concentration of measure and convex geometry, we give analogous results for more general log-concave and strongly log-concave latent variable distributions. We extend our results to diffusion models via a reduction argument. We use the Gromov--Levy inequality to give similar guarantees when the latent variables lie on manifolds with positive Ricci curvature. These results shed light on the limited capacity of common deep generative models to handle heavy tails. We illustrate the empirical relevance of our work with simulations and financial data.
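The light-tail effect can be seen in a small simulation. The sketch below is not the paper's experiment: it uses an untrained ReLU network with random weights as a stand-in Lipschitz generator, and the architecture, the sample size, and the `tail_ratio` diagnostic are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu_generator(z, weights, biases):
    """Push latent samples z (n x d) through a ReLU network with an affine last layer."""
    h = z
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ W.T + b, 0.0)
    return h @ weights[-1].T + biases[-1]

def tail_ratio(x, hi=0.9999, lo=0.75):
    """Ratio of a far-tail quantile to a central quantile of |x - median(x)|."""
    dev = np.abs(x - np.median(x))
    return float(np.quantile(dev, hi) / np.quantile(dev, lo))

# Hypothetical architecture: random (untrained) weights, chosen only for illustration.
n, sizes = 100_000, [8, 32, 32, 1]
weights = [rng.normal(size=(o, i)) / np.sqrt(i) for i, o in zip(sizes[:-1], sizes[1:])]
biases = [0.1 * rng.normal(size=o) for o in sizes[1:]]

# Gaussian latents pushed through the Lipschitz map, versus a heavy-tailed target.
generated = relu_generator(rng.standard_normal((n, sizes[0])), weights, biases).ravel()
cauchy = rng.standard_cauchy(n)

print("tail ratio, Lipschitz push-forward of Gaussian:", round(tail_ratio(generated), 1))
print("tail ratio, standard Cauchy:", round(tail_ratio(cauchy), 1))
```

The diagnostic is scale-free, so rescaling the generator's output does not change it; only a change in tail shape would close the gap to the Cauchy value.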

Paper Structure (26 sections, 17 theorems, 19 equations, 4 figures)

Introduction
Related work
Preliminaries
Deep neural networks
Deep generative modeling
Isoperimetry and Concentration of Deep Generative Models
Manifold Setting
Diffusion Models
Simulations and Data Illustration
Discussion
Preliminaries on Concentration
Sub-Gaussian and sub-exponential random vectors
Lipschitz concentration of random vectors
Proofs
Proof of Proposition 1
...and 11 more sections

Key Result

Proposition 1

Finite feed-forward neural networks are Lipschitz with respect to the Euclidean norm.
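One common way to see why such networks are Lipschitz, assuming 1-Lipschitz activations such as ReLU, is that the product of the layers' spectral norms bounds the Lipschitz constant. The sketch below checks that bound numerically on a random network. The function names and layer sizes are hypothetical, and this is not necessarily the argument used in the paper's proof.

```python
import numpy as np

rng = np.random.default_rng(1)

def mlp(x, weights, biases):
    """ReLU feed-forward network; the last layer is affine (no activation)."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ W.T + b, 0.0)
    return h @ weights[-1].T + biases[-1]

def lipschitz_upper_bound(weights):
    """Product of spectral norms: a valid but possibly loose bound for 1-Lipschitz activations."""
    return float(np.prod([np.linalg.norm(W, ord=2) for W in weights]))

# Hypothetical layer sizes and random parameters, for illustration only.
sizes = [4, 16, 16, 3]
weights = [rng.normal(size=(o, i)) for i, o in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=o) for o in sizes[1:]]

L = lipschitz_upper_bound(weights)

# Compare the bound with the largest observed ratio ||f(x) - f(y)|| / ||x - y||.
x = rng.normal(size=(1000, sizes[0]))
y = rng.normal(size=(1000, sizes[0]))
num = np.linalg.norm(mlp(x, weights, biases) - mlp(y, weights, biases), axis=1)
den = np.linalg.norm(x - y, axis=1)
empirical = float(np.max(num / den))

print(f"upper bound L = {L:.3f}, largest observed ratio = {empirical:.3f}")
```

The observed ratio is typically well below the bound, since the product of spectral norms ignores which ReLU units are active for a given input.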

Figures (4)

Figure 1: Comparisons between Cauchy samples and synthetic samples from a generative adversarial network.
Figure 2: Comparisons between actual returns from Standard and Poor's 500 and Dow Jones Industrial Average indices versus synthetic samples from a generative adversarial network.
Figure 3: Synthetic samples from generative adversarial networks with varying depth and latent variable dimensions.
Figure 4: Comparisons between Cauchy samples and synthetic samples from a denoising diffusion model.

Theorems & Definitions (40)

Definition 1: Finite feed-forward neural networks
Proposition 1
Remark 1
Theorem 1: Deep Generative Models with Gaussian Latent Variables
Theorem 2: Deep Generative Models with Log-concave latent variables
Theorem 3: Strongly Log-concave Lipschitz concentration
Theorem 4: Concentration of latent variables on manifold
Theorem 5: Diffusion Models with Gaussian Latent Variables
Definition 2: Sub-Gaussian random variable
Proposition 2: Sub-Gaussianity via Orlicz norm
...and 30 more
