Rates of convergence for density estimation with generative adversarial networks

Nikita Puchkin, Sergey Samsonov, Denis Belomestny, Eric Moulines, Alexey Naumov

TL;DR

It is shown that the JS-divergence between the GAN estimate and $\mathsf{p}^*$ decays as fast as $(\log{n}/n)^{2\beta/(2\beta + d)}$, where $n$ is the sample size and $\beta$ determines the smoothness of $\mathsf{p}^*$.

Abstract

In this work, we undertake a thorough study of the non-asymptotic properties of vanilla generative adversarial networks (GANs). We prove an oracle inequality for the Jensen-Shannon (JS) divergence between the underlying density $\mathsf{p}^*$ and the GAN estimate, with a significantly better statistical error term than in previously known results. The advantage of our bound becomes clear when it is applied to nonparametric density estimation. We show that the JS-divergence between the GAN estimate and $\mathsf{p}^*$ decays as fast as $(\log{n}/n)^{2\beta/(2\beta + d)}$, where $n$ is the sample size and $\beta$ determines the smoothness of $\mathsf{p}^*$. This rate of convergence coincides (up to logarithmic factors) with the minimax optimal rate for the considered class of densities.
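To give a feel for how the stated rate $(\log{n}/n)^{2\beta/(2\beta + d)}$ depends on the smoothness $\beta$ and the dimension $d$, the following minimal sketch evaluates it numerically. The function name `js_rate` and the sample values of $n$, $\beta$, and $d$ are illustrative assumptions, not taken from the paper, and the multiplicative constants of the actual bound are dropped.

```python
import math


def js_rate(n: int, beta: float, d: int) -> float:
    """Evaluate (log n / n) ** (2*beta / (2*beta + d)).

    Hypothetical helper for illustration only: it reproduces the rate
    from the abstract with all multiplicative constants omitted.
    """
    if n < 2 or beta <= 0 or d < 1:
        raise ValueError("expected n >= 2, beta > 0, d >= 1")
    exponent = 2.0 * beta / (2.0 * beta + d)
    return (math.log(n) / n) ** exponent


if __name__ == "__main__":
    n = 10**6  # assumed sample size, chosen only for illustration
    for beta in (1.0, 2.0, 4.0):
        for d in (1, 10, 100):
            print(f"beta={beta}, d={d}: rate={js_rate(n, beta, d):.3e}")
```

For a fixed $n$, the value shrinks as $\beta$ grows and increases with $d$, which is the usual curse-of-dimensionality behaviour of nonparametric rates.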

Paper Structure

This paper contains 27 sections, 20 theorems, 260 equations.

Key Result

Theorem 1

Assume assu:G, assu:D, and assu:p. Let $\mathsf{W} \subseteq [-1, 1]^{d_\mathcal{G}}$ and $\mathsf{\Theta} \subseteq [-1, 1]^{d_\mathcal{D}}$. Then, for any $\delta \in (0, 1)$, with probability at least $1 - \delta$, it holds that [...], where [...] and [...]. Here $\lesssim$ stands for inequality up to an absolute multiplicative constant.

Theorems & Definitions (23)

  • Theorem 1
  • Remark 1
  • Lemma 1
  • Remark 2
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Theorem 2
  • Remark 3
  • Theorem 3
  • ...and 13 more