Breaking the curse of dimensionality in structured density estimation

Robert A. Vandermeulen, Wai Ming Tai, Bryon Aragam

TL;DR

Surprisingly, although one might expect the sample complexity of density estimation under graphical Markov conditions to scale with local graph parameters such as the degree, it does not, and the curse of dimensionality in density estimation can be circumvented.

Abstract

We consider the problem of estimating a structured multivariate density, subject to Markov conditions implied by an undirected graph. In the worst case, without Markovian assumptions, this problem suffers from the curse of dimensionality. Our main result shows how the curse of dimensionality can be avoided or greatly alleviated under the Markov property, and applies to arbitrary graphs. While existing results along these lines focus on sparsity or manifold assumptions, we introduce a new graphical quantity called "graph resilience" and show how it controls the sample complexity. Surprisingly, although one might expect the sample complexity of this problem to scale with local graph parameters such as the degree, this turns out not to be the case. Through explicit examples, we compute uniform deviation bounds and illustrate how the curse of dimensionality in density estimation can thus be circumvented. Notable examples where the rate improves substantially include sequential, hierarchical, and spatial data.

Paper Structure

This paper contains 40 sections, 25 theorems, 73 equations, 5 figures, 1 table.

Key Result

Theorem 3.1

Let $G$ be a graph whose number of vertices is $d$ and resilience is $r$. Let $L\ge 0$. For any $0<\varepsilon<1$, there exists an algorithm that takes $n = \Omega\left(\frac{rd^{r/2+1}L^r}{\varepsilon^{r+2}} \log (\frac{dL}{\varepsilon}) + \frac{\log (1/\delta)}{\varepsilon^2}\right)$ i.i.d. sample
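To get a feel for how the expression inside the $\Omega(\cdot)$ above depends on the resilience $r$, the following minimal sketch simply evaluates it. It is an illustration only: the function name `sample_size_expression` is hypothetical, the logarithms are assumed to be natural, and the constant hidden by the $\Omega$ notation is ignored, so the returned values show scaling behaviour rather than actual sample sizes.

```python
import math


def sample_size_expression(d, r, L, eps, delta):
    """Evaluate the expression inside the Omega(.) of Theorem 3.1.

    Assumptions (not from the source): natural logarithms, and the
    hidden constant is treated as 1. Use this only to compare scaling.
    """
    if not (0 < eps < 1):
        raise ValueError("eps must lie in (0, 1)")
    if not (0 < delta < 1):
        raise ValueError("delta must lie in (0, 1)")
    if d * L <= 0:
        raise ValueError("d * L must be positive for the log term")
    # First term: r * d^(r/2 + 1) * L^r / eps^(r + 2) * log(d L / eps)
    structural = (r * d ** (r / 2 + 1) * L ** r / eps ** (r + 2)
                  * math.log(d * L / eps))
    # Second term: log(1 / delta) / eps^2
    confidence = math.log(1 / delta) / eps ** 2
    return structural + confidence


if __name__ == "__main__":
    # Same d, L, eps, delta; only the resilience r changes.
    for r in (1, 2, 4, 8):
        value = sample_size_expression(d=64, r=r, L=1.0, eps=0.1, delta=0.05)
        print(f"r = {r}: {value:.3e}")
```

With $d$, $L$, and $\varepsilon$ held fixed, the exponent on $1/\varepsilon$ is $r+2$, so the expression grows with the resilience $r$ rather than directly with the dimension $d$ in the exponent.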

Figures (5)

  • Figure 1: Examples of common structures that yield improvements in density estimation. As indicated by the path example on the left, which is also a tree, the worst-case resilience of any tree is $r=O(\log d)$, but for bounded-depth trees, $r=O(1)$.
  • Figure 2: Heatmaps of the magnitude of the correlation between the red pixel and every other pixel, using the CIFAR-10 training set. The left image shows the correlation without conditioning; the right image shows the correlation conditioned on the green pixels. This suggests that modeling the image as a Markov random grid is reasonable.
  • Figure 3: Visual representation of the steps of the $3$-disintegration $(\{1,6 \},\{3,5,7 \}, \{2,4 \} )$. In each step of the disintegration, one vertex is removed from every graph component. The total number of steps to the null graph is $3$.
  • Figure 4: Comparison of $L_{3\times 3}$ (left) and $L_{3 \times 3}^2$ (right).
  • Figure 5: Illustration of $V',G_1,G_2,G_3,G_4$ in the proof of Proposition \ref{prop:grid}.
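
The disintegration described in the Figure 3 caption can be sketched in code. The sketch below rests on assumptions that go beyond the captions: a disintegration is read as a sequence of vertex sets in which each step removes exactly one vertex from every connected component of what remains, and the resilience is taken to be the smallest number of such steps needed to reach the null graph. The helper names (`components`, `is_disintegration`, `min_steps`) are hypothetical, and the example graph is a path on 7 vertices chosen for illustration, not the graph drawn in Figure 3.

```python
from collections import deque
from functools import lru_cache


def components(vertices, edges):
    """Connected components of the subgraph induced on `vertices`."""
    vertices = set(vertices)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            adj[u].add(v)
            adj[v].add(u)
    seen, comps = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    queue.append(w)
        comps.append(comp)
    return comps


def is_disintegration(vertices, edges, steps):
    """Check that each step removes exactly one vertex per remaining
    component and that nothing is left after the last step."""
    remaining = set(vertices)
    for step in steps:
        step = set(step)
        if not step <= remaining:
            return False
        for comp in components(remaining, edges):
            if len(comp & step) != 1:
                return False
        remaining -= step
    return not remaining


def min_steps(vertices, edges):
    """Brute-force smallest number of steps (small graphs only)."""
    edges = tuple(edges)

    @lru_cache(maxsize=None)
    def solve(vs):
        if not vs:
            return 0
        comps = components(vs, edges)
        if len(comps) > 1:
            # Components are processed in parallel, so the slowest one decides.
            return max(solve(frozenset(c)) for c in comps)
        # Connected: spend one step removing the best single vertex.
        return 1 + min(solve(vs - {v}) for v in vs)

    return solve(frozenset(vertices))


if __name__ == "__main__":
    path_vertices = range(1, 8)
    path_edges = [(i, i + 1) for i in range(1, 7)]
    # Remove the middle vertex, then the middles of the two halves, then the rest.
    steps = [{4}, {2, 6}, {1, 3, 5, 7}]
    print(is_disintegration(path_vertices, path_edges, steps))
    print(min_steps(path_vertices, path_edges))
```

On the path, repeatedly removing the middle vertex of each remaining segment halves the segment lengths at every step, which matches the $r=O(\log d)$ behaviour mentioned for paths in the Figure 1 caption. The brute-force search is exponential in the worst case and is meant only for checking small examples.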

Theorems & Definitions (53)

  • Definition 3.1
  • Definition 3.2
  • Theorem 3.1
  • Theorem 3.2
  • Lemma 4.1
  • Lemma 4.2
  • Lemma 4.3
  • Lemma 4.4
  • Corollary 4.5
  • Lemma 4.6
  • ...and 43 more