
Optimal rates for density and mode estimation with expand-and-sparsify representations

Kaushik Sinha, Christopher Tosh

TL;DR

This work studies expand-and-sparsify representations, in which a dense input is projected to a high-dimensional space and sparsified to a $k$-sparse binary vector, for non-parametric density and mode estimation on the unit sphere ${\mathcal{S}}^{d-1}$. It introduces a density estimator ${\hat f}_n$ that is linear in the representation, derived from random projections and a sparse mask, and proves minimax-optimal $\ell_{\infty}$ rates $O\left((\log n / n)^{\frac{\beta}{2\beta+(d-1)}}\right)$ for $(L,\beta)$-smooth densities, with matching lower bounds. For mode estimation, simple algorithms built on the density estimator recover the mode of a unimodal density at rate $\tilde{O}\left(n^{-1/(d+3)}\right)$ and, under mild separation conditions, recover the salient modes of a multi-modal density at the same rate up to logarithmic factors. Empirical results on mixtures of von Mises-Fisher distributions compare the method with KDE and kNN-based density estimators as the expansion increases. Overall, the paper connects sparse, high-dimensional representations to classical non-parametric estimation, with minimax-optimal guarantees and practical procedures for spherical data.

Abstract

Expand-and-sparsify representations are a class of theoretical models that capture sparse representation phenomena observed in the sensory systems of many animals. At a high level, these representations map an input $x \in \mathbb{R}^d$ to a much higher dimension $m \gg d$ via random linear projections before zeroing out all but the $k \ll m$ largest entries. The result is a $k$-sparse vector in $\{0,1\}^m$. We study the suitability of this representation for two fundamental statistical problems: density estimation and mode estimation. For density estimation, we show that a simple linear function of the expand-and-sparsify representation produces an estimator with minimax-optimal $\ell_{\infty}$ convergence rates. In mode estimation, we provide simple algorithms on top of our density estimator that recover single or multiple modes at optimal rates up to logarithmic factors under mild conditions.
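
The mapping described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the Gaussian choice for the projection matrix `W`, the function names, and the specific weights in `fit_linear_density` are assumptions made here. In particular, the weights (each unit's empirical activation frequency rescaled by $m/k$, which reads as a density relative to the uniform probability measure on the sphere) are just one natural way to obtain an estimator that is linear in the representation; the abstract does not spell out the paper's exact form.

```python
import numpy as np

def expand_and_sparsify(X, W, k):
    """Map each row of X (n, d) to a k-sparse binary vector in {0,1}^m."""
    proj = X @ W.T                                      # (n, m) random linear projections
    top_k = np.argpartition(proj, -k, axis=1)[:, -k:]   # indices of the k largest entries per row
    Z = np.zeros(proj.shape, dtype=np.int64)
    np.put_along_axis(Z, top_k, 1, axis=1)              # set those k entries to 1
    return Z

def fit_linear_density(Z_train, k):
    """ASSUMPTION (not from the paper): weight_j = (m / k) * fraction of samples activating unit j."""
    n, m = Z_train.shape
    counts = Z_train.sum(axis=0)                        # how many samples activate each unit
    return m * counts / (n * k)

def predict_density(Z_query, weights, k):
    """Linear in the representation: average the weights of the k active units."""
    return (Z_query @ weights) / k

rng = np.random.default_rng(0)
d, m, k, n = 3, 200, 10, 500
W = rng.standard_normal((m, d))                         # assumed Gaussian projection directions
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)           # points on the unit sphere S^{d-1}

Z = expand_and_sparsify(X, W, k)
weights = fit_linear_density(Z, k)
f_hat = predict_density(Z, weights, k)
```

Each row of `Z` has exactly $k$ ones, and the prediction is an inner product between the representation and a fixed weight vector, which is what "a simple linear function of the representation" means in this sketch.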

Paper Structure

This paper contains 31 sections, 20 theorems, 94 equations, 3 figures, 1 algorithm.

Key Result

Lemma 5.2

If $\delta > 0$ and $m > k > C_\delta^2 d \log m$, then with probability at least $1-\delta$ the following holds. For every $j=1,\ldots,m$: where $C_\delta = c_0 \log(2/\delta)$ for some absolute constant $c_0$ and

Figures (3)

  • Figure 1: Visualization of the expand-and-sparsify architecture [dasgupta2020expressivity].
  • Figure 2: Color code: red = KNNDE, black = KDE, blue (with error bars) = EaSDE.
  • Figure 3: Individual components and a mixture of von Mises-Fisher distributions on $\mathcal{S}^2$. The left panel shows the probability density function of the von Mises-Fisher distribution with parameters $\mu_1 = \left(-\sqrt{0.9},-\sqrt{0.1},0\right)\in\mathcal{S}^2$ and $\kappa_1 = 10$. The middle panel shows the probability density function of the von Mises-Fisher distribution with parameters $\mu_2 = \left(-\sqrt{0.01}, -\sqrt{0.99}, 0\right)\in\mathcal{S}^2$ and $\kappa_2 = 5$. The right panel shows the probability density function of a mixture of these two distributions with mixing coefficient $w=0.3$. In all panels, bright yellow marks high-density regions, and the shift from yellow to green marks progressively lower density.
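
The mixture in Figure 3 can be evaluated directly from the stated parameters. The sketch below uses the standard closed form of the von Mises-Fisher density on $\mathcal{S}^2$ with respect to surface area, $\frac{\kappa}{4\pi\sinh\kappa}\exp(\kappa\,\mu^\top x)$. One assumption is made here: the caption does not say which component carries the weight $w = 0.3$, so the code places it on the first component. The function names are illustrative only.

```python
import numpy as np

def vmf_pdf_s2(x, mu, kappa):
    """von Mises-Fisher density on S^2 with respect to surface area."""
    return kappa / (4.0 * np.pi * np.sinh(kappa)) * np.exp(kappa * (x @ mu))

# Parameters as given in the Figure 3 caption.
mu1, kappa1 = np.array([-np.sqrt(0.9), -np.sqrt(0.1), 0.0]), 10.0
mu2, kappa2 = np.array([-np.sqrt(0.01), -np.sqrt(0.99), 0.0]), 5.0
w = 0.3  # ASSUMPTION: weight on the first component; the caption leaves this unspecified

def mixture_pdf(x):
    """Two-component vMF mixture; x is a unit vector (3,) or an array of unit vectors (n, 3)."""
    return w * vmf_pdf_s2(x, mu1, kappa1) + (1.0 - w) * vmf_pdf_s2(x, mu2, kappa2)

# Density of the mixture at each component's mean direction.
density_at_mu1 = mixture_pdf(mu1)
density_at_mu2 = mixture_pdf(mu2)
```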

Theorems & Definitions (44)

  • Definition 5.1
  • Lemma 5.2
  • Lemma 5.3
  • Lemma 5.4
  • Definition 5.5
  • Lemma 5.6
  • Theorem 5.7
  • Proof sketch
  • Theorem 5.8
  • Proof sketch
  • ...and 34 more