Multivariate Density Estimation via Variance-Reduced Sketching
Yifan Peng, Yuehaw Khoo, Daren Wang
TL;DR
This work addresses nonparametric multivariate density estimation in high dimensions by introducing Variance-Reduced Sketching (VRS), which treats multivariate densities as infinite-size tensors and recovers their range through low-variance moments. The core idea reduces the problem to estimating low-dimensional, informative moments and then reconstructing the density via leading singular functions, achieving a reduced curse of dimensionality with a single-pass algorithm. Theoretical results establish consistency and rate guarantees under a spectral-gap assumption, with error bounds that scale as $O_P\left(\frac{\sqrt{\prod_{j=1}^d r_j}}{N^{\alpha/(2\alpha+1)}} + \xi^*\right)$. In both simulated and real-data experiments, VRS outperforms KDEs and neural density estimators across diverse models. The work provides practical algorithms, tuning strategies (including adaptive rank selection), and public code, highlighting VRS's potential for high-dimensional density estimation in science and engineering.
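To read the rate in the bound above, it may help to compare it with the classical nonparametric benchmark. This comparison assumes that $\alpha$ denotes the smoothness of the density, an interpretation this summary does not state explicitly; the values $\alpha = 2$, $d = 10$, and $N = 10^5$ are illustrative choices only. The classical rate for estimating an $\alpha$-smooth density in $d$ dimensions is $N^{-\alpha/(2\alpha+d)}$, whereas the exponent in the bound above does not involve $d$:

$$
N^{-\alpha/(2\alpha+d)} = N^{-1/7} \approx 0.19
\qquad \text{versus} \qquad
N^{-\alpha/(2\alpha+1)} = N^{-2/5} = 0.01 .
$$

Under this reading, the dimension enters the VRS bound only through the factor $\sqrt{\prod_{j=1}^d r_j}$ and the term $\xi^*$, not through the exponent of $N$.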
Abstract
Multivariate density estimation is of great interest in various scientific and engineering disciplines. In this work, we introduce a new framework called Variance-Reduced Sketching (VRS), specifically designed to estimate multivariate density functions with a reduced curse of dimensionality. Our VRS framework conceptualizes multivariate functions as infinite-size matrices/tensors, and facilitates a new sketching technique motivated by the numerical linear algebra literature to reduce the variance in density estimation problems. We demonstrate the robust numerical performance of VRS through a series of simulated experiments and real-world data applications. Notably, VRS shows remarkable improvement over existing neural network density estimators and classical kernel methods in numerous distribution models. Additionally, we offer theoretical justifications for VRS to support its ability to deliver density estimation with a reduced curse of dimensionality.
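The matrix view described above can be illustrated in two dimensions. The following is a minimal sketch under stated assumptions, not the authors' implementation: the helper names `basis` and `leading_range`, the Legendre basis, the Beta-mixture test density, and the sizes `M`, `m`, `r`, and `N` are all hypothetical choices made for illustration. The density is represented by its coefficient matrix in an orthonormal basis, the range in each variable is estimated from a thin sketch that uses only a few low-order moments in the other variable, and the empirical coefficients are then projected onto the estimated ranges.

```python
import numpy as np
from numpy.polynomial import legendre


def basis(x, n_funcs):
    """First n_funcs orthonormal shifted Legendre polynomials on [0, 1]."""
    vander = legendre.legvander(2.0 * x - 1.0, n_funcs - 1)  # P_0 .. P_{n-1}
    return vander * np.sqrt(2.0 * np.arange(n_funcs) + 1.0)


def leading_range(sketch, rank):
    """Leading left singular vectors of a (tall, thin) sketch matrix."""
    u, _, _ = np.linalg.svd(sketch, full_matrices=False)
    return u[:, :rank]


# Illustrative sizes (assumptions): M fine basis functions per variable,
# m low-order sketching functions, target rank r, sample size N.
M, m, r, N = 15, 4, 2, 20000
rng = np.random.default_rng(0)

# Rank-2 test density on [0, 1]^2: an equal mixture of two product Beta laws.
flip = rng.random(N) < 0.5
x = np.where(flip, rng.beta(2, 3, N), rng.beta(3, 2, N))
y = np.where(flip, rng.beta(3, 2, N), rng.beta(2, 3, N))

# Empirical coefficient matrix: A_hat[j, k] = mean of phi_j(x_i) * phi_k(y_i).
A_hat = basis(x, M).T @ basis(y, M) / N

# Sketching step: estimate each range from only m low-order moments
# in the other variable, which keeps the sketch small and its noise low.
U_x = leading_range(A_hat[:, :m], r)
U_y = leading_range(A_hat[:m, :].T, r)

# Project the empirical coefficients onto the estimated ranges.
C = U_x @ (U_x.T @ A_hat @ U_y) @ U_y.T

# True coefficients via Gauss-Legendre quadrature mapped to [0, 1].
nodes, weights = legendre.leggauss(40)
t, w = 0.5 * (nodes + 1.0), 0.5 * weights
a = basis(t, M).T @ (w * 12 * t * (1 - t) ** 2)   # Beta(2, 3) density
b = basis(t, M).T @ (w * 12 * t ** 2 * (1 - t))   # Beta(3, 2) density
A_true = 0.5 * (np.outer(a, b) + np.outer(b, a))

# The basis is orthonormal, so squared Frobenius error of the coefficients
# equals the squared L2 error of the corresponding density estimates.
err_full = np.linalg.norm(A_hat - A_true) ** 2
err_sketch = np.linalg.norm(C - A_true) ** 2
print(f"full-basis error: {err_full:.5f}, projected error: {err_sketch:.5f}")
```

In this toy setting the full-basis estimator must estimate all `M * M` coefficients, while the projected estimator keeps only an `r`-dimensional subspace per variable, which is where the variance reduction comes from. The rank `r` is fixed by hand here; the adaptive rank selection mentioned in the summary is not reproduced.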
