Voronoi Density Estimator for High-Dimensional Data: Computation, Compactification and Convergence

Vladislav Polianskii, Giovanni Luca Marchetti, Alexander Kravberg, Anastasiia Varava, Florian T. Pokorny, Danica Kragic

TL;DR

This work introduces the Compactified Voronoi Density Estimator (CVDE), which overcomes the high-dimensional limitations of the classical Voronoi Density Estimator (VDE) by compactifying Voronoi cells with a kernel-based measure. It develops efficient computational procedures based on spherical integration of cell volumes and hit-and-run sampling, with a GPU-accelerated implementation, and proves that the CVDE converges to the true density without requiring the kernel bandwidth to vanish. Empirical results show that the CVDE outperforms the Kernel Density Estimator (KDE) on synthetic and image-like data and remains effective in high dimensions, where the traditional VDE struggles. The approach offers a geometry-adaptive, nonparametric density estimator with favorable convergence properties and practical scalability for complex, high-dimensional data.

Abstract

The Voronoi Density Estimator (VDE) is an established density estimation technique that adapts to the local geometry of data. However, its applicability has so far been limited to problems in two and three dimensions. This is because Voronoi cells rapidly increase in complexity as the dimension grows, making the necessary explicit computations infeasible. We define a variant of the VDE, called the Compactified Voronoi Density Estimator (CVDE), that is suitable for higher dimensions. We propose computationally efficient algorithms for numerical approximation of the CVDE and formally prove convergence of the estimated density to the original one. We implement and empirically validate the CVDE through a comparison with the Kernel Density Estimator (KDE). Our results indicate that the CVDE outperforms the KDE on sound and image data.
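
To make the idea of compactifying cells with a kernel-based measure concrete, here is a minimal one-dimensional sketch. It rests on assumptions that are not stated in this summary: a Gaussian kernel with bandwidth `h`, and an estimate of the form "kernel centered at the generator of the cell containing `x`, divided by `m` times the kernel-weighted volume of that cell", so that every cell carries mass `1/m`. The names `cvde_1d` and `gaussian_cdf` are illustrative only and are not taken from the authors' implementation.

```python
import bisect
import math


def gaussian_cdf(x, mu, h):
    """CDF of a 1D Gaussian with mean mu and standard deviation h."""
    return 0.5 * (1.0 + math.erf((x - mu) / (h * math.sqrt(2.0))))


def cvde_1d(generators, h):
    """Return a function evaluating a 1D compactified Voronoi density sketch.

    Assumption: Gaussian kernel with bandwidth h; generators are distinct.
    """
    p = sorted(float(g) for g in generators)
    m = len(p)
    # In 1D the Voronoi cell of p[i] is the interval between the midpoints
    # to its neighbours; the two outermost cells are unbounded.
    mids = [(a + b) / 2.0 for a, b in zip(p[:-1], p[1:])]
    lo = [-math.inf] + mids
    hi = mids + [math.inf]
    # Kernel-weighted volume of each cell: integral of the Gaussian centered
    # at p[i] over the cell. It is finite even for unbounded cells.
    vol = [gaussian_cdf(hi[i], p[i], h) - gaussian_cdf(lo[i], p[i], h)
           for i in range(m)]

    def density(x):
        i = bisect.bisect_left(mids, x)  # index of the cell containing x
        kernel = math.exp(-0.5 * ((x - p[i]) / h) ** 2) / (h * math.sqrt(2.0 * math.pi))
        return kernel / (m * vol[i])

    return density


# Usage: the estimate should integrate to approximately 1.
f = cvde_1d([-1.0, 0.0, 0.3, 2.5], h=0.5)
step = 0.001
total = sum(f(-20.0 + k * step) for k in range(40001)) * step
```

In one dimension the cells are intervals, so the kernel-weighted volumes have a closed form. In higher dimensions they do not, which is where the numerical procedures mentioned above come in.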

Paper Structure

This paper contains 16 sections, 6 theorems, 23 equations, 9 figures, 2 algorithms.

Key Result

Theorem 4.1

Suppose that $\rho$ has support in the whole $\mathbb{R}^n$. For any $K \in L^1(\mathbb{R}^n \times \mathbb{R}^n)$ the sequence of random probability measures $\mathbb{P}_m$ converges to $\mathbb{P}$ in distribution w.r.t. $x$ and in probability w.r.t. $P$. Namely, for any measurable set $E \subseteq$ ...

Figures (9)

  • Figure 1: Graph of a density estimated by the CVDE, with the Voronoi tessellation underneath.
  • Figure 2: Comparison between VDE and CVDE for generators in the plane. A darker color represents higher estimated density.
  • Figure 3: Voronoi tessellation for generators distributed on a submanifold (a parabola). In this case, all the Voronoi cells are unbounded and the VDE is strongly biased by the choice of the bounding region $A$.
  • Figure 4: An illustration of the directional radius involved in volume estimation and sampling.
  • Figure 5: An illustration of the hit-and-run sampling procedure, with a trajectory of length $I=4$ for each generator. The sampled points are displayed in orange.
  • ...and 4 more figures
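
The directional radius and hit-and-run captions above can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a Gaussian kernel with bandwidth `h`, starts each trajectory at the generator, and uses the hypothetical helper names `chord` and `hit_and_run`. Along a line through the current point, the Voronoi constraints reduce to an interval of admissible step sizes, and the Gaussian kernel restricted to that line is a one-dimensional Gaussian, so each step samples a truncated normal by inverse CDF.

```python
import math
import random
from statistics import NormalDist


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def chord(x, sigma, p, others):
    """Interval [t_min, t_max] of t for which x + t*sigma stays in the cell of p.

    Each other generator q gives the half-space constraint t * a <= b.
    With x == p, t_max is the directional radius in direction sigma.
    """
    t_min, t_max = -math.inf, math.inf
    xp = [a - b for a, b in zip(x, p)]
    d_p = dot(xp, xp)
    for q in others:
        xq = [a - b for a, b in zip(x, q)]
        a = dot(sigma, [qi - pi for qi, pi in zip(q, p)])
        b = 0.5 * (dot(xq, xq) - d_p)  # >= 0 whenever x lies in the cell of p
        if a > 0:
            t_max = min(t_max, b / a)
        elif a < 0:
            t_min = max(t_min, b / a)
    return t_min, t_max


def hit_and_run(p, others, h, steps, rng):
    """Run a hit-and-run trajectory of the given length inside the cell of p."""
    x = list(p)
    for _ in range(steps):
        # Uniform random direction on the unit sphere.
        g = [rng.gauss(0.0, 1.0) for _ in p]
        norm = math.sqrt(dot(g, g))
        sigma = [gi / norm for gi in g]
        t_min, t_max = chord(x, sigma, p, others)
        # Gaussian kernel at p restricted to the line x + t*sigma:
        # a 1D normal in t with mean -<sigma, x - p> and std h.
        c = dot(sigma, [a - b for a, b in zip(x, p)])
        line = NormalDist(mu=-c, sigma=h)
        u = rng.uniform(line.cdf(t_min), line.cdf(t_max))
        u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf in its open domain
        t = min(max(line.inv_cdf(u), t_min), t_max)
        x = [xi + t * si for xi, si in zip(x, sigma)]
    return x


# Usage: trajectories of length 4 (as in the Figure 5 caption) in the plane.
rng = random.Random(0)
p = (0.0, 0.0)
others = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
samples = [hit_and_run(p, others, h=0.5, steps=4, rng=rng) for _ in range(200)]
```

Because the kernel decays, the truncated normal is well defined even when the chord is unbounded, which is the case for cells on the boundary of the data.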

Theorems & Definitions (14)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Theorem 4.1
  • proof
  • Proposition 4.2
  • proof
  • Theorem D.1
  • Proposition D.2
  • Proposition D.3
  • ...and 4 more