Fast Kd-trees for the Kullback--Leibler Divergence and other Decomposable Bregman Divergences

Tuyen Pham, Hubert Wagner

TL;DR

This paper extends the classic Kd-tree data structure to spaces in which distance is measured by a Bregman divergence, proving that correct pruning does not require the triangle inequality. It introduces a practical C++ implementation of exact and guaranteed $(1+\epsilon)$-approximate nearest neighbour queries for decomposable Bregman divergences such as Kullback--Leibler, where the decomposable structure allows the projection divergence used for pruning to be updated in O(1) time. Theoretical results establish ball connectivity and efficient projection updates via the Legendre transform, while benchmarks show a speedup of two orders of magnitude over naive search in practical scenarios in dimension up to 100, and faster queries than specialized baselines such as ball-trees. The work broadens the applicability of computational geometry methods to the non-metric similarity measures used on probability vectors in statistics and machine learning, and highlights open questions about expected query complexity in Bregman geometries.

Abstract

The contributions of the paper span theoretical and implementational results. First, we prove that Kd-trees can be extended to spaces in which the distance is measured with an arbitrary Bregman divergence. Perhaps surprisingly, this shows that the triangle inequality is not necessary for correct pruning in Kd-trees. Second, we offer an efficient algorithm and C++ implementation for nearest neighbour search for decomposable Bregman divergences. The implementation supports the Kullback--Leibler divergence (relative entropy) which is a popular distance between probability vectors and is commonly used in statistics and machine learning. This is a step toward broadening the usage of computational geometry algorithms. Our benchmarks show that our implementation efficiently handles both exact and approximate nearest neighbour queries. Compared to a naive approach, we achieve two orders of magnitude speedup for practical scenarios in dimension up to 100. Our solution is simpler and more efficient than competing methods.
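
To make the term "decomposable" concrete, here is a minimal Python sketch (not the authors' C++ code) of the generalized Kullback--Leibler divergence written as a sum of one-dimensional terms, together with the naive linear-scan search that the benchmarks use as a baseline. The function names `gen_kl` and `naive_nearest` are hypothetical, and the argument order $D(x\,\|\,q)$, with the data point first and the query second, is an assumption rather than something stated in the text above.

```python
import math

def gen_kl(x, q):
    """Generalized KL divergence D(x || q) for vectors with positive entries.

    Decomposable: the result is a sum of independent per-coordinate terms.
    The argument order is an assumption made for this sketch.
    """
    return sum(xi * math.log(xi / qi) - xi + qi for xi, qi in zip(x, q))

def naive_nearest(points, q):
    """Linear scan: return the point with the smallest divergence to q."""
    return min(points, key=lambda p: gen_kl(p, q))

# Three probability vectors on the 2-simplex and one query.
points = [(0.7, 0.2, 0.1), (0.3, 0.4, 0.3), (0.1, 0.1, 0.8)]
query = (0.25, 0.45, 0.3)
nearest = naive_nearest(points, query)
```

The divergence is not symmetric, so `gen_kl(a, b)` and `gen_kl(b, a)` generally differ; this is one reason tools built for metric spaces cannot be applied directly.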

Paper Structure

This paper contains 17 sections, 7 theorems, 6 equations, 5 figures, 6 tables, 1 algorithm.

Key Result

Lemma 1

Given a nonempty closed convex set $C\subset\Omega$ and $q\in \Omega$, denote $q_C = \operatorname{proj}_F(q, C)$. For all $x\in C$: If $C$ is an affine subspace, the above is an equality.
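
For reference, the standard form of this projection property, the generalized Pythagorean inequality for Bregman divergences, can be written as follows. The notation $D_F$ for the divergence generated by $F$ and the convention $\operatorname{proj}_F(q, C) = \arg\min_{c\in C} D_F(c\,\|\,q)$ are assumptions; the paper's own argument order may differ.

$$D_F(x\,\|\,q) \;\ge\; D_F(x\,\|\,q_C) + D_F(q_C\,\|\,q) \qquad \text{for all } x\in C.$$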

Figures (5)

  • Figure 1: Visualization of a Bregman divergence construction for a one-dimensional domain.
  • Figure 2: Left: primal Itakura--Saito balls. Right: primal generalized Kullback--Leibler balls.
  • Figure 3: Left: Calculation of the projection divergence of each point $q_i$ onto the box, decomposed as a sum of divergence computations along individual dimensions. Right: Efficient update of the projection divergence.
  • Figure 4: Three nearest neighbours of a query $q$ with respect to the KL divergence and the Euclidean distance on $\triangle^{2}$. The blue area is the KL ball and the yellow is the Euclidean ball whose radius is determined by the respective third nearest neighbour.
  • Figure 5: Total query time for $(1+\epsilon)$-approximate nearest neighbours for tst$_{1}\to$trn$_{2}$, starting from $\epsilon = 0.1$ (lower is better). Left: KL divergence. Right: IS divergence. Vertical bars mark the speedup of Kd-trees over ball-trees for a given $\epsilon$.
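
The $(1+\epsilon)$-approximate queries in Figure 5 can be understood through a simple pruning test, sketched below in Python. This is an illustration under stated assumptions, not the paper's algorithm: it assumes an answer $p$ counts as $(1+\epsilon)$-approximate when its divergence is at most $(1+\epsilon)$ times that of the true nearest neighbour, and the name `can_prune` is hypothetical.

```python
def can_prune(lower_bound, best, eps=0.0):
    """Decide whether a subtree can be skipped during a query.

    lower_bound: a value no larger than the divergence of any point in the
                 subtree (for example, the divergence to the subtree's box).
    best:        the divergence of the best candidate found so far.
    eps:         approximation parameter; eps = 0 gives exact search.

    If (1 + eps) * lower_bound >= best, every point in the subtree has
    divergence at least best / (1 + eps), so the current candidate is
    already a (1 + eps)-approximate answer relative to that subtree.
    """
    return (1.0 + eps) * lower_bound >= best

# A subtree that exact search must visit but approximate search may skip.
exact_skip = can_prune(1.0, 1.05, eps=0.0)
approx_skip = can_prune(1.0, 1.05, eps=0.1)
```

Larger values of `eps` prune more subtrees, which matches the trend of shorter query times for larger $\epsilon$ that such plots typically show.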

Theorems & Definitions (7)

  • Lemma 1: Bregman Projection [Bregman_Voronoi]
  • Lemma 2: Connectedness
  • Lemma 3: Boundary Intersection
  • Lemma 4
  • Lemma 5: Axis-Aligned Projection
  • Corollary 1: Box Projection Divergence
  • Lemma 6: Updating Projection Divergence in Constant Time
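
The idea behind Lemma 5, Corollary 1 and Lemma 6 can be sketched in Python as follows. This is a minimal illustration under assumptions, not the authors' implementation: it uses the generalized KL term with the box point as the first argument, and the helper names (`kl_term`, `box_projection_divergence`, `update_after_split`) are hypothetical. Because each one-dimensional term grows as the box coordinate moves away from the query coordinate, the projection onto an axis-aligned box is obtained by clamping each coordinate, and splitting the box along one dimension changes only that dimension's term.

```python
import math

def kl_term(x, q):
    """One coordinate of the generalized KL divergence D(x || q); needs x, q > 0."""
    return x * math.log(x / q) - x + q

def clamp(v, lo, hi):
    """Closest value to v inside the interval [lo, hi]."""
    return min(max(v, lo), hi)

def box_projection_divergence(q, lo, hi):
    """Divergence from q to its projection onto the box [lo, hi].

    Returns the total and the list of per-dimension terms, so that a later
    split along one dimension can reuse the other terms.
    """
    terms = [kl_term(clamp(qi, l, h), qi) for qi, l, h in zip(q, lo, hi)]
    return sum(terms), terms

def update_after_split(total, terms, q, k, new_lo_k, new_hi_k):
    """O(1) update when the box shrinks to [new_lo_k, new_hi_k] in dimension k."""
    new_term = kl_term(clamp(q[k], new_lo_k, new_hi_k), q[k])
    return total - terms[k] + new_term, new_term

# The query starts inside the box; a split along dimension 1 leaves it outside.
q = [0.2, 0.5, 0.3]
lo, hi = [0.1, 0.1, 0.1], [0.9, 0.9, 0.9]
total, terms = box_projection_divergence(q, lo, hi)
child_total, child_term = update_after_split(total, terms, q, 1, 0.6, 0.9)
```

The incremental value agrees with a full recomputation on the child box, but costs one term evaluation instead of one per dimension.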