Huber-energy measure quantization

Gabriel Turinici

TL;DR

The results indicate that the HEMQ algorithm is robust and versatile and, for the class of Huber-energy kernels, matches the expected intuitive behavior.

Abstract

We describe a measure quantization procedure, i.e., an algorithm which finds the best approximation of a target probability law (and more generally of a signed measure of finite variation) by a sum of $Q$ Dirac masses ($Q$ being the quantization parameter). The procedure is implemented by minimizing the statistical distance between the original measure and its quantized version; the distance is built from a negative definite kernel and, if necessary, can be computed on the fly and fed to a stochastic optimization algorithm (such as SGD, Adam, ...). We investigate theoretically the fundamental question of the existence of the optimal measure quantizer and identify the kernel properties required to guarantee suitable behavior. We propose two best linear unbiased (BLUE) estimators for the squared statistical distance and use them in an unbiased procedure, called HEMQ, to find the optimal quantization. We test HEMQ on several databases: multi-dimensional Gaussian mixtures, Wiener space cubature, Italian wine cultivars and the MNIST image database. The results indicate that the HEMQ algorithm is robust and versatile and, for the class of Huber-energy kernels, matches the expected intuitive behavior.
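The idea of minimizing a kernel-based squared distance with a stochastic optimizer can be illustrated by a minimal 1D sketch. It is not the HEMQ implementation: it uses a plain minibatch gradient rather than the BLUE estimators of the paper, and the kernel form $h(u)=\sqrt{a^2+u^2}-a$, the normalization of the distance, the function names and all hyperparameters are assumptions made for the illustration.

```python
import math
import random

def h(u, a=0.1):
    """Smoothed absolute value sqrt(a^2 + u^2) - a (assumed kernel form; a = 0 gives |u|)."""
    return math.sqrt(a * a + u * u) - a

def dh(u, a=0.1):
    """Derivative of h with respect to u (taken as 0 at u = 0 when a = 0)."""
    r = math.sqrt(a * a + u * u)
    return u / r if r > 0.0 else 0.0

def squared_distance(points, weights, sample, a=0.1):
    """d^2 between sum_q w_q delta_{x_q} and the uniform empirical law of `sample`.

    Assumed normalization: d^2 = E h(X-Y) - E h(X-X')/2 - E h(Y-Y')/2.
    """
    n = len(sample)
    cross = sum(w * h(x - y, a) for x, w in zip(points, weights) for y in sample) / n
    self_q = sum(wi * wj * h(xi - xj, a)
                 for xi, wi in zip(points, weights)
                 for xj, wj in zip(points, weights))
    self_s = sum(h(y - z, a) for y in sample for z in sample) / (n * n)
    return cross - 0.5 * self_q - 0.5 * self_s

def quantize(sample, points, weights, steps=400, batch=64, lr=0.2, a=0.1, seed=0):
    """Stochastic gradient descent on the point locations; the weights stay fixed."""
    rng = random.Random(seed)
    pts = list(points)
    for _ in range(steps):
        mini = rng.sample(sample, batch)  # minibatch drawn from the target sample
        grads = []
        for xi, wi in zip(pts, weights):
            # Cross term: pulls the point towards the target law.
            attract = sum(dh(xi - y, a) for y in mini) / batch
            # Self-interaction term: pushes the quantization points apart.
            repel = sum(wj * dh(xi - xj, a) for xj, wj in zip(pts, weights))
            grads.append(wi * (attract - repel))
        pts = [x - lr * g for x, g in zip(pts, grads)]
    return pts

rng = random.Random(1)
target = [rng.gauss(0.0, 1.0) for _ in range(500)]  # samples of a standard normal
start = [2.0, 2.5, 3.0]                             # deliberately poor initial guess
w = [1.0 / 3.0] * 3                                 # Q = 3 points, uniform weights
end = quantize(target, start, w)
d2_start = squared_distance(start, w, target)
d2_end = squared_distance(end, w, target)
print(d2_start, d2_end)
```

The term of the distance that involves only the target does not depend on the quantization points, so it is left out of the gradient and only enters `squared_distance`.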

Paper Structure (41 sections, 15 theorems, 69 equations, 8 figures, 1 table, 1 algorithm)

This paper contains 41 sections, 15 theorems, 69 equations, 8 figures, 1 table, 1 algorithm.

Introduction
Motivation
Relationship with the literature
Vector quantization
Kernel vector quantization and "neural gas"
Existence of the optimal measure quantifier
Notations
Measure coercivity
Existence of the optimal quantizer for unbounded kernels
Existence of the measure quantization for the Gaussian kernel
Statistical consistency and the best linear unbiased estimator (BLUE) of the squared distance
Further theoretical results
Mean distance, decay rate
Uniqueness of the weights
Exact solution in 1D for the 'energy' kernel
...and 26 more sections

Key Result

Lemma 5

A function $h$, associated to a positive definite kernel $k$ by eq:relation_h_function_of_k, which satisfies assumption eq:hyp_lowerbound_h and such that $\lim_{x\to \infty} h(x) = \infty$ is measure coercive in the sense of Definition def:measurecoercive. In particular kernels $h^{HE}_{r,a}$ are me

Figures (8)

Figure 1: Illustration of Proposition prop:1D_quantiles. The points shown are the quantiles required for the quantization of a 1D law with $Q=3$ points.
Figure 2: Graphical representation of $f(x,y) = d\left(\frac{\delta_x+\delta_y}{2},\nu\right)^2$. In both cases $f$ is not a convex function. Left: for $\nu_0=\delta_0$ we have $f(x,y)=\frac{2|x|+2|y|-|x-y|}{4}$. Right: for $\nu_\pm=\frac{\delta_{-1}+\delta_{1}}{2}$ we have $f(x,y)=\frac{|x+1|+|x-1|+|y+1|+|y-1|-|x-y|-2}{4}$.
Figure 3: Quantization of the bivariate standard normal distribution ($N=2$) with $Q=10$ (upper left panel), $Q=17$ (upper and lower right panels) and $Q=500$ (lower left panel) points using the evolution in eq:exponential_decay_ode_normal. As expected from the theoretical insights, exponential decay of the distance is obtained. Total simulation time was set to $T=1.75$ for $Q=10$ and $T=1.5$ for $Q=17$. In all cases interesting natural structures appear automatically: when the normal is quantized with $Q=10$ points we observe two concentric rings, one consisting of $3$ points and the other of $7$ points. When $Q=17$, three such rings appear, of $2$, $7$ and $8$ points respectively. In general these structures may not be unique, as illustrated in the bottom panel where the decomposition is different. We also plot the convergence of the squared distance.
Figure 4: Quantization of a high-dimensional normal distribution. Left panel: $N=11, Q=5$; center panel: $N=64, Q=200$. In the right panel we plot a true Brownian simulation (sampling an $N$-dimensional Gaussian). See the GitHub repository gabriel_measure_compression_2022 for the implementation.
Figure 5: Quantization for the "Italian wines" benchmark wines_benchmark using the 'energy' kernel. Each data point has $13$ dimensions, each of which was standardized. We plot a projection on the first two dimensions. The original data points are shown as solid circles (colored according to their attributed class), the K-means points as solid squares and the $Q=3$ quantization points as triangles. The $\alpha$ parameters are given in the title; note that $\alpha$ is not supposed to correspond to the class distribution. A Python implementation is available in the GitHub repository gabriel_measure_compression_2022.
...and 3 more figures
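
The left-panel expression of Figure 2 can be checked numerically with a short sketch. It assumes the 'energy' kernel $h(u)=|u|$ and the normalization $d(\mu,\nu)^2 = E|X-Y| - \frac{1}{2}E|X-X'| - \frac{1}{2}E|Y-Y'|$, which is inferred from that expression rather than quoted from the paper; the helper names are hypothetical. The last lines exhibit one pair of points where $f$ violates the midpoint convexity inequality.

```python
def energy_d2(atoms_mu, atoms_nu):
    """Squared distance between two uniform discrete laws on the real line.

    Assumed normalization: d^2 = E|X-Y| - E|X-X'|/2 - E|Y-Y'|/2.
    """
    def mean_abs(p, q):
        return sum(abs(s - t) for s in p for t in q) / (len(p) * len(q))
    return (mean_abs(atoms_mu, atoms_nu)
            - 0.5 * mean_abs(atoms_mu, atoms_mu)
            - 0.5 * mean_abs(atoms_nu, atoms_nu))

def f_left(x, y):
    """Closed form from the left panel of Figure 2 (target nu_0 = delta_0)."""
    return (2 * abs(x) + 2 * abs(y) - abs(x - y)) / 4

# The closed form agrees with the direct computation at a few test points.
checks = [(2.0, 1.0), (1.0, 2.0), (-0.5, 3.0), (0.0, 0.0)]
max_gap = max(abs(energy_d2([x, y], [0.0]) - f_left(x, y)) for x, y in checks)

# Non-convexity: the value at the midpoint of (2, 1) and (1, 2) exceeds
# the average of the two endpoint values.
mid = f_left(1.5, 1.5)
avg = 0.5 * (f_left(2.0, 1.0) + f_left(1.0, 2.0))
print(max_gap, mid, avg)
```

The non-convexity comes from the $-|x-y|$ term, which has a ridge along the diagonal $x=y$.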

Theorems & Definitions (63)

Remark 1: Convention
Remark 2
Definition 3: measure coercivity
Remark 4
Lemma 5
proof
Lemma 6
proof
Lemma 7
proof
...and 53 more
