Huber-energy measure quantization

Gabriel Turinici

TL;DR

The results indicate that the HEMQ algorithm is robust and versatile and, for the class of Huber-energy kernels, matches the expected intuitive behavior.

Abstract

We describe a measure quantization procedure, i.e., an algorithm that finds the best approximation of a target probability law (and, more generally, of a signed finite variation measure) by a sum of $Q$ Dirac masses ($Q$ being the quantization parameter). The procedure is implemented by minimizing the statistical distance between the original measure and its quantized version; the distance is built from a negative definite kernel and, if necessary, can be computed on the fly and fed to a stochastic optimization algorithm (such as SGD, Adam, ...). We investigate theoretically the fundamental question of the existence of the optimal measure quantizer and identify the kernel properties required to guarantee suitable behavior. We propose two best linear unbiased (BLUE) estimators for the squared statistical distance and use them in an unbiased procedure, called HEMQ, to find the optimal quantization. We test HEMQ on several databases: multi-dimensional Gaussian mixtures, Wiener space cubature, Italian wine cultivars and the MNIST image database. The results indicate that the HEMQ algorithm is robust and versatile and, for the class of Huber-energy kernels, matches the expected intuitive behavior.
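The minimization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's HEMQ procedure or its BLUE estimators: it is plain minibatch SGD on a plug-in squared distance, with uniform weights on the $Q$ Dirac masses. The kernel form $h(x,y) = (a^2 + |x-y|^2)^{r/2} - a^r$, the function names (`huber_energy`, `dist2`, `grad_dist2`, `quantize`) and all default parameters are assumptions made for illustration.

```python
import numpy as np

def huber_energy(diff, a=1e-2, r=1.0):
    # Assumed kernel form: h(x, y) = (a^2 + |x - y|^2)^(r/2) - a^r
    sq = np.sum(diff ** 2, axis=-1)
    return (a ** 2 + sq) ** (r / 2) - a ** r

def dist2(X, Y, a=1e-2, r=1.0):
    # Plug-in squared distance between the uniform measures on the rows of X and Y:
    # E h(X, Y) - 0.5 * E h(X, X') - 0.5 * E h(Y, Y')
    cross = huber_energy(X[:, None, :] - Y[None, :, :], a, r).mean()
    xx = huber_energy(X[:, None, :] - X[None, :, :], a, r).mean()
    yy = huber_energy(Y[:, None, :] - Y[None, :, :], a, r).mean()
    return cross - 0.5 * xx - 0.5 * yy

def _mean_kernel_grad(X, Z, a, r):
    # For each row x of X: average over the rows z of Z of grad_x h(x, z)
    diff = X[:, None, :] - Z[None, :, :]
    sq = np.sum(diff ** 2, axis=-1)
    w = r * (a ** 2 + sq) ** (r / 2 - 1)
    return (w[..., None] * diff).mean(axis=1)

def grad_dist2(X, Y, a=1e-2, r=1.0):
    # Gradient of dist2 with respect to the quantization points X
    Q = X.shape[0]
    return (_mean_kernel_grad(X, Y, a, r) - _mean_kernel_grad(X, X, a, r)) / Q

def quantize(sample_target, X0, steps=400, batch=64, lr=0.1, a=1e-2, r=1.0, seed=0):
    # Plain SGD: each step draws a fresh minibatch from the target law
    rng = np.random.default_rng(seed)
    X = np.array(X0, dtype=float)
    Q = X.shape[0]
    for _ in range(steps):
        Y = sample_target(rng, batch)
        X -= lr * Q * grad_dist2(X, Y, a, r)  # rescale by Q: the gradient is O(1/Q)
    return X
```

For example, `quantize(lambda g, n: g.normal(size=(n, 2)), X0)` moves an initial `(Q, 2)` array `X0` towards a quantization of the bivariate standard normal. Only the cross term depends on the minibatch, and it is an average over samples, so the minibatch gradient is an unbiased estimate of the gradient of the squared distance to the target law.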

Paper Structure

This paper contains 41 sections, 15 theorems, 69 equations, 8 figures, 1 table, 1 algorithm.

Key Result

Lemma 5

A function $h$, associated to a positive definite kernel $k$ by \ref{eq:relation_h_function_of_k}, which satisfies assumption \ref{eq:hyp_lowerbound_h} and such that $\lim_{x\to \infty} h(x) = \infty$ is measure coercive in the sense of Definition \ref{def:measurecoercive}. In particular kernels $h^{HE}_{r,a}$ are me

Figures (8)

  • Figure 1: Illustration of Proposition \ref{prop:1D_quantiles}. The points shown are the quantiles required for the quantization of a 1D law with $Q=3$ points.
  • Figure 2: Graphical representation of $f(x,y) = d\left(\frac{\delta_x+\delta_y}{2},\nu\right)^2$. In both cases $f$ is not a convex function. Left: for $\nu_0=\delta_0$ we have $f(x,y)=\frac{2|x|+2|y|-|x-y|}{4}$. Right: for $\nu_\pm=\frac{\delta_{-1}+\delta_{1}}{2}$ we have $f(x,y)=\frac{|x+1|+|x-1|+|y+1|+|y-1|-|x-y|-2}{4}$, which vanishes at $(x,y)=(-1,1)$, where the two measures coincide.
  • Figure 3: Quantization of the bivariate standard normal distribution ($N=2$) with $Q=10$ (upper left panel), $Q=17$ (upper and lower right panels) and $Q=500$ (lower left panel) points using the evolution in \ref{eq:exponential_decay_ode_normal}. As expected from the theoretical insights, the distance decays exponentially. Total simulation time was set to $T=1.75$ for $Q=10$ and $T=1.5$ for $Q=17$. In all cases interesting natural structures appear automatically: when the normal is quantized with $Q=10$ points we observe two concentric rings, one consisting of $3$ points and the other of $7$ points. When $Q=17$, three such rings appear, with $2$, $7$ and $8$ points respectively. In general these structures may not be unique, as illustrated in the bottom panel where the decomposition is different. We also plot the convergence of the squared distance.
  • Figure 4: Quantization of a high-dimensional normal distribution. Left panel: $N=11, Q=5$; center panel: $N=64, Q=200$. In the right panel we plot a true Brownian simulation (sampling an $N$-dimensional Gaussian). See the GitHub repository gabriel_measure_compression_2022 for the implementation.
  • Figure 5: Quantization for the "Italian wines" benchmark wines_benchmark using the 'energy' kernel. Each data point has $13$ dimensions, and each dimension was standardized. We plot a projection on the first two dimensions. The original data points are shown as solid circles (colored according to their attributed class), the K-means points as solid squares and the $Q=3$ quantization points as triangles. The $\alpha$ parameters are given in the title; note that $\alpha$ is not supposed to correspond to the class distribution. A Python implementation is available in the GitHub repository gabriel_measure_compression_2022.
  • ...and 3 more figures
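
The closed form quoted in the Figure 2 caption for $\nu_0=\delta_0$ can be checked numerically with a short sketch. It assumes the 'energy' kernel $|x-y|$ and the convention $d(\mu,\nu)^2 = -\frac{1}{2}\iint |x-y|\, d(\mu-\nu)(x)\, d(\mu-\nu)(y)$, which is the convention that reproduces the left-panel formula; the helper names `energy_dist2` and `f` are hypothetical.

```python
import numpy as np

def energy_dist2(p_pts, p_w, q_pts, q_w):
    # Squared energy distance between two discrete 1D measures, assuming
    # d^2 = -1/2 * sum_{i,j} w_i w_j |z_i - z_j| with signed weights of p - q
    pts = np.concatenate([np.asarray(p_pts, float), np.asarray(q_pts, float)])
    w = np.concatenate([np.asarray(p_w, float), -np.asarray(q_w, float)])
    H = np.abs(pts[:, None] - pts[None, :])
    return -0.5 * w @ H @ w

def f(x, y, nu_pts, nu_w):
    # f(x, y) = d((delta_x + delta_y) / 2, nu)^2 as in the Figure 2 caption
    return energy_dist2([x, y], [0.5, 0.5], nu_pts, nu_w)

def f_closed_nu0(x, y):
    # Closed form from the caption for nu_0 = delta_0
    return (2 * abs(x) + 2 * abs(y) - abs(x - y)) / 4
```

Under these assumptions, `f(x, y, [0.0], [1.0])` agrees with `f_closed_nu0(x, y)`, and `f(-1.0, 1.0, [-1.0, 1.0], [0.5, 0.5])` is zero because the two measures coincide. Non-convexity for $\nu_0$ is visible on a single segment: the value at the midpoint $(0.5, 0.5)$ exceeds the average of the values at $(1, 0)$ and $(0, 1)$.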

Theorems & Definitions (63)

  • Remark 1: Convention
  • Remark 2
  • Definition 3: measure coercivity
  • Remark 4
  • Lemma 5
  • proof
  • Lemma 6
  • proof
  • Lemma 7
  • proof
  • ...and 53 more