The Lattice Geometry of Neural Network Quantization -- A Short Equivalence Proof of GPTQ and Babai's Algorithm

Johann Birnick

TL;DR

We prove that the GPTQ algorithm is equivalent to Babai's well-known nearest-plane algorithm, and we provide geometric intuition for both algorithms.

Abstract

We explain how data-driven quantization of a linear unit in a neural network corresponds to solving the closest vector problem for a certain lattice generated by input data. We prove that the GPTQ algorithm is equivalent to Babai's well-known nearest-plane algorithm. We furthermore provide geometric intuition for both algorithms. Lastly, we note the consequences of these results, in particular hinting at the possibility of using lattice basis reduction for improved quantization.

Paper Structure

This paper contains 20 sections, 3 theorems, 14 equations, 3 figures.

Key Result

Theorem 2.1

The procedures GPTQ and Babai are equivalent. That is, for any $X \in \mathbb{R}^{k \times n}$ (of full column rank) and $w \in \mathbb{R}^n$, they produce the same output $v \in \mathbb{Z}^n$.
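The equivalence can be checked numerically on small random instances. The following is a minimal sketch under stated assumptions, not the paper's own pseudocode: the function names `gptq_round` and `babai_nearest_plane` are hypothetical, the quantization grid is taken to be plain $\mathbb{Z}^n$ with round-to-nearest, coordinates are processed in the order $1, \dots, n$, and the GPTQ weight update is written as an explicit least-squares solve rather than the Cholesky-based form used in practical implementations.

```python
import numpy as np

def gptq_round(X, w):
    """GPTQ-style sequential rounding in weight space R^n (illustrative sketch)."""
    w = np.asarray(w, dtype=float).copy()
    n = w.shape[0]
    v = np.zeros(n, dtype=np.int64)
    for i in range(n):
        v[i] = int(np.rint(w[i]))      # fix coordinate i on the integer grid
        err = w[i] - v[i]              # rounding error of this coordinate
        if i + 1 < n:
            rest = X[:, i + 1:]
            # Shift the remaining weights so that rest @ w[i+1:] absorbs as much
            # of the output error err * X[:, i] as possible (least squares).
            delta, *_ = np.linalg.lstsq(rest, err * X[:, i], rcond=None)
            w[i + 1:] += delta
    return v

def babai_nearest_plane(X, w):
    """Babai-style nearest-plane rounding in output space R^k (illustrative sketch)."""
    n = X.shape[1]
    t = X @ np.asarray(w, dtype=float)  # target point t = Xw in R^k
    v = np.zeros(n, dtype=np.int64)
    for i in range(n):
        xi = X[:, i]
        if i + 1 < n:
            rest = X[:, i + 1:]
            # Component of X_i orthogonal to the span of the remaining columns.
            coef, *_ = np.linalg.lstsq(rest, xi, rcond=None)
            xi_star = xi - rest @ coef
        else:
            xi_star = xi
        # Index of the nearest plane parallel to span(rest).
        c = np.rint(t @ xi_star / (xi_star @ xi_star))
        v[i] = int(c)
        t = t - c * xi                  # move the target into the sublattice's plane
    return v

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))             # k = 8 samples, n = 4 weights
w = 3.0 * rng.normal(size=4)
print(gptq_round(X, w), babai_nearest_plane(X, w))
```

In exact arithmetic the two functions return the same $v$. In floating point they could disagree only when a coordinate lands essentially on a half-integer, which random inputs almost never do.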

Figures (3)

  • Figure 1: Two spaces at play. $X$ embeds $\mathbb{R}^n$ into $\mathbb{R}^k$, mapping the quantization grid $\mathbb{Z}^n$ to a lattice in $\mathbb{R}^k$. GPTQ works in $\mathbb{R}^n$, on the left. Babai's algorithm works in $\mathbb{R}^k$, on the right.
  • Figure 2: GPTQ fixes $v_1 := \mathop{\mathrm{round}}\nolimits(w_1)$. This restricts $v$ to lie on the orange plane. It defines a new target weight $w'$ on the orange line, and then proceeds recursively. The target weight $w'$ is not just an orthogonal projection to the orange plane. Instead, the update step implicitly uses the geometry from the lattice; in $\mathbb{R}^k$ on the right it indeed corresponds to an orthogonal projection.
  • Figure 3: Babai's algorithm looks for the nearest plane (parallel to the orange or green plane) to $t$. It identifies the orange plane, and then subtracts an appropriate integer multiple of $X_1$ from $t$, leading to the green point $t'$. It then recursively looks for a lattice point in the green sublattice which is close to the green point $t'$.
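The link between the two pictures can be made explicit for the first step. The notation here is an assumption introduced for illustration and is not taken from the paper: $X_r$ denotes the matrix of the remaining columns $X_2, \dots, X_n$, $w_r$ the corresponding weights, $P_r$ the orthogonal projection onto the column span of $X_r$, and $X_1^{*} = X_1 - P_r X_1$. With $t = Xw$, the two steps can be written as

$$
\text{GPTQ:}\quad v_1 = \mathop{\mathrm{round}}\nolimits(w_1), \qquad w' = w_r + (X_r^{\top} X_r)^{-1} X_r^{\top} X_1 \,(w_1 - v_1),
$$

$$
\text{Babai:}\quad v_1 = \mathop{\mathrm{round}}\nolimits\!\left(\frac{\langle t, X_1^{*}\rangle}{\langle X_1^{*}, X_1^{*}\rangle}\right), \qquad t' = t - v_1 X_1 .
$$

Since $\langle X_i, X_1^{*}\rangle = 0$ for $i \geq 2$ and $\langle X_1, X_1^{*}\rangle = \langle X_1^{*}, X_1^{*}\rangle$, the quotient in Babai's step equals $w_1$, so both procedures pick the same $v_1$. Moreover,

$$
X_r w' = X_r w_r + P_r X_1 (w_1 - v_1) = P_r\,(t - v_1 X_1) = P_r\, t',
$$

so the GPTQ update, which is not an orthogonal projection in $\mathbb{R}^n$, corresponds to orthogonally projecting Babai's new target $t'$ onto the span of the remaining lattice vectors in $\mathbb{R}^k$.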

Theorems & Definitions (5)

  • Remark
  • Theorem 2.1
  • Proof
  • Theorem 3.1 (Babai, 1986)
  • Theorem 3.2 (Babai, 1986)