High-Dimensional Learning Dynamics of Quantized Models with Straight-Through Estimator

Yuma Ichikawa, Shuhei Kashiwamura, Ayaka Sakata

TL;DR

This work develops a high-dimensional continuum theory for training models whose weights and inputs are jointly quantized, using the straight-through estimator (STE). By mapping microscopic parameter updates to a stochastic process and macroscopic states to a deterministic ODE, it reveals a characteristic two-phase learning trajectory (an extended plateau followed by a sharp drop in generalization error) that is modulated by quantization hyperparameters such as bit width and range. The authors provide a fixed-point and stability analysis, derive the explicit degradation relative to the unquantized baseline, and extend the framework to nonlinear transformations of weights and inputs. The findings highlight that quantization can act as an implicit regularizer and influence training stability, with practical implications for layer-wise post-training quantization and the design of quantization schedules in deep networks.

Abstract

Quantized neural network training optimizes a discrete, non-differentiable objective. The straight-through estimator (STE) enables backpropagation through surrogate gradients and is widely used. While previous studies have primarily focused on the properties of surrogate gradients and their convergence, the influence of quantization hyperparameters, such as bit width and quantization range, on learning dynamics remains largely unexplored. We theoretically show that in the high-dimensional limit, STE dynamics converge to a deterministic ordinary differential equation. This reveals that STE training exhibits a plateau followed by a sharp drop in generalization error, with plateau length depending on the quantization range. A fixed-point analysis quantifies the asymptotic deviation from the unquantized linear model. We also extend analytical techniques for stochastic gradient descent to nonlinear transformations of weights and inputs.
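To make the setting concrete, the following is a minimal sketch of one STE update for a linear student with quantized weights and inputs. It is an illustration under stated assumptions, not the authors' implementation: the symmetric uniform quantizer, the $1/\sqrt{d}$ scaling, the squared loss, and the names `quantize` and `ste_sgd_step` are all hypothetical choices made here. Only the roles of the bit width $b$ (`bits`) and the quantization range $\omega$ (`omega`) are taken from the abstract.

```python
import numpy as np


def quantize(v, bits, omega):
    """Assumed uniform quantizer: clip to [-omega, omega], then snap each
    entry to the nearest of 2**bits evenly spaced levels."""
    n_levels = 2 ** bits
    step = 2.0 * omega / (n_levels - 1)
    clipped = np.clip(v, -omega, omega)
    return np.round((clipped + omega) / step) * step - omega


def ste_sgd_step(w, x, y, lr, bits, omega):
    """One online SGD step on the squared loss of a linear student whose
    weights and inputs are both quantized (hypothetical scaling)."""
    d = w.shape[0]
    w_q = quantize(w, bits, omega)  # forward pass uses quantized weights
    x_q = quantize(x, bits, omega)  # and quantized inputs
    residual = w_q @ x_q / np.sqrt(d) - y
    # STE: the quantizer's true derivative is zero almost everywhere, so the
    # backward pass treats it as the identity and applies the gradient with
    # respect to w_q directly to the latent full-precision weights w.
    grad = residual * x_q / np.sqrt(d)
    return w - lr * grad


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 500
    w_star = np.ones(d)            # hypothetical teacher weights
    w = rng.standard_normal(d)     # latent student weights
    for _ in range(10 * d):
        x = rng.standard_normal(d)
        y = w_star @ x / np.sqrt(d)
        w = ste_sgd_step(w, x, y, lr=0.05, bits=3, omega=1.0)
    print("latent-weight MSE to teacher:", np.mean((w - w_star) ** 2))
```

The latent weights `w` stay in full precision throughout; only the forward pass sees quantized values, which is why `bits` and `omega` can shape the trajectory even though the update rule itself looks like plain SGD.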

Paper Structure

This paper contains 53 sections, 29 theorems, 194 equations, 8 figures.

Key Result

Theorem IV.3

Under Assumption asm:learning-dynamics, for any finite $T>0$, the empirical measure $\mu_{t}^{(d)}$ converges weakly to a process $\mu_{\tau}$, which is the law of the solution to the stochastic differential equation: where $(\hat{w}^{\ast}, \hat{w}_{0}) \sim \mu_{0}$; $B_{t}$ is the standard Brownian motion; $\varepsilon_{g}(\tau)$ defined by where $q_{\psi}(\tau) = {\mathbb E}_{\mu_{\tau}}[\ps

Figures (8)

  • Figure 1: Comparison between STE simulations (red histograms) and the PDE prediction (blue curves) for the probability density $\mu_{\tau}(\hat{w}|w^{\ast}=1)$ at training horizons $\tau\in\{10,25,50,100\}$. Simulations use $d=3{,}000$, ${\bm w}^{\ast}={\bm 1}$, and $\mu_{0}$ the standard Gaussian. Both the inputs and weights are quantized with $\omega=2$ and bit-width $b=2$.
  • Figure 2: Generalization error $\varepsilon_g$ as a function of training time $\tau$ for bit widths $b\in\{2,3,4,5\}$, with $\eta=0.04$, $\lambda=1$, $\omega=1.0$, and $d=900$. Symbols with error bars indicate STE simulations averaged over five independent runs; solid curves indicate the ODE prediction.
  • Figure 3: Generalization error $\varepsilon_{g}$ as a function of training time $\tau$ at fixed bit-width $b=3$, for quantization ranges $\omega \in \{0.25, 0.50, 1.00, 1.25, 1.50\}$. Symbols with error bars denote STE simulations averaged over five runs; solid curves denote the ODE prediction.
  • Figure 4: Joint quantization of weights and inputs with range $\omega=1.0$ and $b \in \{3, 4\}$. Generalization error $\varepsilon_{g}$ as a function of training time $\tau$ at $T = 0$, $d=500$, $\eta=0.05$, $\lambda=1.0$ for input bit widths $b_x \in \{3,4,5\}$, alongside the unquantized-input case ("No quant."). Symbols with error bars denote STE simulations averaged over five runs; solid curves show the ODE prediction. Top panel: $b=3$; bottom panel: $b=4$.
  • Figure 5: Long-time behavior under input-only quantization with $\lambda=0$ and $\sigma^{2}=0$. Left: Stability boundary at $2/\sigma_{\psi}^{2}$; for $\eta>2/\sigma_{\psi}^{2}$, no asymptotically stable fixed point exists. The dashed line indicates the unquantized baseline. Right: Steady-state generalization error $\varepsilon_{g}^{\ast}$.
  • ...and 3 more figures
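
The step-size threshold $2/\sigma_{\psi}^{2}$ quoted for Figure 5 can be probed numerically in a deliberately simplified setting. The sketch below is not the paper's experiment. It assumes that $\sigma_{\psi}^{2}$ is the second moment of a quantized standard-Gaussian input coordinate, that the teacher sees the same quantized inputs as the student (so zero error is reachable), and that predictions and gradients carry a $1/\sqrt{d}$ scaling. The quantizer and the names `second_moment` and `run_online_sgd` are hypothetical. In this toy model the squared weight error changes per unit $\tau$ at a rate of roughly $-(2\eta\sigma_{\psi}^{2} - \eta^{2}\sigma_{\psi}^{4})$, which is negative only for $\eta < 2/\sigma_{\psi}^{2}$.

```python
import numpy as np


def quantize(v, bits, omega):
    """Assumed uniform quantizer with 2**bits levels on [-omega, omega]."""
    step = 2.0 * omega / (2 ** bits - 1)
    return np.round((np.clip(v, -omega, omega) + omega) / step) * step - omega


def second_moment(bits, omega, n=200_000, seed=1):
    """Monte Carlo estimate of E[psi(x)^2] for x ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    return float(np.mean(quantize(rng.standard_normal(n), bits, omega) ** 2))


def run_online_sgd(eta, d=200, tau_max=10.0, bits=2, omega=2.0, seed=0):
    """Online SGD with quantized inputs only; returns final / initial
    squared weight error after tau_max * d steps."""
    rng = np.random.default_rng(seed)
    w_star = np.ones(d)   # hypothetical teacher
    w = np.zeros(d)       # full-precision student weights
    err0 = np.sum((w - w_star) ** 2)
    for _ in range(int(tau_max * d)):
        u = quantize(rng.standard_normal(d), bits, omega)
        residual = (w - w_star) @ u / np.sqrt(d)  # noiseless, realizable
        w -= eta * residual * u / np.sqrt(d)
    return float(np.sum((w - w_star) ** 2) / err0)


if __name__ == "__main__":
    sigma2 = second_moment(bits=2, omega=2.0)
    eta_c = 2.0 / sigma2  # candidate stability boundary
    print("estimated sigma_psi^2:", sigma2)
    print("error ratio at 0.5 * eta_c:", run_online_sgd(0.5 * eta_c))
    print("error ratio at 1.5 * eta_c:", run_online_sgd(1.5 * eta_c))
```

Running the script with a learning rate below the candidate boundary should shrink the error ratio, while a rate above it should make the ratio grow. The paper's own setting may differ in the teacher, the scaling, and the definition of $\sigma_{\psi}^{2}$, so this check is only a sanity probe of the threshold's form.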

Theorems & Definitions (57)

  • Definition IV.1
  • Theorem IV.3
  • Proposition V.2
  • Theorem V.3
  • Proposition VI.1
  • Definition VI.2
  • Theorem VI.3: Informal
  • Lemma I.1
  • proof
  • Lemma I.2
  • ...and 47 more