Oscillations Make Neural Networks Robust to Quantization

Jonathan Wenshøj, Bob Pepin, Raghavendra Selvan

TL;DR

The paper reframes weight oscillations during QAT from a detrimental side-effect to a core mechanism enabling quantization robustness. Through a univariate toy model, it reveals an oscillation-driving gradient tied to pushing weights toward quantization thresholds and introduces an oscillation-inducing regularizer that replicates this behavior in neural networks. Empirical results on CIFAR-10 and Tiny ImageNet with ResNet-18 and Tiny ViT show that oscillations induced during training can recover QAT-level accuracy at 3–4 bits and offer strong cross-bit robustness, sometimes outperforming QAT in unseen bit-widths. These findings provide a deeper understanding of QAT dynamics and suggest oscillations as a constructive tool for efficient, robust quantization across varying bit widths and quantizers.

Abstract

We challenge the prevailing view that weight oscillations observed during Quantization Aware Training (QAT) are merely undesirable side-effects and argue instead that they are an essential part of QAT. We show in a univariate linear model that QAT results in an additional loss term that causes oscillations by pushing weights away from their nearest quantization level. Based on the mechanism from the analysis, we then derive a regularizer that induces oscillations in the weights of neural networks during training. Our empirical results on ResNet-18 and Tiny Vision Transformer, evaluated on CIFAR-10 and Tiny ImageNet datasets, demonstrate across a range of quantization levels that training with oscillations followed by post-training quantization (PTQ) is sufficient to recover the performance of QAT in most cases. With this work we provide further insight into the dynamics of QAT and contribute a novel insight into explaining the role of oscillations in QAT which until now have been considered to have a primarily negative effect on quantization.

Paper Structure

This paper contains 35 sections, 22 equations, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Oscillatory behavior during QAT in a one-weight linear model $\hat{y} = q(w)x$, with weight $w$, quantized weight $q(w)$, input $x = 1$ and target $y = 0.75$. The gradient of the loss of this model during QAT can be decomposed into two terms: $\nabla L(w)$ and $\nabla \delta_L = q(w)-w$, where the latter term is what differentiates QAT from just optimizing the full-precision loss $L(w)$. During QAT, $\nabla L(w)$ always points towards $y$, while $\nabla \delta_L$ introduces a dynamic which pushes $w$ towards the nearest bin threshold. This causes $w$ to oscillate when $y$ is not exactly on a quantization level. In this case, $q(w)$ alternates between 0 and 1. Note that the frequency with which $q(w)$ switches between the two levels makes the quantized weight converge to 0.75 on average.
  • Figure 2: Weight distribution analysis of ResNet-18's first convolutional layer after 50 epochs of training from scratch. a) Weight distribution under QAT with a 3-bit quantizer. b)-d) Our proposed regularization approach with a 3-bit quantizer at varying regularization strengths ($\lambda=0, 1, 10$, from left to right). When $\lambda=0$, the training reduces to standard optimization. The QAT distribution (leftmost) exhibits the characteristic threshold clustering behavior. As $\lambda$ increases, we observe progressively stronger clustering of weights around quantization thresholds, illustrating the relationship between regularization strength and weight clustering.
  • Figure 3: Distribution of weight oscillations. The plots show the distribution of weights with oscillation counts $>0$ when training with a) QAT and b)-d) the regularizer for different values of $\lambda$. Here $\lambda = 0$ corresponds to a full-precision model where the regularizer has no influence on training. The y-axis represents the percentage of total weights in the first convolutional layer of a ResNet-18 trained from scratch for 50 epochs, while the x-axis shows the oscillation count after 50 epochs. Following the oscillation definition from Nagel et al. (2022), we count oscillations at each epoch during training. The results demonstrate that QAT produces a significantly higher proportion of oscillating weights compared to $\lambda=0$. Furthermore, as $\lambda$ increases, a greater percentage of weights oscillate.
  • Figure 4: We repeat the toy model experiments, this time with two weights, taking into account that the linear term in the gradient is no longer zero. At epochs 15 and 18, where the prediction of the quantized model is greater than $y$, the effect of the terms flips for $w_2$.
  • Figure 5: Mean over 3 runs of the best test accuracy for different values of $\lambda$ when fine-tuning a pretrained ResNet-18 on CIFAR-10 for 50 epochs. The quantizer is 3-bit, the learning rate is $10^{-3}$, and 100% of the training data is used.
  • ...and 5 more figures
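
The one-weight toy model described in the Figure 1 caption can be reproduced in a few lines. The following is a minimal sketch, not the authors' code: it assumes a round-to-nearest integer quantizer, a squared-error loss $\frac{1}{2}(q(w)x - y)^2$ trained with a straight-through estimator, and illustrative values for the initial weight (`w0 = 0.4`), learning rate (`lr = 0.1`) and number of steps, none of which are given in the caption. The function names `quantize` and `simulate_toy_qat` are hypothetical.

```python
import math


def quantize(w):
    """Round-to-nearest integer quantizer (assumed); the 0/1 threshold is 0.5."""
    return math.floor(w + 0.5)


def simulate_toy_qat(w0=0.4, x=1.0, y=0.75, lr=0.1, steps=1000):
    """Gradient descent on 0.5 * (q(w) * x - y)**2 using a straight-through estimator."""
    w, history = w0, []
    for _ in range(steps):
        q = quantize(w)
        grad_fp = (w * x - y) * x      # gradient of the full-precision loss L(w)
        grad_delta = (q - w) * x * x   # extra QAT term; equals q(w) - w when x = 1
        # The two terms sum to the straight-through gradient (q * x - y) * x.
        w -= lr * (grad_fp + grad_delta)
        history.append((w, q))         # q is the level used in this step
    return history


history = simulate_toy_qat()
levels = [q for _, q in history]
mean_q = sum(levels) / len(levels)
print(sorted(set(levels)), round(mean_q, 2))
```

Under these assumptions the latent weight hovers around the threshold at 0.5: each step with $q(w) = 0$ moves $w$ up by `lr * 0.75`, and each step with $q(w) = 1$ moves it down by `lr * 0.25`, so $q(w)$ spends roughly three times as many steps at 1 as at 0 and its time average comes out close to the target 0.75.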