QGen: On the Ability to Generalize in Quantization Aware Training

MohammadHossein AskariHemmat, Ahmadreza Jeddi, Reyhane Askari Hemmat, Ivan Lazarevich, Alexander Hoffman, Sudhakar Sah, Ehsan Saboori, Yvon Savaria, Jean-Pierre David

TL;DR

The paper studies how quantization affects the generalization of neural networks, not just their accuracy. It models quantization as additive noise and derives a regularization term that scales with the quantization bin width. It shows that quantization can drive training toward flatter minima, as measured by magnitude-aware sharpness and PAC-Bayes metrics, and provides an approximate bound on the generalization gap conditioned on the noise level. Experiments on CIFAR-10, CIFAR-100, and ImageNet with CNNs and Vision Transformers (over 2000 models) show that quantized models often have smaller generalization gaps: 8-bit quantization closely matches full precision in generalization, while lower bit widths can regularize more strongly at the cost of higher training loss. The results suggest that quantization-aware training can improve robustness to distorted data and enable efficient deployment without a substantial loss in generalization, and they can guide the choice of quantization level in practice.

Abstract

Quantization lowers memory usage, computational requirements, and latency by utilizing fewer bits to represent model weights and activations. In this work, we investigate the generalization properties of quantized neural networks, a characteristic that has received little attention despite its implications for model performance. First, we develop a theoretical model for quantization in neural networks and demonstrate how quantization functions as a form of regularization. Second, motivated by recent work connecting the sharpness of the loss landscape and generalization, we derive an approximate bound for the generalization of quantized models conditioned on the amount of quantization noise. We then validate our hypothesis by experimenting with over 2000 convolutional and transformer-based models trained on the CIFAR-10, CIFAR-100, and ImageNet datasets.

Paper Structure (24 sections, 17 equations, 2 figures, 6 tables)

Figures (2)

  • Figure 1: Width of the quantization bin for the weight tensor of each layer in ResNet-50, ResNet-18, and MobileNet V2 trained on the ImageNet dataset. We used different quantization levels and in all cases, the induced quantization noise is significantly higher when a lower bit resolution value is used.
  • Figure 2: Visualization of the loss landscape for the full precision and quantized ResNet-18 models trained on CIFAR-10. Referring to Table \ref{Tab:flatness_measures}, it is observed that quantized models possess flatter minima, which contributes to their enhanced generalization capabilities.
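
The relationship described in the Figure 1 caption (a lower bit resolution gives a wider quantization bin and therefore more quantization noise) can be illustrated with a small sketch. It assumes a plain min-max uniform quantizer with round-to-nearest, which is not necessarily the quantizer used in the paper, and a synthetic Gaussian "weight tensor" in place of a real layer. The function names `bin_width` and `quantize` are hypothetical and chosen only for this illustration.

```python
import random


def bin_width(weights, bits):
    """Step size of a uniform grid with 2**bits levels spanning [min, max]."""
    return (max(weights) - min(weights)) / (2 ** bits - 1)


def quantize(weights, bits):
    """Round every weight to the nearest point of the uniform grid."""
    delta = bin_width(weights, bits)
    w_min = min(weights)
    return [w_min + round((w - w_min) / delta) * delta for w in weights]


# Synthetic stand-in for one layer's weights (an assumption, not real model data).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(10_000)]

widths = {}
max_noise = {}
for bits in (8, 4, 2):
    quantized = quantize(weights, bits)
    # Quantization noise is the difference between quantized and original weights.
    noise = [q - w for q, w in zip(quantized, weights)]
    widths[bits] = bin_width(weights, bits)
    max_noise[bits] = max(abs(n) for n in noise)
    print(f"{bits}-bit: bin width = {widths[bits]:.6f}, "
          f"max |noise| = {max_noise[bits]:.6f}")
```

Under these assumptions the noise magnitude is bounded by half the bin width, so the bin width acts as a proxy for the noise level induced at each bit resolution.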