Up or Down? Adaptive Rounding for Post-Training Quantization

Markus Nagel; Rana Ali Amjad; Mart van Baalen; Christos Louizos; Tijmen Blankevoort

TL;DR

This work addresses the limitations of rounding-to-nearest in post-training weight quantization. Using a second-order Taylor series expansion of the task loss, the authors pose per-layer weight rounding as a quadratic unconstrained binary optimization (QUBO) problem. To make it practical, they apply simplifying assumptions to the Hessian, reduce the problem to layer-wise local losses, and solve a continuous relaxation (AdaRound) with a rectified sigmoid and a regularizer that pushes each rounding decision toward 0 or 1. An asymmetric reconstruction formulation accounts for quantization error introduced by preceding layers. The method is data-efficient, requiring only a small amount of unlabeled data, and substantially improves accuracy over rounding-to-nearest for 4-bit weight quantization on networks such as ResNet-18/50, InceptionV3, MobileNetV2, and DeepLabV3+; for ResNet-18 and ResNet-50 it stays within 1% of full-precision accuracy without fine-tuning. Empirically, AdaRound outperforms bias correction and other post-training quantization (PTQ) methods on ImageNet classification and semantic segmentation benchmarks, and it is robust to the amount and domain of the calibration data. Overall, AdaRound offers a principled way to deploy low-bit quantized networks without re-training.
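The continuous relaxation mentioned above can be illustrated with a minimal NumPy sketch. It assumes the rectified sigmoid is a stretched, clipped sigmoid and that the regularizer has the form `sum(1 - |2h - 1|^beta)`. The stretch constants `ZETA` and `GAMMA`, the function names, and the omission of clipping to the integer grid are assumptions for illustration and are not stated in this summary.

```python
import numpy as np

# Stretch parameters of the rectified sigmoid (assumed values for illustration).
ZETA, GAMMA = 1.1, -0.1


def rectified_sigmoid(v):
    """Map a continuous variable v to a soft rounding decision in [0, 1]."""
    sigmoid = 1.0 / (1.0 + np.exp(-v))
    return np.clip(sigmoid * (ZETA - GAMMA) + GAMMA, 0.0, 1.0)


def rounding_regularizer(v, beta):
    """Penalty that is 0 when every decision is exactly 0 or 1, largest at 0.5."""
    h = rectified_sigmoid(v)
    return np.sum(1.0 - np.abs(2.0 * h - 1.0) ** beta)


def soft_quantize(w, v, scale):
    """Soft-quantized weights: floor(w / scale) plus the soft up/down decision.

    Clipping to the representable integer range is omitted for brevity.
    """
    return scale * (np.floor(w / scale) + rectified_sigmoid(v))
```

In a sketch like this, `v` would be optimized per weight against a layer-wise reconstruction loss plus the regularizer. Once each `rectified_sigmoid(v)` has settled at 0 or 1, the soft-quantized weights coincide with a hard round-down or round-up choice.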

Abstract

When quantizing neural networks, assigning each floating-point weight to its nearest fixed-point value is the predominant approach. We find that, perhaps surprisingly, this is not the best we can do. In this paper, we propose AdaRound, a better weight-rounding mechanism for post-training quantization that adapts to the data and the task loss. AdaRound is fast, does not require fine-tuning of the network, and only uses a small amount of unlabelled data. We start by theoretically analyzing the rounding problem for a pre-trained neural network. By approximating the task loss with a Taylor series expansion, the rounding task is posed as a quadratic unconstrained binary optimization problem. We simplify this to a layer-wise local loss and propose to optimize this loss with a soft relaxation. AdaRound not only outperforms rounding-to-nearest by a significant margin but also establishes a new state-of-the-art for post-training quantization on several networks and tasks. Without fine-tuning, we can quantize the weights of Resnet18 and Resnet50 to 4 bits while staying within an accuracy loss of 1%.
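To make the claim that rounding-to-nearest is not optimal concrete, here is a small toy sketch. It enumerates every round-down/round-up choice for a four-weight layer and compares the squared error of the layer output against rounding-to-nearest. The data, the function names, and the use of plain output reconstruction error as the objective are assumptions for illustration; this is not the authors' algorithm.

```python
import itertools

import numpy as np


def best_rounding(w, x, scale):
    """Try every floor/ceil choice for w; return the one with the smallest
    squared error on the layer output x @ w (brute force, exponential in w.size)."""
    w_floor = np.floor(w / scale)
    target = x @ w
    best_err, best_wq = np.inf, None
    for bits in itertools.product([0.0, 1.0], repeat=w.size):
        w_q = scale * (w_floor + np.array(bits))
        err = np.sum((target - x @ w_q) ** 2)
        if err < best_err:
            best_err, best_wq = err, w_q
    return best_wq, best_err


def nearest_rounding_error(w, x, scale):
    """Squared output error when every weight is rounded to its nearest grid point."""
    w_q = scale * np.round(w / scale)
    return np.sum((x @ w - x @ w_q) ** 2)


# Toy data, made up for illustration: four equal weights and one all-ones input.
w = np.array([0.24, 0.24, 0.24, 0.24])
x = np.ones((1, 4))
scale = 0.1

nearest_err = nearest_rounding_error(w, x, scale)
best_wq, best_err = best_rounding(w, x, scale)
```

Here rounding-to-nearest sends every weight down to 0.2, so the per-weight errors accumulate in the output. Rounding two of the four weights up instead lets the errors partly cancel and gives a smaller output error. Exhaustive search is only feasible for a handful of weights, which is why a relaxation is needed at realistic layer sizes.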
