Towards Accurate Post-Training Quantization of Vision Transformers via Error Reduction

Yunshan Zhong; You Huang; Jiawei Hu; Yuxin Zhang; Rongrong Ji

TL;DR

Vision Transformers suffer pronounced quantization errors under post-training quantization because of complex weight–activation interactions. ERQ addresses this with a two-step strategy. Aqer reduces activation quantization error via Reparameterization Initialization and a Ridge Regression correction. Wqer then reduces weight quantization error by applying Dual Uniform Quantization, followed by an iterative loop of Rounding Refinement and another Ridge Regression. Empirical results across ImageNet, COCO, and DIV2K show that ERQ consistently outperforms state-of-the-art PTQ methods (notably GPTQ), with substantial gains at low bit-widths and favorable runtime, while preserving near-full-precision performance at higher bit-widths. The approach is data-efficient, fast, and generalizes across ViT variants and downstream tasks; code is available for reproducibility.

Abstract

Post-training quantization (PTQ) for vision transformers (ViTs) has received increasing attention from both academic and industrial communities due to its minimal data needs and high time efficiency. However, many current methods fail to account for the complex interactions between quantized weights and activations, resulting in significant quantization errors and suboptimal performance. This paper presents ERQ, a two-step PTQ method designed to reduce the quantization errors arising from activation and weight quantization sequentially. The first step, Activation quantization error reduction (Aqer), begins with Reparameterization Initialization, which mitigates the initial quantization errors in high-variance activations. It then reduces the remaining errors by formulating a Ridge Regression problem whose closed-form solution updates the weights, which are kept at full precision during this step. The second step, Weight quantization error reduction (Wqer), begins with Dual Uniform Quantization, which handles the numerous weight outliers introduced by the adjustments made during Reparameterization Initialization and thereby reduces the initial weight quantization errors. It then reduces the remaining errors iteratively: each iteration applies Rounding Refinement, which uses an empirically derived, efficient proxy to refine the rounding directions of the quantized weights, followed by a Ridge Regression solver. Comprehensive experiments demonstrate ERQ's superior performance across various ViT variants and tasks. For example, ERQ surpasses the state-of-the-art GPTQ by a notable 36.81% in accuracy for W3A4 ViT-S. Our code is available at https://github.com/zysxmu/ERQ.
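To make the Ridge Regression idea in Aqer concrete, here is a minimal NumPy sketch under stated assumptions; it is not the authors' implementation. It assumes a single linear layer with weights `W` and calibration activations `X`, a simple per-tensor min-max quantizer, and the objective ||(W + ΔW) X̄ − W X||² + λ||ΔW||², whose minimizer is ΔW = −W ΔX X̄ᵀ (X̄ X̄ᵀ + λI)⁻¹ with ΔX = X̄ − X. The function names, the regularization value `lam`, and the tensor shapes are all hypothetical choices for illustration.

```python
import numpy as np

def uniform_quantize(x, n_bits):
    """Per-tensor min-max uniform quantizer (quantize, then dequantize).
    Assumed here for illustration; the paper's quantizers may differ."""
    levels = 2 ** n_bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = np.clip(np.round((x - lo) / scale), 0, levels)
    return q * scale + lo

def ridge_weight_update(W, X, X_q, lam=1e-2):
    """Closed-form ridge update of full-precision weights W (out x in) so that
    W_new @ X_q approximates W @ X. X and X_q have shape (in x samples)."""
    dX = X_q - X                                   # activation quantization error
    A = X_q @ X_q.T + lam * np.eye(X.shape[0])     # symmetric positive definite
    B = W @ dX @ X_q.T
    # Solve dW @ A = -B; since A is symmetric, dW = -(A^{-1} B^T)^T.
    dW = -np.linalg.solve(A, B.T).T
    return W + dW

# Hypothetical calibration data: 16 input channels, 256 samples, 8 outputs.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
X = rng.normal(size=(16, 256))
X_q = uniform_quantize(X, n_bits=4)

W_new = ridge_weight_update(W, X, X_q)
err_before = np.mean((W @ X_q - W @ X) ** 2)
err_after = np.mean((W_new @ X_q - W @ X) ** 2)
print(f"output MSE before: {err_before:.6f}, after: {err_after:.6f}")
```

Because ΔW = 0 is a feasible point of the regularized objective, the output error after the update cannot exceed the error before it on the calibration data; how much it helps on real ViT activations depends on the data and on `lam`.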

Paper Structure (33 sections, 1 theorem, 25 equations, 11 figures, 16 tables, 2 algorithms)

Introduction
Related Work
Vision Transformers (ViTs)
Post-training Quantization for ViTs
Preliminaries
Quantizers
Objective
Method
Activation Quantization Error Reduction
Reparameterization Initialization
Ridge Regression
Weight Quantization Error Reduction
Dual Uniform Quantization
Rounding Refinement
Ridge Regression
...and 18 more sections

Key Result

Proposition 1

Given the definition of outliers as specified in Eq. \ref{eq:identify-outlier}, the greedy selection algorithm achieves the maximal coverage rate of outliers in $\mathbf{W}$.
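The intuition behind a coverage claim of this kind can be illustrated with a small sketch. The outlier definition referenced in the proposition is not reproduced in this summary, so the code below assumes, purely for illustration, that an outlier is any weight whose magnitude exceeds a fixed `threshold`, that a "channel" is a row of `W`, and that a budget of `k` channels is selected. The function name and all parameters are hypothetical. Under these assumptions the channels partition the outliers into disjoint groups, so picking the `k` rows with the most outliers maximizes the fraction covered.

```python
import numpy as np

def select_outlier_channels(W, threshold, k):
    """Greedily pick the k rows of W containing the most outliers.
    Outliers are assumed to be entries with |w| > threshold (illustrative only)."""
    outlier_mask = np.abs(W) > threshold
    counts = outlier_mask.sum(axis=1)              # outliers per row
    order = np.argsort(-counts, kind="stable")     # rows sorted by descending count
    chosen = np.sort(order[:k])
    total = outlier_mask.sum()
    coverage = counts[chosen].sum() / total if total > 0 else 1.0
    return chosen, coverage

W_demo = np.array([
    [0.1, 3.0, -2.5],
    [0.2, 0.1,  0.0],
    [4.0, 0.1,  0.1],
    [0.1, 0.2,  0.3],
])
chosen, coverage = select_outlier_channels(W_demo, threshold=1.0, k=2)
print(chosen.tolist(), coverage)  # rows 0 and 2 hold all three outliers
```

With `k=1` the same call would select only row 0 and cover two of the three outliers, which is the best any single row can do in this toy matrix.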

Figures (11)

Figure 1: Framework of the proposed ERQ. ERQ consists of two steps to reduce the quantization errors from activation and weight quantization, respectively. The first step, Activation quantization error reduction (Aqer), includes Reparameterization Initialization and Ridge Regression. The second step, Weight quantization error reduction (Wqer), includes Dual Uniform Quantization, Rounding Refinement, and Ridge Regression.
Figure 2: Example of channel distribution of activations after block.8.norm2 of DeiT-S. Results are extracted with 32 images.
Figure 3: Comparison of MSE between using and not using Reparameterization Initialization. The MSE is evaluated by Eq. \ref{eq:obj-act}. "RepI" indicates Reparameterization Initialization. Results are derived from DeiT-S with 32 images. Activations are quantized to 4-bit.
Figure 4: Heatmap of absolute weight values: (a) Before and (b) After Reparameterization Initialization. Weights are extracted from blocks.8.mlp.fc1 of DeiT-S. For better visualization, elements with absolute values less than 0.2 have been set to zero.
Figure 5: Comparison of MSE between using and not using Dual Uniform Quantization. The MSE is evaluated by Eq. \ref{eq:obj-weight0}. "Dual" indicates Dual Uniform Quantization. Results are derived from W4A4 DeiT-S with 32 images.
...and 6 more figures

Theorems & Definitions (2)

Proposition 1
proof
