I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization

Yunshan Zhong, Jiawei Hu, Mingbao Lin, Mengzhao Chen, Rongrong Ji

TL;DR

I&S-ViT tackles post-training quantization (PTQ) for Vision Transformers (ViTs) by pinpointing two core issues: the quantization inefficiency of the log2 quantizer for post-Softmax activations, and the rugged loss landscape caused by coarse-grained quantization of post-LayerNorm activations. It resolves these with a Shift-Uniform-Log2 Quantizer (SULQ) and a three-stage Smooth Optimization Strategy (SOS), which together give inclusive domain coverage and stable learning through a staged transition from channel-wise to layer-wise quantization. Empirical results on ImageNet and COCO show substantial improvements at ultra-low bit-widths (e.g., 3-bit ViT-B improves by about 50.7 percentage points on ImageNet) with competitive runtime, establishing I&S-ViT as a strong, scalable PTQ baseline for ViTs. The work provides practical, hardware-friendly quantization tools and a principled optimization curriculum that reduces accuracy loss while preserving the efficiency advantages of PTQ.
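The two quantization granularities mentioned above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the helper names `uniform_fake_quantize` and `make_activations`, the min/max calibration, and the synthetic activations with strongly varying per-channel ranges are all assumptions chosen only to show why a single layer-wise scale is coarser than per-channel scales.

```python
import numpy as np


def uniform_fake_quantize(x, bits, axis=None):
    """Asymmetric uniform quantize-dequantize (hypothetical helper).

    axis=None -> one scale/offset for the whole tensor (layer-wise).
    axis=0    -> one scale/offset per column, i.e. per channel (channel-wise).
    """
    qmax = 2 ** bits - 1
    x_min = x.min(axis=axis, keepdims=True)
    x_max = x.max(axis=axis, keepdims=True)
    # Guard against a zero range so the division below is always defined.
    scale = np.maximum(x_max - x_min, 1e-8) / qmax
    q = np.clip(np.round((x - x_min) / scale), 0, qmax)
    return q * scale + x_min


def make_activations(seed=0, tokens=64):
    """Synthetic (tokens, channels) activations; channel ranges differ widely.

    The per-channel standard deviations are made up for illustration.
    """
    rng = np.random.default_rng(seed)
    channel_std = np.array([0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0])
    return rng.standard_normal((tokens, channel_std.size)) * channel_std


if __name__ == "__main__":
    x = make_activations()
    layer_wise = uniform_fake_quantize(x, bits=4)            # shared scale
    channel_wise = uniform_fake_quantize(x, bits=4, axis=0)  # per-channel scale
    print("layer-wise MSE:  ", np.mean((x - layer_wise) ** 2))
    print("channel-wise MSE:", np.mean((x - channel_wise) ** 2))
```

In this toy setting the channel-wise variant reconstructs the input more faithfully because each channel gets its own grid, while the layer-wise variant needs only one scale and offset per tensor. The staged transition described above starts from the finer granularity and ends at the coarser one.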

Abstract

Despite the scalable performance of vision transformers (ViTs), their dense computational costs (training and inference) undermine their position in industrial applications. Post-training quantization (PTQ), which tunes ViTs with a tiny dataset and runs them in a low-bit format, addresses the cost issue well but suffers larger performance drops at lower bit-widths. In this paper, we introduce I&S-ViT, a novel method that regulates the PTQ of ViTs in an inclusive and stable fashion. I&S-ViT first identifies two issues in the PTQ of ViTs: (1) quantization inefficiency in the prevalent log2 quantizer for post-Softmax activations; (2) a rugged and magnified loss landscape under coarse-grained quantization granularity for post-LayerNorm activations. I&S-ViT then addresses these issues by introducing: (1) a novel shift-uniform-log2 quantizer (SULQ) that incorporates a shift mechanism followed by uniform quantization to achieve both an inclusive domain representation and an accurate distribution approximation; (2) a three-stage smooth optimization strategy (SOS) that combines the strengths of channel-wise and layer-wise quantization to enable stable learning. Comprehensive evaluations across diverse vision tasks validate I&S-ViT's superiority over existing PTQ methods for ViTs, particularly in low-bit scenarios. For instance, I&S-ViT elevates the performance of 3-bit ViT-B by an impressive 50.68%.
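As a rough illustration of the quantizer described in the abstract, the sketch below contrasts a plain log2 quantizer with one plausible reading of "a shift mechanism followed by uniform quantization": add a shift `eta` to the input, move to the log2 domain, quantize uniformly there, then invert both steps. The function names, the min/max calibration, and the value of `eta` are assumptions for illustration, not the paper's reference implementation.

```python
import numpy as np


def log2_quantize(x, bits):
    """Plain log2 quantize-dequantize for inputs in (0, 1].

    Levels are 2**0, 2**-1, ..., 2**-(2**bits - 1).
    """
    qmax = 2 ** bits - 1
    x = np.maximum(x, 1e-12)  # avoid log2(0)
    q = np.clip(np.round(-np.log2(x)), 0, qmax)
    return 2.0 ** (-q)


def shift_uniform_log2_quantize(x, bits, eta):
    """Sketch of a shift-uniform-log2 quantize-dequantize (assumed form).

    1. Shift the input by eta and map it to the log2 domain.
    2. Uniformly quantize the log-domain values over their observed range.
    3. Map back with 2**(-y) and remove the shift.
    """
    qmax = 2 ** bits - 1
    y = -np.log2(x + eta)
    y_min, y_max = y.min(), y.max()
    scale = max(y_max - y_min, 1e-8) / qmax
    q = np.clip(np.round((y - y_min) / scale), 0, qmax)
    y_hat = q * scale + y_min
    return 2.0 ** (-y_hat) - eta


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = softmax(rng.standard_normal((4, 16)))  # stand-in post-Softmax values
    eta = 1.0 / 64                             # arbitrary illustrative shift
    print("log2 levels used:", np.unique(log2_quantize(p, 3)).size)
    print("SULQ levels used:", np.unique(shift_uniform_log2_quantize(p, 3, eta)).size)
```

Because the uniform step operates on the observed log-domain range, the largest input maps to level 0 and the smallest to the last level, so the full input domain is covered by the available levels. How `eta` is chosen is left open here.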

Paper Structure

This paper contains 19 sections, 14 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Illustration of (a) the quantization inefficiency issue of the 3/4-bit log2 quantizers and (b) the quantization process of the 3/4-bit shift-uniform-log2 quantizers.
  • Figure 2: Illustration of the 3-bit quantization functions of (a) the log2 quantizer and the uniform quantizer and (b) the shift-uniform-log2 quantizer.
  • Figure 3: Loss landscapes for the 4-bit DeiT-S in transformer block 10. We perturb the weights along two basis vectors (Perturbation 1 & 2) to visualize the loss landscape. (a) Channel-wise weight quantization & layer-wise activation quantization. (b) Full-precision weights & layer-wise activation quantization. (c) Full-precision weights & channel-wise activation quantization.
  • Figure 4: The accuracy vs. runtime of PTQ methods on 3-bit DeiT.
  • Figure 5: The accuracy vs. image number on 3-bit DeiT.