Outlier-Aware Training for Low-Bit Quantization of Structural Re-Parameterized Networks
Muqun Niu, Yuan Ren, Boyu Li, Chenchen Ding
TL;DR
The paper tackles the quantization difficulty of structural re-parameterized (SR) networks, specifically RepVGG, which is caused by weight outliers that arise when the training-time branches are merged for inference. It introduces Outlier-Aware Batch Normalization (OABN) to suppress these outliers during training, and ClusterQAT, a clustering-based Quantization-Aware Training method, to preserve weight distributions under low-bit quantization. Experimental results on CIFAR-10 and ImageNet show that combining OABN with ClusterQAT significantly improves accuracy at 8 bits and below, making RepVGG viable under tight memory constraints. The approach is a practical, training-time route to edge-ready quantization of SR networks without substantial architecture changes.
Abstract
Lightweight design of Convolutional Neural Networks (CNNs) requires the co-design of model architectures and compression techniques. As a novel design paradigm that separates training and inference, a structural re-parameterized (SR) network such as the representative RepVGG revitalizes the simple VGG-like network, reaching accuracy comparable to advanced and often more complicated networks. However, the merging process in SR networks introduces outliers into the weights, making their distribution distinct from that of conventional networks and thus making quantization harder. To address this, we propose an operator-level improvement for training called Outlier-Aware Batch Normalization (OABN). Additionally, to meet the demands of limited bitwidths while maintaining inference accuracy, we develop a clustering-based non-uniform quantization framework for Quantization-Aware Training (QAT) named ClusterQAT. By integrating OABN with ClusterQAT, the quantized performance of RepVGG is substantially improved, particularly when the bitwidth falls below 8.
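To make the outlier issue concrete, the following is a minimal sketch of standard convolution/BatchNorm folding, which is one ingredient of the merging step in SR networks. It is not the paper's code: the function name `fuse_conv_bn`, the tensor shapes, and all numeric values are illustrative assumptions. It shows that a channel whose running variance is very small receives a large folding scale, so its merged weights can end up far larger than those of the other channels.

```python
import numpy as np

def fuse_conv_bn(weight, gamma, beta, running_mean, running_var, eps=1e-5):
    """Fold BatchNorm parameters into a preceding bias-free convolution.

    weight has shape (out_channels, in_channels, kh, kw); the BatchNorm
    arrays have shape (out_channels,).
    """
    scale = gamma / np.sqrt(running_var + eps)          # per-output-channel scale
    fused_weight = weight * scale[:, None, None, None]  # broadcast over (in, kh, kw)
    fused_bias = beta - running_mean * scale
    return fused_weight, fused_bias

# Illustrative values only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
weight = rng.normal(0.0, 0.05, size=(4, 3, 3, 3))
gamma = np.ones(4)
beta = np.zeros(4)
running_mean = np.zeros(4)
running_var = np.array([1.0, 1.0, 1.0, 1e-4])  # last channel has a tiny variance

fused_weight, fused_bias = fuse_conv_bn(weight, gamma, beta, running_mean, running_var)

# Largest absolute merged weight in each output channel.
per_channel_max = np.abs(fused_weight).reshape(4, -1).max(axis=1)
print(per_channel_max)
```

In this toy setting the last channel is scaled by roughly 1/sqrt(1e-4 + 1e-5), about 95, while the others are scaled by about 1. A uniform quantizer whose range must cover that one channel leaves few levels for the remaining weights, which is the kind of difficulty the abstract attributes to merging.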
