MetaMix: Meta-state Precision Searcher for Mixed-precision Activation Quantization
Han-Byul Kim, Joo Hyung Lee, Sungjoo Yoo, Hong-Seok Kim
TL;DR
MetaMix addresses the activation instability that arises during mixed-precision activation quantization by introducing a meta-state precision searcher that couples bit-width exploration with weight training. The method alternates between bit-meta training, which builds a stable meta-state across multiple bit-width branches, and bit-search training, which learns per-layer bit-width probabilities on a fixed meta-state; a weight fine-tuning phase follows. Empirically, MetaMix achieves state-of-the-art accuracy-cost trade-offs on ImageNet for MobileNet-v2, MobileNet-v3, and ResNet-18, outperforming both mixed- and single-precision SOTA methods with faster bit-width search than NAS-based approaches. The approach stabilizes activation statistics, reduces training instability, and offers a practical path to efficient, high-accuracy quantized networks for edge and mobile deployments.
Abstract
Mixed-precision quantization of efficient networks often suffers from activation instability encountered during the exploration of bit selections. To address this problem, we propose a novel method called MetaMix, which consists of a bit selection phase and a weight training phase. The bit selection phase iterates two steps: (1) the mixed-precision-aware weight update, and (2) the bit-search training with the fixed mixed-precision-aware weights. Together, these steps reduce activation instability in mixed-precision quantization and contribute to fast and high-quality bit selection. The weight training phase takes the weights and step sizes trained in the bit selection phase and fine-tunes them, thereby offering fast training. Our experiments with efficient and hard-to-quantize networks, i.e., MobileNet v2 and v3, and ResNet-18 on ImageNet show that our proposed method pushes the boundary of mixed-precision quantization, in terms of accuracy vs. operations, by outperforming both mixed- and single-precision SOTA methods.
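The two-phase structure described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. It assumes a plain uniform unsigned activation quantizer, one learnable logit per candidate bit-width in each layer, a softmax over those logits to obtain bit-width probabilities, and a strict one-to-one alternation of the two bit-selection steps. All function names (`quantize_activation`, `mixed_branch_output`, `select_bits`, `phase_schedule`) and the phase labels are hypothetical and chosen for illustration only.

```python
import math


def quantize_activation(x, bits, step):
    """Uniform unsigned quantization (an assumed quantizer, not taken from the paper).

    The value is rounded to the nearest multiple of `step` and the integer
    level is clamped to [0, 2**bits - 1].
    """
    max_level = 2 ** bits - 1
    level = min(max(round(x / step), 0), max_level)
    return level * step


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_branch_output(x, candidate_bits, steps, logits):
    """Probability-weighted sum over bit-width branches for one activation.

    Each branch quantizes `x` with its own bit-width and step size; the
    branch weights are the softmax of the per-layer logits (assumption).
    """
    probs = softmax(logits)
    return sum(
        p * quantize_activation(x, b, s)
        for p, b, s in zip(probs, candidate_bits, steps)
    )


def select_bits(candidate_bits, logits):
    """After the search, keep the bit-width with the largest logit."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return candidate_bits[best]


def phase_schedule(num_selection_rounds, num_finetune_steps):
    """Hypothetical schedule: alternate the two bit-selection steps, then fine-tune."""
    schedule = []
    for _ in range(num_selection_rounds):
        schedule.append("weight_update")  # step (1): train weights, bit logits frozen
        schedule.append("bit_search")     # step (2): train bit logits, weights frozen
    schedule.extend(["fine_tune"] * num_finetune_steps)
    return schedule


if __name__ == "__main__":
    bits = [8, 4, 2]
    steps = [0.01, 0.1, 0.3]
    logits = [0.1, 2.0, -1.0]
    print(mixed_branch_output(0.57, bits, steps, logits))
    print(select_bits(bits, logits))
    print(phase_schedule(2, 1))
```

In a real training loop the logits, weights, and step sizes would be updated by gradient descent inside each phase; the sketch only shows how the branches are combined and how the phases are ordered under the assumptions stated above.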
