DynaQuant: Dynamic Mixed-Precision Quantization for Learned Image Compression
Youneng Bao, Yulong Cheng, Yiping Liu, Yichen Yang, Peng Qin, Mu Li, Yongsheng Liang
TL;DR
DynaQuant tackles the inefficiency of static bit-widths in Learned Image Compression (LIC) by introducing two intertwined dynamics: content-aware quantization and a data-driven dynamic bit-width selector. It employs per-channel learnable quantization parameters and a distance-aware gradient modulator to provide informative learning signals, while a differentiable bit-width selector assigns layer-wise bit-widths based on input statistics; both are jointly optimized under a rate-distortion (R-D) objective. The end-to-end framework achieves R-D performance close to full-precision models while delivering up to roughly $5\times$ speedups and substantially smaller model sizes, enabling practical LIC deployment on diverse hardware. By combining input-aware and layer-aware adaptation, this work offers a practical path toward real-time, resource-constrained image compression.
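To make the idea of per-channel quantization parameters concrete, the following is a minimal sketch of uniform affine "fake" quantization with one scale and one offset per channel. It is an illustration under stated assumptions, not the DynaQuant implementation: the function names are hypothetical, the scale and offset are initialized from each channel's min/max as a stand-in for the learned values, and the distance-aware gradient modulator is not modeled because this summary does not define it.

```python
from typing import List, Sequence, Tuple


def init_channel_params(values: Sequence[float], bits: int) -> Tuple[float, float]:
    """Min/max initialization of (scale, offset) for one channel.

    Assumption: in a learned scheme these two numbers would be trainable;
    here they are simply derived from the data for illustration.
    """
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (2 ** bits - 1) or 1.0  # avoid a zero scale on constant channels
    return scale, lo


def fake_quantize_channel(
    values: Sequence[float], scale: float, offset: float, bits: int
) -> List[float]:
    """Quantize to integers in [0, 2^bits - 1], then map back to floats."""
    qmax = 2 ** bits - 1
    out = []
    for v in values:
        q = round((v - offset) / scale)   # nearest integer level
        q = max(0, min(qmax, q))          # clamp to the representable range
        out.append(q * scale + offset)    # dequantize
    return out


def quantize_per_channel(tensor: Sequence[Sequence[float]], bits: int) -> List[List[float]]:
    """Apply fake quantization with separate parameters for each channel."""
    result = []
    for channel in tensor:
        scale, offset = init_channel_params(channel, bits)
        result.append(fake_quantize_channel(channel, scale, offset, bits))
    return result


def mean_abs_error(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> float:
    """Mean absolute difference between two channel-major nested lists."""
    diffs = [abs(x - y) for ca, cb in zip(a, b) for x, y in zip(ca, cb)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Two channels with very different dynamic ranges.
    x = [[-1.0, -0.5, 0.0, 0.25, 1.0], [0.0, 10.0, 20.0, 35.0, 100.0]]
    for bits in (4, 8):
        err = mean_abs_error(x, quantize_per_channel(x, bits))
        print(f"{bits}-bit mean abs error: {err:.5f}")
```

Because each channel has its own scale and offset, the narrow-range channel is not forced onto the coarse grid that the wide-range channel would need, which is the motivation for per-channel rather than per-tensor parameters.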
Abstract
Prevailing quantization techniques in Learned Image Compression (LIC) typically employ a static, uniform bit-width across all layers, failing to adapt to the highly diverse data distributions and sensitivity characteristics inherent in LIC models. This leads to a suboptimal trade-off between performance and efficiency. In this paper, we introduce DynaQuant, a novel framework for dynamic mixed-precision quantization that operates on two complementary levels. First, we propose content-aware quantization, where learnable scaling and offset parameters dynamically adapt to the statistical variations of latent features. This fine-grained adaptation is trained end-to-end using a novel Distance-aware Gradient Modulator (DGM), which provides a more informative learning signal than the standard Straight-Through Estimator. Second, we introduce a data-driven, dynamic bit-width selector that learns to assign an optimal bit precision to each layer, dynamically reconfiguring the network's precision profile based on the input data. Our fully dynamic approach offers substantial flexibility in balancing rate-distortion (R-D) performance and computational cost. Experiments demonstrate that DynaQuant achieves R-D performance comparable to full-precision models while significantly reducing computational and storage requirements, thereby enabling the practical deployment of advanced LIC on diverse hardware platforms.
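The abstract does not specify how the bit-width selector is parameterized, so the following is only a sketch of one common way to make a discrete per-layer choice trainable: a temperature-scaled softmax over candidate bit-widths, driven by simple statistics of the layer input. Everything here is an assumption for illustration, including the candidate set `CANDIDATE_BITS`, the choice of mean and standard deviation as input statistics, and the hand-set `weights` and `biases`, which would be learned in practice.

```python
import math
from typing import List, Sequence, Tuple

CANDIDATE_BITS = (4, 6, 8)  # hypothetical candidate bit-widths


def input_statistics(x: Sequence[float]) -> List[float]:
    """Summarize a layer input by its mean and standard deviation (an assumed choice)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    return [mean, math.sqrt(var)]


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with a temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def select_bits(
    x: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
    temperature: float = 1.0,
) -> Tuple[List[float], float, int]:
    """Return (probabilities, soft bit-width, hard bit-width) for one layer.

    The soft value is a probability-weighted average, which is differentiable
    with respect to the logits; the hard value is the argmax candidate that
    would be used at inference time.
    """
    stats = input_statistics(x)
    logits = [sum(w * s for w, s in zip(row, stats)) + b for row, b in zip(weights, biases)]
    probs = softmax(logits, temperature)
    soft_bits = sum(p * b for p, b in zip(probs, CANDIDATE_BITS))
    hard_bits = CANDIDATE_BITS[probs.index(max(probs))]
    return probs, soft_bits, hard_bits


if __name__ == "__main__":
    # Hypothetical parameters: a larger input spread pushes the choice toward more bits.
    weights = [[0.0, -1.0], [0.0, 0.0], [0.0, 1.0]]
    biases = [1.0, 0.5, 0.0]
    for name, x in (("narrow", [0.0, 0.1, -0.1, 0.05]), ("wide", [-5.0, 5.0, -5.0, 5.0])):
        probs, soft, hard = select_bits(x, weights, biases)
        print(f"{name}: soft={soft:.2f} bits, hard={hard} bits")
```

In a rate-distortion training setup, a soft bit-width of this kind could enter a cost term alongside the R-D loss so that gradient descent trades precision against compute; whether DynaQuant does exactly this is not stated in the abstract.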
