Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration
Yujie Chen, Haotong Qin, Zhang Zhang, Michele Magno, Luca Benini, Yawei Li
TL;DR
The paper tackles efficient deployment of state-space models for image restoration by introducing Q-MambaIR, which combines Dynamic-balancing Learnable Scalar (DLS) and Range-floating Flexible Allocator (RFA) to enable accurate, flexible ultra-low-bit quantization of SS2D-based Visual State Space Models. By adaptively adjusting activation ranges and using a soft, learnable rounding mechanism, Q-MambaIR mitigates outlier truncation and gradient mismatch, preserving high-frequency textures essential for high-quality IR. Experiments across classic and lightweight image super-resolution, Gaussian denoising, and JPEG artifact reduction demonstrate that 4-bit and 2-bit variants of Q-MambaIR consistently outperform existing quantized SSM baselines, often matching or approaching full-precision performance with substantial reductions in parameters and FLOPs. The approach offers practical advantages for edge devices, providing near-SOTA accuracy with greatly reduced memory and compute demands without significant training overhead.
Abstract
State-Space Models (SSMs) have attracted considerable attention in Image Restoration (IR) due to their ability to scale linearly with sequence length while effectively capturing long-distance dependencies. However, deploying SSMs on edge devices is challenging because of constraints on memory, computing capacity, and power consumption, underscoring the need for efficient compression strategies. While low-bit quantization is an efficient model compression strategy for reducing size and accelerating IR tasks, SSMs suffer substantial performance drops at ultra-low bit-widths (2-4 bits), primarily due to outliers that exacerbate quantization error. To address this challenge, we propose Q-MambaIR, an accurate, efficient, and flexible Quantized Mamba for IR tasks. Specifically, we introduce a Statistical Dynamic-balancing Learnable Scalar (DLS) to dynamically adjust the quantization mapping range, thereby mitigating the peak truncation loss caused by extreme values. Furthermore, we design a Range-floating Flexible Allocator (RFA) with an adaptive threshold to flexibly round values. This approach preserves high-frequency details and maintains the SSM's feature extraction capability. Notably, RFA also enables pre-deployment weight quantization, striking a balance between computational efficiency and model accuracy. Extensive experiments on IR tasks demonstrate that Q-MambaIR consistently outperforms existing quantized SSMs, achieving substantially higher, state-of-the-art (SOTA) accuracy with only a negligible increase in training computation and storage.
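To make the outlier problem concrete, the following is a minimal sketch of plain symmetric uniform quantization with a fixed scale. It is not the paper's DLS or RFA; it only illustrates the trade-off they address. The function names, the synthetic activations, and the two scale choices are all assumptions made for illustration. A narrow mapping range clips extreme values (peak truncation), while a range wide enough to cover the outliers makes every quantization step coarser.

```python
import random


def fake_quantize(values, scale, bits):
    """Symmetric uniform quantization: divide by scale, round, clip, rescale.

    Illustrative baseline only, not the method proposed in the paper.
    Python's round() rounds exact halves to the nearest even integer.
    """
    qmax = 2 ** (bits - 1) - 1   # e.g. 7 for 4 bits
    qmin = -(2 ** (bits - 1))    # e.g. -8 for 4 bits
    out = []
    for v in values:
        q = round(v / scale)
        q = max(qmin, min(qmax, q))  # values outside the range are truncated
        out.append(q * scale)
    return out


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def max_abs_err(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical activations: mostly unit Gaussian, plus two extreme outliers.
random.seed(0)
acts = [random.gauss(0.0, 1.0) for _ in range(1000)] + [12.0, -15.0]

bits = 4
qmax = 2 ** (bits - 1) - 1

# Assumed scale choices for the comparison.
scales = {
    "narrow": 3.0 / qmax,                         # fine steps, outliers clipped
    "wide": max(abs(v) for v in acts) / qmax,     # covers outliers, coarse steps
}

results = {}
for name, scale in scales.items():
    deq = fake_quantize(acts, scale, bits)
    results[name] = (mse(acts, deq), max_abs_err(acts, deq))
    print(f"{name}: scale={scale:.3f} mse={results[name][0]:.4f} "
          f"max_err={results[name][1]:.3f}")
```

With the narrow range, the worst-case error is dominated by the clipped outliers; with the wide range, no value is clipped, but the typical small activation is rounded on a much coarser grid. A learnable or dynamically adjusted range, as the abstract describes for DLS, is one way to navigate between these two extremes.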
