First-order State Space Model for Lightweight Image Super-resolution
Yujie Zhu, Xinyi Zhang, Yekai Lu, Guang Yang, Faming Fang, Guixu Zhang
TL;DR
This paper tackles lightweight image super-resolution by replacing the SSM module in a Mamba-based backbone with a First-order State Space Model (FSSM). The authors derive a discretized form under a first-order hold condition and introduce higher-order approximations, which strengthens token correlations without adding parameters. They also provide a cumulative error analysis showing favorable bounds relative to the vanilla SSM. Experiments with DIV2K/DF2K and five benchmark datasets show that FMambaIR with FSSM variants achieves state-of-the-art results among lightweight SR methods, improving PSNR, SSIM, and visual quality. The work offers a practical, efficient backbone for vision tasks that require long-range dependencies and suggests that FSSM could be adopted in related domains.
Abstract
State space models (SSMs), particularly Mamba, have shown promise in NLP tasks and are increasingly applied to vision tasks. However, most Mamba-based vision models focus on network architecture and scan paths, with little attention to the SSM module itself. To explore the potential of SSMs, we modify the SSM computation, without increasing the number of parameters, to improve performance on lightweight super-resolution tasks. In this paper, we introduce the First-order State Space Model (FSSM) to improve the original Mamba module, enhancing performance by incorporating token correlations. We apply a first-order hold condition to SSMs, derive the new discretized form, and analyze the cumulative error. Extensive experimental results demonstrate that FSSM improves the performance of MambaIR on five benchmark datasets without increasing the number of parameters, and surpasses current lightweight SR methods, achieving state-of-the-art results.
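
To make the contrast between a zero-order hold and a first-order hold concrete, the following is a minimal sketch of the textbook discretizations for a scalar continuous system x'(t) = a x(t) + b u(t). It is not the authors' FSSM implementation: the function names, the fixed step size `dt`, and the scalar parameters `a` and `b` are assumptions made for illustration, whereas Mamba uses input-dependent step sizes and diagonal state matrices. The point it shows is that the first-order hold update mixes the current input `u_k` with the next input `u_next`, which is one way neighboring tokens can enter the state update without extra parameters.

```python
import numpy as np


def zoh_step(x, u_k, a, b, dt):
    """Zero-order hold: the input is held constant at u_k over the step."""
    a_bar = np.exp(a * dt)
    b_bar = (a_bar - 1.0) / a * b
    return a_bar * x + b_bar * u_k


def foh_step(x, u_k, u_next, a, b, dt):
    """First-order hold: the input is linearly interpolated from u_k to u_next."""
    a_bar = np.exp(a * dt)
    # Weight on u_next, obtained by integrating exp(a * (dt - tau)) * tau / dt over the step.
    gamma = (a_bar - 1.0 - a * dt) / (a * a * dt)
    # Weight on u_k is the zero-order-hold weight minus gamma.
    w_k = (a_bar - 1.0) / a - gamma
    return a_bar * x + b * (w_k * u_k + gamma * u_next)


def simulate(kind, u, a=-1.0, b=1.0, dt=0.1):
    """Run the chosen discretization over the input sequence u, starting from x = 0."""
    x = 0.0
    states = []
    for k in range(len(u) - 1):
        if kind == "zoh":
            x = zoh_step(x, u[k], a, b, dt)
        else:
            x = foh_step(x, u[k], u[k + 1], a, b, dt)
        states.append(x)
    return np.array(states)


if __name__ == "__main__":
    dt = 0.1
    t = np.arange(0, 51) * dt
    u = t.copy()  # ramp input u(t) = t
    exact = t[1:] - 1.0 + np.exp(-t[1:])  # closed-form solution of x' = -x + t, x(0) = 0
    print("max ZOH error:", np.max(np.abs(simulate("zoh", u, dt=dt) - exact)))
    print("max FOH error:", np.max(np.abs(simulate("foh", u, dt=dt) - exact)))
```

For a ramp input, the first-order hold assumption matches the true input exactly, so its error stays at floating-point level, while the zero-order hold lags behind. This illustrates, in a toy setting only, why a first-order hold can yield a smaller accumulated discretization error.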
