MambaIRv2: Attentive State Space Restoration

Hang Guo; Yong Guo; Yaohua Zha; Yulun Zhang; Wenbo Li; Tao Dai; Shu-Tao Xia; Yawei Li

TL;DR

MambaIRv2 tackles the inherent causality of Mamba-based image restoration by introducing an Attentive State-space Equation (ASE) and Semantic Guided Neighboring (SGN) to enable ViT-like non-causal information flow within a state-space framework. ASE injects semantically similar pixel prompts into the state-space output to query the entire image, while SGN reshapes the 1D sequence so semantically related pixels are proximal, mitigating long-range decay. The resulting Attentive State Space Restoration backbone demonstrates superior performance and efficiency across lightweight and classic SR, JPEG CAR, and denoising tasks, outperforming strong Transformer-based baselines with fewer parameters and lower compute. This approach provides a principled, single-pass alternative to multi-directional scanning, with evidence of broader receptive fields and improved restoration quality. The work suggests a promising direction for integrating ViT-like attention into Mamba for high-quality, efficient low-level vision tasks.
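The reordering idea behind SGN can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes each token already carries an integer semantic label (how MambaIRv2 obtains such labels is not described here), and the helper names `semantic_neighboring` and `reorder` are hypothetical. It only shows how a stable sort places same-label tokens next to each other in the 1D sequence, and how the inverse permutation restores the original spatial order afterwards.

```python
def semantic_neighboring(labels):
    """Return (order, inverse) permutations for a list of integer labels.

    `order` lists original indices so that tokens with the same label become
    contiguous; `inverse` maps each original index to its new position.
    Hypothetical helper for illustration only.
    """
    # sorted() is stable, so tokens sharing a label keep their scan order.
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    inverse = [0] * len(order)
    for new_pos, old_pos in enumerate(order):
        inverse[old_pos] = new_pos
    return order, inverse


def reorder(tokens, permutation):
    """Gather tokens according to a permutation of indices."""
    return [tokens[i] for i in permutation]


# Toy sequence: "sky" and "tree" pixels interleaved in the scan order.
tokens = ["sky0", "tree0", "sky1", "tree1", "sky2"]
labels = [0, 1, 0, 1, 0]  # assumed semantic labels, one per token

order, inverse = semantic_neighboring(labels)
grouped = reorder(tokens, order)       # same-label tokens are now adjacent
restored = reorder(grouped, inverse)   # back to the original order
```

After grouping, the three "sky" tokens sit next to each other, so a sequential scan no longer has to carry information across the intervening "tree" tokens.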

Abstract

Mamba-based image restoration backbones have recently demonstrated significant potential in balancing a global receptive field with computational efficiency. However, Mamba's inherent causal modeling, in which each token depends solely on its predecessors in the scanned sequence, restricts the full utilization of pixels across the image and thus presents new challenges for image restoration. In this work, we propose MambaIRv2, which equips Mamba with a non-causal modeling ability similar to that of ViTs, yielding an attentive state space restoration model. Specifically, the proposed attentive state-space equation allows the model to attend beyond the scanned sequence and facilitates image unfolding with just a single scan. Moreover, we introduce a semantic-guided neighboring mechanism to encourage interaction between distant but similar pixels. Extensive experiments show that MambaIRv2 outperforms SRFormer by as much as 0.35 dB PSNR on lightweight SR with 9.3% fewer parameters, and surpasses HAT on classic SR by up to 0.29 dB. Code is available at https://github.com/csguoh/MambaIR.
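The causal limitation described in the abstract can be made concrete with a scalar linear recurrence. The following is a minimal sketch, not Mamba's actual selective scan: the function `causal_scan` and its constant coefficients `a`, `b`, `c` are assumptions chosen for illustration. It shows that the output at a given position depends only on the tokens scanned so far, so editing a later token leaves earlier outputs unchanged.

```python
def causal_scan(xs, a=0.5, b=1.0, c=1.0):
    """Toy scalar state-space scan: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.

    Illustrative only; real Mamba uses input-dependent, multi-dimensional
    parameters rather than fixed scalars.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # state only accumulates past and current tokens
        ys.append(c * h)
    return ys


seq = [1.0, 2.0, 3.0, 4.0]
changed = [1.0, 2.0, 3.0, 100.0]  # only the last token differs

y_seq = causal_scan(seq)
y_changed = causal_scan(changed)

# The first three outputs are identical: earlier positions never "see"
# the last token, which is the causal restriction in question.
same_prefix = y_seq[:3] == y_changed[:3]
```

In this toy setting the first token's contribution to the last output is scaled by `a**3`, which also hints at why distant tokens interact weakly in a single scan.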
