MambaCSR: Dual-Interleaved Scanning for Compressed Image Super-Resolution With SSMs

Yulin Ren, Xin Li, Mengxi Guo, Bingchen Li, Shijie Zhao, Zhibo Chen

TL;DR

An efficient dual-interleaved scanning paradigm (DIS) is proposed for CSR. It is composed of two scanning strategies, one of which, hierarchical interleaved scanning, captures contextual information within an image more comprehensively by combining local window-based and sequential scanning.

Abstract

We present MambaCSR, a simple but effective Mamba-based framework for the challenging compressed image super-resolution (CSR) task. Although Mamba applies selective state space modeling to all tokens, its scanning strategy remains crucial for effective contextual modeling during restoration. In this work, we propose an efficient dual-interleaved scanning paradigm (DIS) for CSR, composed of two scanning strategies: (i) hierarchical interleaved scanning, which captures contextual information within an image more comprehensively by combining local window-based and sequential scanning; and (ii) horizontal-to-vertical interleaved scanning, which reduces computational cost by leaving out the redundancy between scans in different directions. To handle non-uniform compression artifacts, we also propose position-aligned cross-scale scanning to model multi-scale contextual information. Experimental results on multiple benchmarks demonstrate the strong performance of MambaCSR on the compressed image super-resolution task. The code will be available soon at https://github.com/renyulin-f/MambaCSR.
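
The scan orders named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the window size `win`, and the rule for alternating scan types across blocks (`scan_order`) are all assumptions made for illustration. The abstract only states that window-based and sequential scans are interleaved, and that horizontal and vertical directions are interleaved.

```python
import numpy as np

def sequential_scan(h, w):
    """Row-major (horizontal) visiting order over an h x w token grid."""
    return np.arange(h * w)

def vertical_scan(h, w):
    """Column-major order: walk down each column, columns left to right."""
    return np.arange(h * w).reshape(h, w).T.reshape(-1)

def window_scan(h, w, win):
    """Visit tokens window by window, row-major inside each win x win window."""
    if h % win or w % win:
        raise ValueError("grid size must be divisible by the window size")
    idx = np.arange(h * w).reshape(h // win, win, w // win, win)
    # Reorder axes to (window row, window col, row in window, col in window).
    return idx.transpose(0, 2, 1, 3).reshape(-1)

def scan_order(h, w, win, block_index):
    """Hypothetical interleaving rule (an assumption, not from the paper):
    even blocks use the window scan, odd blocks the sequential scan, and
    the direction flips between horizontal and vertical every two blocks."""
    use_window = block_index % 2 == 0
    vertical = (block_index // 2) % 2 == 1
    if not vertical:
        return window_scan(h, w, win) if use_window else sequential_scan(h, w)
    # Vertical variant: build the order on the transposed (w x h) grid,
    # then map each transposed position back to its original token index.
    base = window_scan(w, h, win) if use_window else sequential_scan(w, h)
    return vertical_scan(h, w)[base]

if __name__ == "__main__":
    for k in range(4):
        print(k, scan_order(4, 4, 2, k))
```

Given a token array `tokens` of shape `(h * w, channels)`, `tokens[order]` gives the sequence a state space layer would consume under that scan, and `np.argsort(order)` restores the original spatial layout afterwards. Under this assumed rule each block runs a single scan type and a single direction rather than all of them, which is one way the interleaving could reduce cost.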

Paper Structure (20 sections, 9 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: Comparison of existing scanning methods. (a) Sequential scanning used in most low-level tasks guo2024mambair, shi2024vmambair, shi2024multi-scale-mamba. (b) Local window-based scanning in some high-level tasks huang2024localmamba.
  • Figure 2: The Overall Architecture of MambaCSR. The overall pipeline (top) consists of three components: shallow feature extraction, deep feature extraction, and high-resolution reconstruction. (b) illustrates our proposed dual-interleaved scanning method. (c) represents the process of cross-scale scanning for fusing multi-scale features.
  • Figure 3: Visual comparison of our MambaCSR model with other state-of-the-art models on the Urban100 huang2015single-urban100 dataset under the JPEG codec at compression levels QF = [10, 20, 30]. Additional visual comparisons for the HEVC and VVC codecs can be found in the supplementary material.
  • Figure 4: The receptive fields of different scanning methods: (a) Sequential scanning adopted in MambaIR, (b) Our hierarchical interleaved scanning, (c) Only local window-based scanning. All results are tested on the Manga109 matsui2017sketch-manga109 dataset.
  • Figure 5: Visual comparisons on Urban100 huang2015single-urban100 for the HEVC codec sze2014high-HEVC at ×4 scale with QP = [32, 37, 42].
  • ...and 1 more figure