$\text{S}^{3}$Mamba: Arbitrary-Scale Super-Resolution via Scaleable State Space Model

Peizhe Xia, Long Peng, Xin Di, Renjing Pei, Yang Wang, Yang Cao, Zheng-Jun Zha

TL;DR

A Scalable State Space Model (SSSM) is proposed to modulate the state transition matrix and the sampling matrix of step size during the discretization process, achieving scalable and continuous representation modeling with linear computational complexity.

Abstract

Arbitrary-scale super-resolution (ASSR) aims to super-resolve low-resolution images to high-resolution images at any scale using a single model, addressing the limitation of traditional super-resolution methods that are restricted to fixed scale factors (e.g., $\times2$, $\times4$). The advent of Implicit Neural Representations (INR) has produced many new ASSR methods, which reconstruct the original continuous signal by modeling a continuous representation space over coordinates and pixel values, thereby enabling arbitrary-scale super-resolution. Consequently, the primary objective of ASSR is to construct a continuous representation space from low-resolution inputs. However, existing methods, primarily based on CNNs and Transformers, face significant challenges such as high computational complexity and inadequate modeling of long-range dependencies, which hinder their effectiveness in real-world applications. To overcome these limitations, we propose a novel arbitrary-scale super-resolution method, called $\text{S}^{3}$Mamba, to construct a scalable continuous representation space. Specifically, we propose a Scalable State Space Model (SSSM) to modulate the state transition matrix and the sampling matrix of step size during the discretization process, achieving scalable and continuous representation modeling with linear computational complexity. Additionally, we propose a novel scale-aware self-attention mechanism to further enhance the network's ability to perceive globally important features at different scales, which together with the SSSM forms $\text{S}^{3}$Mamba and achieves superior arbitrary-scale super-resolution. Extensive experiments on both synthetic and real-world benchmarks demonstrate that our method achieves state-of-the-art performance and superior generalization capabilities at arbitrary super-resolution scales.
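To make the idea of modulating the step size during discretization concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a diagonal continuous-time state space model, the standard zero-order-hold discretization, and a purely hypothetical modulation rule in which the step size is divided by the super-resolution scale. The names `discretize_zoh`, `scale_modulated_scan`, `base_delta`, and `scale` are illustrative and do not come from the paper; the actual SSSM learns its modulation.

```python
import numpy as np


def discretize_zoh(a_diag, b, delta):
    """Zero-order-hold discretization of a diagonal continuous-time SSM.

    a_diag: (N,) diagonal of the state matrix (negative values for stability)
    b:      (N,) input vector
    delta:  scalar step size
    """
    a_bar = np.exp(delta * a_diag)
    b_bar = (a_bar - 1.0) / a_diag * b
    return a_bar, b_bar


def scale_modulated_scan(x, a_diag, b, c, base_delta, scale):
    """Run the discretized SSM over a 1D sequence in a single linear pass.

    ASSUMPTION (illustrative only): the step size shrinks in proportion to
    the super-resolution scale, i.e. a finer output grid samples the same
    continuous system more densely.
    """
    delta = base_delta / scale
    a_bar, b_bar = discretize_zoh(a_diag, b, delta)
    h = np.zeros_like(a_diag)
    y = np.empty(len(x))
    for t, x_t in enumerate(x):  # one state update per element: O(L)
        h = a_bar * h + b_bar * x_t
        y[t] = c @ h
    return y
```

Under these assumptions, changing `scale` changes only how finely the same continuous system is sampled: for a constant input the output converges to the same steady-state value regardless of the step size, while each individual step moves the state less at larger scales.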

Paper Structure

This paper contains 13 sections, 11 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: (a) During real-world imaging, the continuous 3D physical world is discretized into an image through cameras and ISPs, resulting in an LR image due to sensor resolution. (c) Existing MLP-based INR methods often use point-to-point learning, making them susceptible to degradation such as noise. Additionally, the limited receptive field of MLPs cannot construct a perfect continuous space, as shown in (b). In contrast, our method (d) leverages scalable SSM to better capture global historical information and, through scalable training, reconstructs a continuous space more effectively, achieving superior arbitrary-scale super-resolution.
  • Figure 2: (a) Illustration of the proposed $\text{S}^3$Mamba framework. (b) The SSSM Block consists of the SSSM, along with multiple instance normalization layers, depthwise convolution (DWConv), and projection layers. (c) The Scalable State Space Model (SSSM) is proposed to modulate the state transition matrix and the sampling matrix of step size during the discretization process, achieving scalable and continuous representation modeling with linear computational complexity.
  • Figure 3: Visual comparison with existing methods on the real COZ dataset ($\times3$). Please zoom in for a better view.
  • Figure 4: Visual comparison with existing methods on the DIV2K dataset ($\times4$). Please zoom in for a better view.