MPSI: Mamba enhancement model for pixel-wise sequential interaction Image Super-Resolution

Yuchun He, Yuhan He

TL;DR

The paper tackles the persistent challenge of capturing long-range pixel interactions in single-image super-resolution. It introduces MPSI, a Mamba-based framework that combines a Channel-Mamba Block (CMB) and a Mamba channel recursion module (MCRM) to model long sequence dependencies while preserving multi-layer feature information. The architecture integrates Spatial Transformer Blocks (STB) and CMB within Spatial Attention Mamba Groups (SAMG), augmented by the MCRM to retain shallow features and enhance pixel-wise sequential interaction. Experimental results show state-of-the-art performance at higher upscaling factors (×3, ×4) across standard SR benchmarks, with notable gains on Urban100, and ablations validate the contributions of CMB and MCRM. The approach offers a scalable, lightweight pathway to improved image reconstruction by leveraging long-range pixel interactions and multi-level feature retention.

Abstract

Single image super-resolution (SR) is a long-standing challenge in computer vision. Although deep learning has produced numerous methods for this problem, current approaches still struggle to model long sequence information, which limits how effectively they capture global pixel interactions. To address this and achieve better SR results, we propose the Mamba pixel-wise sequential interaction network (MPSI), which strengthens long-range connections of information, with a particular focus on pixel-wise sequential interaction. We propose the Channel-Mamba Block (CMB) to capture comprehensive pixel interaction information by effectively modeling long sequence information. Moreover, existing SR methods often neglect the features extracted by preceding layers, losing valuable feature information. While some existing models try to preserve these features, they frequently have difficulty establishing connections across all layers. To overcome this limitation, MPSI introduces the Mamba channel recursion module (MCRM), which maximizes the retention of valuable feature information from early layers and thereby helps acquire pixel sequence interaction information from multiple levels of layers. Through extensive experiments, we demonstrate that MPSI outperforms existing super-resolution methods in image reconstruction quality, attaining state-of-the-art performance.
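
To make "pixel-wise sequential interaction" concrete, the following is a minimal sketch of how a state-space recurrence can carry information along a flattened pixel sequence. It is not the CMB or MCRM described in the paper: the function names `flatten_pixels` and `ssm_scan`, the row-major scan order, and the fixed scalar parameters `a`, `b`, `c` are all assumptions made for illustration. Mamba-style models use input-dependent (selective) parameters and much larger states.

```python
from typing import List


def flatten_pixels(image: List[List[float]]) -> List[float]:
    """Row-major flatten of an H x W single-channel image into a pixel sequence.

    The scan order is an assumption; other orders are possible.
    """
    return [pixel for row in image for pixel in row]


def ssm_scan(seq: List[float], a: float, b: float, c: float) -> List[float]:
    """Run a scalar linear state-space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t,   y_t = c * h_t,   with h_0 = 0.
    The parameters a, b, c are fixed here (hypothetical values); a selective
    model would compute them from the input at each step.
    """
    h = 0.0  # hidden state starts at zero
    outputs = []
    for x in seq:
        h = a * h + b * x
        outputs.append(c * h)
    return outputs


# A 2 x 2 image whose only non-zero pixel is the top-left one.
image = [[1.0, 0.0], [0.0, 0.0]]
seq = flatten_pixels(image)
ys = ssm_scan(seq, a=0.5, b=1.0, c=1.0)
print(ys)  # [1.0, 0.5, 0.25, 0.125]
```

The last output is non-zero even though the last pixel is zero: the hidden state carries the first pixel's value across the whole sequence, which is the kind of long-range interaction the abstract refers to. The cost grows linearly with the number of pixels.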

Paper Structure

This paper contains 21 sections, 8 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Architecture of the MPSI network and SAMG.
  • Figure 2: Components of SAMG. The STB and CMB are the components of SAMB. Figures b, c, and d show the structures of STB, CMB, and MCRM respectively.
  • Figure 3: Architecture of DDBM. The initial value of $H_{t-1}$ is a zero matrix. N is the number of features in the feature sequence.
  • Figure 4: Visual comparison for single image SR ($\times$4 upscaling). Several results were selected to compare how well each model restores details. The results for SwinIR, ELAN, and DAT are from the lightweight versions of those models.
  • Figure 5: Difference maps of ablation studies on the effect of CMB and MCRM.
  • ...and 1 more figures