DeshadowMamba: Deshadowing as 1D Sequential Similarity

Zhaotong Yang, Yi Chen, Yanying Li, Shengfeng He, Yangyang Xu, Junyu Dong, Jian Yang, Yong Du

TL;DR

DeshadowMamba reframes image shadow removal as a 1D sequence modeling problem using the Mamba state-space model to capture global context with linear complexity while preserving spatial structure. It introduces CrossGate, a directional modulation that injects shadow-aware similarity into Mamba's input gate, and ColorShift regularization to enforce color fidelity via structured negatives. Through a two-stage training strategy and extensive experiments on SRD, ISTD+, and real-world SBU, DeshadowMamba achieves state-of-the-art visual quality and quantitative gains, demonstrating robust generalization to real scenes. The work advances shadow removal by combining efficient global context modeling with region-aware semantic guidance and color-consistent restoration.

Abstract

Recent deep models for image shadow removal often rely on attention-based architectures to capture long-range dependencies. However, their fixed attention patterns tend to mix illumination cues from irrelevant regions, leading to distorted structures and inconsistent colors. In this work, we revisit shadow removal from a sequence modeling perspective and explore the use of Mamba, a selective state space model that propagates global context through directional state transitions. These transitions yield an efficient global receptive field while preserving positional continuity. Despite its potential, directly applying Mamba to image data is suboptimal, since it lacks awareness of shadow-non-shadow semantics and remains susceptible to color interference from nearby regions. To address these limitations, we propose CrossGate, a directional modulation mechanism that injects shadow-aware similarity into Mamba's input gate, allowing selective integration of relevant context along transition axes. To further ensure appearance fidelity, we introduce ColorShift regularization, a contrastive learning objective driven by global color statistics. By synthesizing structured informative negatives, it guides the model to suppress color contamination and achieve robust color restoration. Together, these components adapt sequence modeling to the structural integrity and chromatic consistency required for shadow removal. Extensive experiments on public benchmarks demonstrate that DeshadowMamba achieves state-of-the-art visual quality and strong quantitative performance.
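
To make the sequence-modeling view concrete, the following is a minimal sketch of a similarity-modulated input gate on a 1D scan. It is not the authors' implementation: the helper names (`raster_scan`, `neighbor_similarity`, `gated_scan`), the fixed scalar `decay`, the row-major scan order, and the choice of cosine similarity between consecutive tokens as a stand-in for "shadow-aware similarity" are all assumptions made for illustration. Mamba's actual state transitions are input-dependent and learned, which this toy recurrence omits.

```python
import math


def raster_scan(feat):
    """Flatten an H x W grid of feature vectors into a row-major 1D sequence."""
    return [pixel for row in feat for pixel in row]


def neighbor_similarity(seq, eps=1e-8):
    """Cosine similarity of each token with its predecessor, rescaled to [0, 1].

    Hypothetical stand-in for a shadow-aware similarity signal.
    The first token has no predecessor and receives a gate of 1.0.
    """
    sims = [1.0]
    for prev, cur in zip(seq, seq[1:]):
        dot = sum(a * b for a, b in zip(prev, cur))
        norm = math.sqrt(sum(a * a for a in prev)) * math.sqrt(sum(b * b for b in cur))
        sims.append(0.5 * (dot / (norm + eps) + 1.0))
    return sims


def gated_scan(seq, gate, decay=0.9):
    """Single pass over the sequence: h_t = decay * h_{t-1} + gate_t * x_t."""
    state = [0.0] * len(seq[0])
    out = []
    for x, g in zip(seq, gate):
        state = [decay * s + g * v for s, v in zip(state, x)]
        out.append(state)
    return out


# Toy 2 x 2 feature map with 2 channels: the top row and bottom row
# carry orthogonal features, mimicking a region boundary.
feat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]
seq = raster_scan(feat)
gate = neighbor_similarity(seq)
out = gated_scan(seq, gate)
```

In this toy input the gate falls to 0.5 at the third token, where the scan crosses from one region into the other, so that token contributes less to the running state. The scan visits each token once, which is the sense in which the cost is linear in sequence length.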

Paper Structure

This paper contains 26 sections, 18 equations, 27 figures, 6 tables.

Figures (27)

  • Figure 2: DeshadowMamba consists of a Mamba-based encoder-decoder architecture enhanced by CrossGate modulation and ColorShift regularization. CrossGate injects directional, shadow-aware similarity into Mamba's input gate to guide feature integration, while ColorShift generates weighted contrastive samples to enforce color consistency during training.
  • Figure 3: Visual comparisons with state-of-the-art methods on the SRD dataset. (Best viewed zoomed in.)
  • Figure 4: Visual comparisons with state-of-the-art methods on the ISTD+ dataset. (Best viewed zoomed in.)
  • Figure 6: Visual effects of CrossGate modulation on input gates (final Mamba block) and deshadowing results.
  • Figure : (a) Shadow Image
  • ...and 22 more figures