ShadowRefiner: Towards Mask-free Shadow Removal via Fast Fourier Transformer

Wei Dong, Han Zhou, Yuqiong Tian, Jingke Sun, Xiaohong Liu, Guangtao Zhai, Jun Chen

TL;DR

ShadowRefiner targets mask-free shadow removal with a two-stage architecture. A ConvNeXt-based Shadow Removal module first learns the mapping from shadow-affected to shadow-free images using spatial and frequency representations; a Refinement module built on a novel Fast-Fourier Attention based Transformer (FFAT) then refines details to improve texture and color consistency. The method won the Perceptual Track and placed second in the Fidelity Track of the NTIRE 2024 Image Shadow Removal Challenge, and it performs strongly on the ISTD, ISTD+, and WSRD+ datasets. Ablations indicate that the FFAT refinement is crucial for improving fidelity and texture while preserving colors, and the experiments show gains over mask-free baselines and competitive performance against mask-based methods. The work demonstrates the value of combining spatial-frequency analysis with FFT-domain attention for shadow removal in real-world scenes and downstream vision tasks.

Abstract

Shadow-affected images often exhibit pronounced spatial discrepancies in color and illumination, consequently degrading various vision applications including object detection and segmentation systems. To effectively eliminate shadows in real-world images while preserving intricate details and producing visually compelling outcomes, we introduce a mask-free Shadow Removal and Refinement network (ShadowRefiner) via Fast Fourier Transformer. Specifically, the Shadow Removal module in our method aims to establish effective mappings between shadow-affected and shadow-free images via spatial and frequency representation learning. To mitigate pixel misalignment and further improve image quality, we propose a novel Fast-Fourier Attention based Transformer (FFAT) architecture, in which an innovative attention mechanism is designed for meticulous refinement. Our method wins the championship in the Perceptual Track and achieves the second-best performance in the Fidelity Track of the NTIRE 2024 Image Shadow Removal Challenge. In addition, comprehensive experimental results demonstrate the effectiveness of our proposed method. The code is publicly available: https://github.com/movingforward100/Shadow_R.
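
To make "frequency representation learning" concrete, here is a minimal NumPy sketch of the general idea of processing a feature map in the Fourier domain: transform, reweight each frequency, transform back. It is not the paper's FFAT or its Shadow Removal module, whose internals are not described in this summary. The function name `fourier_mix` and the weight tensor `spectral_weight` are hypothetical names chosen for illustration.

```python
import numpy as np


def fourier_mix(features, spectral_weight):
    """Reweight the 2D spectrum of a (C, H, W) feature map.

    Hypothetical illustration only: `spectral_weight` has shape
    (C, H, W // 2 + 1) and stands in for parameters a network
    might learn. Every output pixel depends on every input pixel
    of the same channel, so the mixing is global.
    """
    _, h, w = features.shape
    # Real-input FFT over the two spatial axes.
    spectrum = np.fft.rfft2(features, axes=(-2, -1))
    # Pointwise multiplication in frequency space.
    mixed = spectrum * spectral_weight
    # Back to the spatial domain at the original resolution.
    return np.fft.irfft2(mixed, s=(h, w), axes=(-2, -1))


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))

# All-ones weights leave the feature map unchanged.
identity = np.ones((4, 8, 5))
y_identity = fourier_mix(x, identity)

# Keeping only the zero-frequency term yields each channel's mean.
dc_only = np.zeros((4, 8, 5))
dc_only[:, 0, 0] = 1.0
y_mean = fourier_mix(x, dc_only)
```

A single pointwise multiplication in frequency space corresponds to a full-image circular convolution in the spatial domain, which is one reason frequency-domain operations are attractive for corrections that span a whole image, such as illumination changes.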

Paper Structure (11 sections, 3 equations, 6 figures, 3 tables)

This paper contains 11 sections, 3 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Our test results on the NTIRE 2024 Image Shadow Removal Challenge [vasluianu2024ntire_isr]. Our proposed method is the champion in the Perceptual Track and achieves the second-best performance in the Fidelity Track.
  • Figure 2: The overall architecture of our model. In the Shadow Removal module, besides the DWT-FFC branch proposed in [DWT-FFC_2023_CVPRW, ancuti2023ntire], we design a ConvNeXt-based U-Net architecture with $7\times7$ depth-wise convolution at each resolution. In the Refinement module, we design a new attention mechanism (Fast-Fourier Attention, FFA), different from the common attention operations in transformers, to further enhance texture details.
  • Figure 3: Visual comparisons on the ISTD dataset [stcgan]. Compared to other methods, our ShadowRefiner successfully removes shadows without introducing artifacts.
  • Figure 4: Visual comparisons on the ISTD+ dataset [le2019shadow]. Our ShadowRefiner excels in maintaining color consistency and recovering structural details.
  • Figure 5: Our results on the validation set of the WSRD+ dataset [vasluianu2023wsrd]. Our method performs well on color and detail recovery.
  • ...and 1 more figure
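
The Figure 2 caption mentions a $7\times7$ depth-wise convolution at each resolution. As a rough illustration of what "depth-wise" means, the NumPy sketch below filters each channel with its own kernel and does no cross-channel mixing. It is a generic sketch, not the authors' implementation; the function name `depthwise_conv2d` and the zero "same" padding are assumptions made for the example.

```python
import numpy as np


def depthwise_conv2d(x, kernels):
    """Depth-wise 2D convolution with zero 'same' padding.

    x:       (C, H, W) feature map
    kernels: (C, k, k) one odd-sized kernel per channel
    Uses cross-correlation, as deep learning frameworks do.
    Hypothetical helper for illustration only.
    """
    _, h, w = x.shape
    k = kernels.shape[-1]
    pad = k // 2
    # Pad only the two spatial axes.
    padded = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x, dtype=float)
    # Accumulate one shifted, per-channel-scaled copy per kernel tap.
    for dy in range(k):
        for dx in range(k):
            tap = kernels[:, dy, dx][:, None, None]
            out += tap * padded[:, dy:dy + h, dx:dx + w]
    return out


rng = np.random.default_rng(0)
feat = rng.standard_normal((3, 9, 9))

# A 7x7 kernel with a single 1 at its centre acts as the identity.
delta = np.zeros((3, 7, 7))
delta[:, 3, 3] = 1.0
same = depthwise_conv2d(feat, delta)

# A 7x7 box kernel on an all-ones input counts the in-bounds pixels.
box = depthwise_conv2d(np.ones((2, 9, 9)), np.ones((2, 7, 7)))
```

Because each channel uses only its own $k \times k$ kernel, a depth-wise layer needs $C \cdot k^2$ weights instead of the $C^2 \cdot k^2$ of a dense convolution, which is what makes a large $7\times7$ receptive field affordable.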