Retinex-guided Histogram Transformer for Mask-free Shadow Removal

Wei Dong, Han Zhou, Seyed Amirreza Mousavi, Jun Chen

TL;DR

The paper tackles shadow removal without relying on shadow masks, addressing the generalization gap of mask-based approaches. It introduces ReHiT, a dual-branch Retinex-based pipeline with an Illumination-Guided Hybrid CNN-Transformer (IG-HCT) and an Illumination-Guided Histogram Transformer Block (IG-HTB) to handle non-uniform illumination. The approach models reflectance and illumination in separate branches, using Dilated Residual Dense Blocks (DRDB) and Semantic-aligned Scale-aware Modules (SAM) for multi-scale fusion and a histogram self-attention mechanism guided by illumination. Experiments on ISTD, ISTD+, WSRD+ and the NTIRE 2025 Shadow Removal Challenge show competitive performance with much smaller parameter counts and faster inference, highlighting practical efficiency for real-world deployment.

Abstract

While deep learning methods have achieved notable progress in shadow removal, many existing approaches rely on shadow masks that are difficult to obtain, limiting their generalization to real-world scenes. In this work, we propose ReHiT, an efficient mask-free shadow removal framework based on a hybrid CNN-Transformer architecture guided by Retinex theory. First, we introduce a dual-branch pipeline to separately model reflectance and illumination components, each of which is restored by our Illumination-Guided Hybrid CNN-Transformer (IG-HCT) module. Second, besides the CNN-based blocks that are capable of learning residual dense features and performing multi-scale semantic fusion, we develop the Illumination-Guided Histogram Transformer Block (IG-HTB) to effectively handle non-uniform illumination and spatially complex shadows. Extensive experiments on several benchmark datasets validate the effectiveness of our approach over existing mask-free methods. Trained solely on the NTIRE 2025 Shadow Removal Challenge dataset, our solution delivers competitive results with one of the smallest parameter sizes and fastest inference speeds among top-ranked entries, highlighting its suitability for real-world applications with limited computational resources. The code is available at https://github.com/dongw22/oath.
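
The Retinex model behind the dual-branch design treats an image as the element-wise product of a reflectance map and an illumination map. The following is a minimal NumPy sketch of that decomposition, not the authors' implementation: the function names, the `eps` constant, and the use of the per-pixel channel maximum as the illumination estimate are all assumptions made for illustration, and ReHiT's learned decomposition may differ.

```python
import numpy as np

def retinex_decompose(image, eps=1e-4):
    """Split an RGB image in [0, 1] into reflectance and illumination.

    Assumption: illumination is approximated by the per-pixel maximum over
    the colour channels (a common hand-crafted prior, chosen here only for
    illustration).
    """
    image = np.asarray(image, dtype=np.float64)
    illumination = image.max(axis=-1, keepdims=True)   # shape (H, W, 1)
    reflectance = image / (illumination + eps)         # shape (H, W, 3)
    return reflectance, illumination

def retinex_recompose(reflectance, illumination, eps=1e-4):
    """Invert retinex_decompose: element-wise product of the two maps."""
    return reflectance * (illumination + eps)

# Usage: decompose a random image and check that the product recovers it.
rng = np.random.default_rng(0)
img = rng.random((4, 5, 3))
R, L = retinex_decompose(img)
restored = retinex_recompose(R, L)
```

In a dual-branch pipeline of the kind the abstract describes, each map would pass through its own restoration network before the two are multiplied back together; the sketch omits those networks.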

Paper Structure

This paper contains 24 sections, 3 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Our method demonstrates effective shadow removal from low-quality images and achieves 7th place in the fidelity track. Notably, it is trained exclusively on the NTIRE 2025 Shadow Removal Challenge training set, and among the top seven solutions it has one of the smallest parameter counts and one of the fastest inference speeds, which underscores its practical applicability under constrained computational resources.
  • Figure 2: The overall pipeline of ReHiT. Our solution employs a dual-branch Retinex-based pipeline in which one branch restores the reflectance map and the other restores the illumination map. The Illumination-Guided Hybrid CNN-Transformer (IG-HCT) module serves as the primary restoration network. In addition, the Illumination-Guided Histogram Transformer Block (IG-HTB), with its illumination-guided histogram self-attention, is combined with the CNN-based Dilated Residual Dense Block (DRDB) [DRDB] and Semantic-aligned Scale-aware Module (SAM) [DRDB] to boost shadow removal performance.
  • Figure 3: Visual comparisons on the ISTD dataset [stcgan]. DCShadow preserves textures better but fails to completely eliminate shadows, resulting in visible residuals and boundary artifacts. Similar to ShadowRefiner, our method removes shadows without introducing artifacts and produces more uniform and natural results, eliminating both soft and hard shadows while preserving underlying textures.
  • Figure 4: Visual comparisons on the ISTD+ dataset [le2019shadow]. These results indicate that DCShadow partially removes the shadows but leaves noticeable residuals and illumination inconsistencies, especially along shadow boundaries. In contrast, both ShadowRefiner and our method eliminate shadows more effectively with minimal artifacts, producing results that are visually closer to the ground truth. Notably, we use only 5% of the parameters of ShadowRefiner to achieve comparable performance.
  • Figure 5: Our method delivers promising performance on the WSRD+ validation set [vasluianu2023wsrd]. It effectively removes both soft and hard shadows while preserving fine structural and textural details. Notably, regions previously affected by shadows are relit with high fidelity and without noticeable artifacts. The outputs show consistent illumination and seamless transitions between previously shadowed and non-shadowed areas, highlighting the robustness and generalization ability of our approach in real-world shadow removal.
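
The Figure 2 caption mentions illumination-guided histogram self-attention without giving its details. One plausible reading, offered only as a hedged sketch, is that tokens are sorted by an illumination guide, grouped into equal-sized bins, and attended to only within their own bin, so that pixels with similar brightness exchange information regardless of spatial position. Everything below is an assumption for illustration: the function names, the equal-size binning, and the use of the raw features as queries, keys, and values with no learned projections.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def histogram_attention(features, illumination, num_bins):
    """Bin-wise self-attention guided by illumination (illustrative only).

    features:     (N, C) array of token features.
    illumination: (N,) array of per-token guide values.
    num_bins:     number of equal-sized bins; must divide N.
    """
    features = np.asarray(features, dtype=np.float64)
    n, c = features.shape
    if n % num_bins != 0:
        raise ValueError("num_bins must divide the number of tokens")
    # Sort tokens by illumination so each bin holds similarly lit tokens.
    order = np.argsort(illumination, kind="stable")
    binned = features[order].reshape(num_bins, n // num_bins, c)
    # Scaled dot-product attention inside each bin (Q = K = V = features).
    scores = binned @ binned.transpose(0, 2, 1) / np.sqrt(c)
    attended = softmax(scores, axis=-1) @ binned
    # Scatter the results back to the original token order.
    out = np.empty_like(features)
    out[order] = attended.reshape(n, c)
    return out

# Usage: 8 tokens with 4 channels, grouped into 2 illumination bins.
rng = np.random.default_rng(0)
tokens = rng.random((8, 4))
guide = rng.random(8)
mixed = histogram_attention(tokens, guide, num_bins=2)
```

Restricting attention to bins keeps the cost per bin quadratic in the bin size rather than in the full token count, which is in line with the efficiency emphasis of the captions above, though the source does not state that this is the reason for the design.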