Latent Feature-Guided Diffusion Models for Shadow Removal
Kangfu Mei, Luis Figueroa, Zhe Lin, Zhihong Ding, Scott Cohen, Vishal M. Patel
TL;DR
This work casts shadow removal as an image restoration problem solved with a conditional diffusion model. Rather than conditioning on the shadow image alone, it introduces a learnable latent feature space that captures shadow-free priors, trained with a two-stage strategy, and adds a Dense Latent Variable Fusion module to mitigate local optima during training and improve texture fidelity. Results on AISTD, ISTD, SRD, and DESOBA show state-of-the-art performance, including a 13% RMSE improvement over the previous best method on AISTD and an 82% RMSE improvement for instance-level shadow removal on DESOBA. The approach offers a principled way to guide diffusion models with perceptual priors and may extend to other ill-posed low-level vision tasks.
Abstract
Recovering textures under shadows has remained a challenging problem due to the difficulty of inferring shadow-free scenes from shadow images. In this paper, we propose the use of diffusion models, as they offer a promising approach to gradually refine the details of shadow regions during the diffusion process. Our method improves this process by conditioning on a learned latent feature space that inherits the characteristics of shadow-free images, thus avoiding the limitation of conventional methods that condition on degraded images only. Additionally, we propose to alleviate potential local optima during training by fusing noise features with the diffusion network. We demonstrate the effectiveness of our approach, which outperforms the previous best method by 13% in terms of RMSE on the AISTD dataset. Further, we explore instance-level shadow removal, where our model outperforms the previous best method by 82% in terms of RMSE on the DESOBA dataset.
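To make the reported RMSE comparisons concrete, the following is a minimal sketch of how a masked RMSE and a relative improvement could be computed. It is an illustration under stated assumptions, not the paper's evaluation code: the function names `masked_rmse` and `relative_improvement` are hypothetical, the pixel values are made up, and the abstract does not specify the color space, resolution, or masking protocol used for the reported numbers. It also assumes that "outperforms by 13%" means a relative reduction in RMSE with respect to the previous best method.

```python
import math

def masked_rmse(pred, target, mask):
    """RMSE over the pixels where mask is truthy (e.g. the shadow region).

    Hypothetical helper: inputs are flattened images given as equal-length
    sequences of numbers; the paper's actual protocol may differ.
    """
    sq_errors = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    if not sq_errors:
        raise ValueError("mask selects no pixels")
    return math.sqrt(sum(sq_errors) / len(sq_errors))

def relative_improvement(baseline_rmse, new_rmse):
    """Fractional RMSE reduction relative to a baseline (0.13 means 13% lower)."""
    return (baseline_rmse - new_rmse) / baseline_rmse

# Toy flattened images; all values are invented for illustration only.
target   = [10.0, 20.0, 30.0, 40.0]   # shadow-free ground truth
baseline = [14.0, 20.0, 26.0, 40.0]   # output of a hypothetical earlier method
ours     = [12.0, 20.0, 28.0, 40.0]   # output of a hypothetical new method
shadow   = [1, 0, 1, 0]               # 1 marks pixels inside the shadow mask

rmse_base = masked_rmse(baseline, target, shadow)
rmse_ours = masked_rmse(ours, target, shadow)
gain = relative_improvement(rmse_base, rmse_ours)
print(f"baseline={rmse_base:.2f} ours={rmse_ours:.2f} relative gain={gain:.0%}")
```

Restricting the error to the mask reflects the fact that shadow removal quality is usually of most interest inside the shadow region; passing a mask of all ones gives the whole-image RMSE instead.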
