MetaShadow: Object-Centered Shadow Detection, Removal, and Synthesis

Tianyu Wang, Jianming Zhang, Haitian Zheng, Zhihong Ding, Scott Cohen, Zhe Lin, Wei Xiong, Chi-Wing Fu, Luis Figueroa, Soo Ye Kim

TL;DR

MetaShadow tackles the realism gap in object-centered image editing by unifying shadow detection, removal, and controllable synthesis into a single framework. It combines a Shadow Analyzer (GAN-based detection/removal) with a Shadow Synthesizer (reference-based diffusion) and introduces shadow knowledge transfer to align features across tasks, achieving state-of-the-art results on three shadow-related tasks. The authors also construct the MOS dataset and two real-world evaluation sets to train and assess performance, demonstrating significant improvements in detection, removal, and synthesis, and enabling robust object relocation and insertion with coherent shadows. This approach advances object-centered editing by ensuring shadows are consistently modeled and manipulated alongside objects, reducing the need for explicit lighting or geometric parameters in practical workflows.

Abstract

Shadows are often under-considered or even ignored in image editing applications, limiting the realism of the edited results. In this paper, we introduce MetaShadow, a three-in-one versatile framework that enables detection, removal, and controllable synthesis of shadows in natural images in an object-centered fashion. MetaShadow combines the strengths of two cooperative components: Shadow Analyzer, for object-centered shadow detection and removal, and Shadow Synthesizer, for reference-based controllable shadow synthesis. Notably, we optimize the learning of the intermediate features from Shadow Analyzer to guide Shadow Synthesizer to generate more realistic shadows that blend seamlessly with the scene. Extensive evaluations on multiple shadow benchmark datasets show significant improvements of MetaShadow over the existing state-of-the-art methods on object-centered shadow detection, removal, and synthesis. MetaShadow excels in image-editing tasks such as object removal, relocation, and insertion, pushing the boundaries of object-centered image editing.

Paper Structure

This paper contains 16 sections, 1 equation, 7 figures, 5 tables.

Figures (7)

  • Figure 1: We construct one training set, i.e., MOS Dataset, and two real-world evaluation sets, i.e., Moving DESOBA Dataset (without ground truths) and Video DESOBA Dataset (with ground truths), to train and evaluate the effectiveness of MetaShadow.
  • Figure 2: The schematic illustration of our MetaShadow framework. In Stage I, the Shadow Analyzer takes the input image with an object mask (left player) to perform object-centered shadow detection and removal. After that, the selected player, together with the detected shadow region, is moved to a new location. Stage II then takes these as input and synthesizes a shadow for this object. To achieve realistic shadow synthesis, we transfer the shadow knowledge extracted from the Shadow Analyzer to the Shadow Synthesizer as a reference. Note that $\boldsymbol{s}$ represents the global style code, $\boldsymbol{w}$ denotes the intermediate latent space, and "K", "Q", and "V" stand for key, query, and value in the UNet's cross-attention layer.
  • Figure 3: Respective limitations of GAN-based and diffusion-based methods on shadow synthesis. For more discussion, please see the ablation study section.
  • Figure 4: Visual comparison for object-centered shadow detection and removal tasks on the DESOBA test set. $\dagger$ denotes methods fine-tuned using our multi-source dataset strategy. Zoom in to see the details. For more results, please refer to the supplementary materials.
  • Figure 5: Visual comparison for object-centered shadow synthesis on the DESOBA test set (Hong et al., 2022), our Moving DESOBA test set, and Video DESOBA. Zoom in to see the details. For more results, please refer to the supplementary materials.
  • ...and 2 more figures
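
The Figure 2 caption refers to key, query, and value in the UNet's cross-attention layer. As a rough illustration only, the sketch below shows generic scaled dot-product cross-attention in plain Python. It assumes that queries come from the synthesizer's image tokens and that keys and values come from reference tokens carrying the transferred shadow knowledge; the caption does not spell out this assignment. The learned projection matrices are omitted for brevity, and all names (`cross_attention`, `image_tokens`, `reference_tokens`) are hypothetical and not taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (no learned projections).

    queries: one row per image token (assumed to come from the UNet).
    keys, values: one row per reference token (assumed to come from the
    reference features). Returns (attended tokens, attention weights).
    """
    dim = len(queries[0])
    scores = matmul(queries, transpose(keys))
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    return matmul(weights, values), weights


# Toy data: three image tokens attend over two reference tokens.
image_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
reference_tokens = [[1.0, 0.0], [0.0, 1.0]]

attended, weights = cross_attention(image_tokens, reference_tokens, reference_tokens)
```

Each image token ends up as a weighted mix of the reference tokens, with higher weight on the reference tokens it matches best. A real diffusion UNet would apply learned linear projections and multiple heads before this step.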