
S3R-Net: A Single-Stage Approach to Self-Supervised Shadow Removal

Nikolina Kubiak, Armin Mustafa, Graeme Phillipson, Stephen Jolly, Simon Hadfield

TL;DR

S3R-Net tackles shadow removal under minimal supervision by employing a unidirectional two-branch network guided by a unify-and-adapt self-supervision paradigm. It learns to map differently shadowed inputs to a uniform shadow-free output and then adapts this output to a shadow-free reference domain via adversarial learning, without requiring paired shadow-free ground truth. The method introduces a set of losses ($L_{os}$, $L_{perc}$, $L_{sfr}$, $L_{feat}$, $L_{id}$) and a GAN objective, achieving competitive RMSE on ISTD and AISTD while delivering superior qualitative results at a lower computational cost. These findings indicate that self-supervised, cycle-free shadow removal can generalize well to real-world scenes with reduced data annotation, making it practical as a pre-processing step for downstream vision tasks.

Abstract

In this paper we present S3R-Net, the Self-Supervised Shadow Removal Network. The two-branch WGAN model achieves self-supervision by relying on the unify-and-adapt phenomenon: it unifies the style of the output data and infers its characteristics from a database of unaligned shadow-free reference images. This approach stands in contrast to the large body of supervised frameworks. S3R-Net also differentiates itself from the few existing self-supervised models operating in a cycle-consistent manner, as it is a non-cyclic, unidirectional solution. The proposed framework achieves numerical scores comparable to recent self-supervised shadow removal models while exhibiting superior qualitative performance and keeping the computational cost low.
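The unify-and-adapt idea can be illustrated with a toy sketch. This is not the authors' implementation: the per-pixel affine `generator`, the linear `critic`, the synthetic shadow masks, and the specific L1 form of the unify term are all assumptions made for illustration, and none of the paper's named losses are reproduced here. The sketch only shows the two signals described above: two differently shadowed inputs pass through one shared-weight generator and their outputs are pulled together (unify), while a Wasserstein-style critic compares those outputs against an unaligned shadow-free reference (adapt).

```python
# Toy illustration of a unify-and-adapt objective (all names are hypothetical).

def generator(image, weights):
    """Stand-in for the shadow removal network: a per-pixel affine map."""
    gain, bias = weights
    return [gain * p + bias for p in image]

def critic(image, w):
    """Stand-in for a WGAN critic: a linear score of the mean intensity."""
    return w * sum(image) / len(image)

def unify_loss(out_a, out_b):
    """Assumed unify term: mean absolute difference between branch outputs."""
    return sum(abs(a - b) for a, b in zip(out_a, out_b)) / len(out_a)

def generator_adv_loss(outputs, critic_w):
    """WGAN generator loss: negative mean critic score of generated images."""
    return -sum(critic(o, critic_w) for o in outputs) / len(outputs)

def critic_loss(outputs, references, critic_w):
    """WGAN critic loss E[D(fake)] - E[D(real)]; Lipschitz constraint omitted."""
    fake = sum(critic(o, critic_w) for o in outputs) / len(outputs)
    real = sum(critic(r, critic_w) for r in references) / len(references)
    return fake - real

# Two differently shadowed versions of one scene (synthetic, for illustration).
scene = [0.8, 0.6, 0.7, 0.9]
shadowed_a = [p * 0.5 if i < 2 else p for i, p in enumerate(scene)]
shadowed_b = [p * 0.5 if i >= 2 else p for i, p in enumerate(scene)]
# An unaligned shadow-free reference: a different image, not a paired target.
reference = [0.75, 0.85, 0.65, 0.8]

weights = (1.0, 0.0)  # identity generator, so the shadows are still present
out_a = generator(shadowed_a, weights)  # both branches use the same weights
out_b = generator(shadowed_b, weights)

l_unify = unify_loss(out_a, out_b)                      # "unify" signal
l_adv = generator_adv_loss([out_a, out_b], 1.0)         # "adapt" signal
l_critic = critic_loss([out_a, out_b], [reference], 1.0)
```

With the identity generator the two outputs still disagree wherever the shadows differ, so the unify term is non-zero; a generator that removed both shadows would drive it toward zero without ever seeing a paired shadow-free target.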

Paper Structure (12 sections, 11 equations, 6 figures, 3 tables)


Figures (6)

  • Figure 1: We present the architectures of a standard supervised shadow removal model (bottom left), a cycle-consistent self-supervised model (top) and the proposed S3R-Net (bottom right), exploiting a unify-and-adapt approach to self-supervision. The figure shows model inputs, key modules and the sources of supervisory signal. G and D denote the generator and the discriminator of a GAN, and SF/S subscripts are used to indicate networks generating/discriminating shadow-free/shadowed data.
  • Figure 2: Our S3R-Net system and its losses. The generators (G) shown above are exactly the same model, sharing the same weights.
  • Figure 3: Ablation study: visual impact of S3R-Net's losses.
  • Figure 4: Visual results on the ISTD dataset.
  • Figure 5: Model error vs train-time GFLOPS comparison between each model's generator(s)+discriminator(s) trained on full-size ISTD images. Circle radius is proportional to the total number of model parameters.
  • ...and 1 more figure