Deep Generative Adversarial Network for Occlusion Removal from a Single Image

Sankaraganesh Jonna, Moushumi Medhi, Rajiv Ranjan Sahay

TL;DR

The paper tackles occlusion removal from a single image with a two-stage framework: OccNet for fence-like occlusion segmentation, followed by a GAN-based inpainting network that jointly recovers structure and texture. The completion network uses a dense, dilated encoder–decoder with self-attention and gated convolutions. It is guided by texture and structure discriminators and trained with a composite loss $L=\lambda_1 L_{rec}+\lambda_2 L_{per}+\lambda_3 L_{str}+\lambda_4 L_{D_t}+\lambda_5 L_{D_s}$, together with a pretrained structure model that helps preserve geometry. Experiments on UCSD_Fence and IITKGP_Fence demonstrate strong segmentation performance, while results on Places2 and CelebA show artifact-free, semantically coherent inpainting that generalizes well to irregular masks. The approach enables automatic, single-shot occlusion removal suitable for surveillance and forensics, and introduces the IITKGP_Fence dataset as a benchmark for fence-like occlusions.
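
The composite loss above is a weighted sum of five terms. The sketch below only illustrates that combination step and is not the authors' implementation: the `LossWeights` container, the `composite_loss` function, and the default weight of 1.0 for every $\lambda_i$ are assumptions made for illustration, since this summary does not state the weight values or how each individual term is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical container for lambda_1..lambda_5.

    The defaults are placeholders; the summary does not give the real values.
    """
    rec: float = 1.0    # lambda_1, weight on the reconstruction loss L_rec
    per: float = 1.0    # lambda_2, weight on the perceptual loss L_per
    str_: float = 1.0   # lambda_3, weight on the structure loss L_str
    d_tex: float = 1.0  # lambda_4, weight on the texture-discriminator loss L_{D_t}
    d_str: float = 1.0  # lambda_5, weight on the structure-discriminator loss L_{D_s}


def composite_loss(l_rec, l_per, l_str, l_dt, l_ds, w=LossWeights()):
    """Return L = sum_i lambda_i * L_i for already-computed scalar loss terms."""
    return (
        w.rec * l_rec
        + w.per * l_per
        + w.str_ * l_str
        + w.d_tex * l_dt
        + w.d_str * l_ds
    )


if __name__ == "__main__":
    # Made-up per-term values, used only to exercise the function.
    print(composite_loss(1.0, 2.0, 3.0, 4.0, 5.0))
```

In a training loop the five inputs would be the per-batch loss values produced by the generator and the two discriminators; here they are plain numbers so that the weighting itself is easy to inspect.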

Abstract

The enhanced capabilities of inexpensive imaging devices have led to a tremendous increase in the acquisition and sharing of multimedia content over the Internet. Despite advances in imaging sensor technology, nuisances such as occlusions hamper photography and can degrade the performance of applications such as surveillance, detection, and recognition. Occlusion segmentation is difficult because of factors such as scale variations and illumination changes. Recovering a scene from behind foreground occlusions is also challenging, because the occluded regions must be estimated accurately while maintaining coherence with the surrounding context. Image de-fencing in particular presents its own set of challenges because of the diverse variations in fence shape, texture, color, and pattern, and the often cluttered environment. This study focuses on the automatic detection and removal of occlusions from a single image. We propose a fully automatic, two-stage convolutional neural network for fence segmentation and occlusion completion. We leverage generative adversarial networks (GANs) to synthesize realistic content, including both structure and texture, in a single shot for inpainting. To assess zero-shot generalization, we evaluated our trained occlusion detection model on our proposed fence-like occlusion segmentation dataset. The dataset can be found on GitHub.

Paper Structure (18 sections, 10 equations, 11 figures, 3 tables)

Figures (11)

  • Figure 1: OccNet: Deep fence-like occlusion segmentation architecture.
  • Figure 2: (a) Input image. (b)-(d) Edge-preserving structures obtained using $L_0$ [Xu_2011_TOG], RTV [Xu_2012_TOG], and the pretrained model in [Fan_2019_PAMI], respectively.
  • Figure 3: Schematic of the proposed image inpainting network.
  • Figure 4: Self-attention module [Ashish_2017_NIPS, Wang_2018_CVPR].
  • Figure 5: Sample fence images from the UCSD_Fence dataset [Du_2018_ICME].
  • ...and 6 more figures