Panoramic Image Inpainting With Gated Convolution And Contextual Reconstruction Loss
Li Yu, Yanjun Gao, Farhad Pakdaman, Moncef Gabbouj
TL;DR
This work tackles panoramic image inpainting by operating on inputs in Cubemap Projection (CMP) format and employing a two-generator architecture (Face Generator and Cube Generator) that uses gated convolutions to distinguish valid from invalid pixels. A side branch trained with a contextual reconstruction (CR) loss guides the network to select the most appropriate reference patches, while Slice and Whole discriminators enforce per-face realism and inter-face consistency. Training uses a WGAN framework with gradient penalty, augmented by $L_1$ losses on masked and unmasked regions and the CR loss $L_{CR}$ to reduce ambiguity in patch selection. On the SUN360 dataset, the method achieves higher PSNR and SSIM than state-of-the-art methods across mask ratios. Ablation studies confirm the contributions of gated convolutions and the CR loss, and the CMP input reduces pole distortion compared to approaches based on equirectangular projection (ERP).
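The $L_1$ terms on masked and unmasked regions can be illustrated with a small sketch. The mask convention (1 marks a missing pixel), the per-region mean normalization, and the function name `masked_l1_losses` are assumptions made for illustration; the summary above does not specify them.

```python
def masked_l1_losses(pred, target, mask):
    """Return (hole_l1, valid_l1) for single-channel images given as nested lists.

    Assumed convention: mask[i][j] == 1 marks a missing (masked) pixel,
    0 marks a valid (unmasked) pixel. Each loss is the mean absolute
    error over its own region, or 0.0 if that region is empty.
    """
    hole_sum, valid_sum = 0.0, 0.0
    hole_count, valid_count = 0, 0
    for pred_row, target_row, mask_row in zip(pred, target, mask):
        for p, t, m in zip(pred_row, target_row, mask_row):
            err = abs(p - t)
            if m:
                hole_sum += err
                hole_count += 1
            else:
                valid_sum += err
                valid_count += 1
    hole_l1 = hole_sum / hole_count if hole_count else 0.0
    valid_l1 = valid_sum / valid_count if valid_count else 0.0
    return hole_l1, valid_l1


# Tiny 2x2 example: two masked pixels, two valid pixels.
pred = [[1.0, 2.0], [3.0, 4.0]]
target = [[1.0, 1.0], [1.0, 4.0]]
mask = [[0, 1], [1, 0]]
hole_l1, valid_l1 = masked_l1_losses(pred, target, mask)
```

In a full training objective these two terms would typically be weighted and added to the adversarial and CR losses; the weights are not given here, so none are assumed.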
Abstract
Deep learning-based methods have demonstrated encouraging results in tackling the task of panoramic image inpainting. However, it is challenging for existing methods to distinguish valid pixels from invalid pixels and to find suitable references for corrupted areas, which leads to artifacts in the inpainted results. In response to these challenges, we propose a panoramic image inpainting framework that consists of a Face Generator, a Cube Generator, a side branch, and two discriminators. We use the Cubemap Projection (CMP) format as network input. The generators employ gated convolutions to distinguish valid pixels from invalid ones, while the side branch uses a contextual reconstruction (CR) loss to guide the generators to find the most suitable reference patch for inpainting the missing region. The proposed method is compared with state-of-the-art (SOTA) methods on the SUN360 Street View dataset in terms of PSNR and SSIM. Experimental results and an ablation study demonstrate that the proposed method outperforms SOTA methods both quantitatively and qualitatively.
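As a rough illustration of how a gated convolution separates valid from invalid pixels, the sketch below computes a feature response and a sigmoid gate from the same input and multiplies them elementwise, so a gate near 0 suppresses a location and a gate near 1 passes it through. It is a single-channel, pure-Python toy; the `tanh` feature activation, the kernel values, and the helper names are assumptions for illustration, not the paper's configuration.

```python
import math


def conv2d(image, kernel, bias=0.0):
    """Valid (no padding) single-channel 2D cross-correlation on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            bias + sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_conv2d(image, feature_kernel, gate_kernel, gate_bias=0.0):
    """Gated convolution: activation(feature response) * sigmoid(gate response).

    The feature activation here is tanh (an assumption); in a trained
    network both kernels are learned, so the gate acts as a soft,
    per-location validity mask.
    """
    features = conv2d(image, feature_kernel)
    gates = conv2d(image, gate_kernel, gate_bias)
    return [
        [math.tanh(f) * sigmoid(g) for f, g in zip(f_row, g_row)]
        for f_row, g_row in zip(features, gates)
    ]


# Toy input: 3x3 image of ones, 2x2 kernels.
image = [[1.0] * 3 for _ in range(3)]
feature_kernel = [[0.5, 0.5], [0.5, 0.5]]   # feature response = 2.0 everywhere
gate_kernel = [[0.0, 0.0], [0.0, 0.0]]      # gate response = gate_bias everywhere

half_open = gated_conv2d(image, feature_kernel, gate_kernel, gate_bias=0.0)
closed = gated_conv2d(image, feature_kernel, gate_kernel, gate_bias=-50.0)
```

With a zero gate response the sigmoid is 0.5, so each output is half of `tanh(2.0)`; with a strongly negative gate response the outputs are driven toward zero, which is how the gate can switch off contributions from invalid regions.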
