Inpainting Normal Maps for Lightstage data

Hancheng Zuo, Bernard Tiddeman

TL;DR

The paper addresses inpainting of normal maps derived from lightstage data, where motion can obscure regions that must be plausibly reconstructed. It proposes a DCGAN-based inpainting framework with a bowtie-like generator and a discriminator, using masked normal-map inputs and a mask-channel approach, and optimizes a composite loss that blends a cosine-based reconstruction term with an adversarial term. To cope with limited training data, it employs extensive data augmentation and mask-style variations, and evaluates performance across three mask types and mask-input configurations using SSIM, PSNR, and discriminator accuracy. The results show the method can produce high-quality, realistic inpainted normal maps, with conclusions pointing to future improvements such as U-Net variants, normal-map integrability checks, and enhanced perceptual evaluation for relighting scenarios.
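
As a rough illustration of the composite loss described above, the sketch below blends a cosine-based reconstruction term with an adversarial term using NumPy. It is a minimal sketch, not the authors' implementation: the function names, the binary cross-entropy form of the adversarial term, and the blending factor `adv_weight` are all assumptions made for illustration, and normals are assumed to be stored as decoded vectors in an array of shape (H, W, 3).

```python
import numpy as np


def cosine_loss(pred, target, eps=1e-8):
    """Mean of (1 - cos(angle)) between predicted and ground-truth normals.

    pred, target: float arrays of shape (H, W, 3) holding normal vectors.
    The loss is 0 when the vectors are parallel and 2 when they are opposite.
    """
    dot = np.sum(pred * target, axis=-1)
    norms = np.linalg.norm(pred, axis=-1) * np.linalg.norm(target, axis=-1)
    return float(np.mean(1.0 - dot / (norms + eps)))


def adversarial_loss(d_fake, eps=1e-8):
    """Binary cross-entropy that rewards the generator when the
    discriminator scores its outputs close to 1 ("real")."""
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return float(np.mean(-np.log(d_fake)))


def generator_loss(pred, target, d_fake, adv_weight=0.01):
    """Weighted blend of the two terms.

    adv_weight is a hypothetical blending factor, not a value from the paper.
    """
    recon = cosine_loss(pred, target)
    adv = adversarial_loss(d_fake)
    return (1.0 - adv_weight) * recon + adv_weight * adv
```

Unlike mean squared error, the cosine term depends only on the direction of each vector, which matches the fact that a normal map encodes orientation rather than magnitude.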

Abstract

This study introduces a novel method for inpainting normal maps using a generative adversarial network (GAN). Normal maps, often derived from a lightstage, are crucial in performance capture but can have obscured areas due to movement (e.g., by arms, hair, or props). Inpainting fills these missing areas with plausible data. Our approach extends previous general image inpainting techniques, employing a bow tie-like generator network and a discriminator network, with alternating training phases. The generator aims to synthesize images aligning with the ground truth and deceive the discriminator, which differentiates between real and processed images. Periodically, the discriminator undergoes retraining to enhance its ability to identify processed images. Importantly, our method adapts to the unique characteristics of normal map data, necessitating modifications to the loss function. We utilize a cosine loss instead of mean squared error loss for generator training. Limited training data availability, even with synthetic datasets, demands significant augmentation, considering the specific nature of the input data. This includes appropriate image flipping and in-plane rotations to accurately alter normal vectors. Throughout training, we monitored key metrics such as average loss, Structural Similarity Index Measure (SSIM), and Peak Signal-to-Noise Ratio (PSNR) for the generator, along with average loss and accuracy for the discriminator. Our findings suggest that the proposed model effectively generates high-quality, realistic inpainted normal maps, suitable for performance capture applications. These results establish a foundation for future research, potentially involving more advanced networks and comparisons with inpainting of source images used to create the normal maps.
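
The remark that flipping and in-plane rotation must "accurately alter normal vectors" can be made concrete with a small NumPy sketch. It is an illustration under stated assumptions rather than the authors' augmentation code: normals are assumed to be decoded to (nx, ny, nz) vectors with x pointing right and y pointing up in the displayed image, and the helper names are hypothetical. Only a horizontal flip and an exact 90° rotation are shown; arbitrary angles would additionally require resampling the pixel grid.

```python
import numpy as np


def flip_horizontal(normals):
    """Mirror the map left-right and negate the x component of every normal."""
    flipped = normals[:, ::-1, :].copy()
    flipped[..., 0] *= -1.0
    return flipped


def rotate_vectors_in_plane(normals, angle_rad):
    """Rotate the (x, y) components counterclockwise by angle_rad.

    The z component is unchanged. Assumes x points right and y points up.
    """
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    out = normals.copy()
    out[..., 0] = c * normals[..., 0] - s * normals[..., 1]
    out[..., 1] = s * normals[..., 0] + c * normals[..., 1]
    return out


def rotate90_ccw(normals):
    """Rotate the pixel grid 90 degrees counterclockwise, then rotate the
    vectors by the same angle so they stay consistent with the geometry."""
    grid = np.rot90(normals, k=1, axes=(0, 1))
    return rotate_vectors_in_plane(grid, np.pi / 2)
```

Moving the pixels without also transforming the vectors would leave normals pointing in directions that no longer agree with the surface they describe, which is why ordinary image augmentation cannot be applied to normal maps unchanged.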

Paper Structure

This paper contains 14 sections, 3 equations, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: Examples of augmentation. Each row, from left to right: original; flipped; random zoom and rotation; flipped with a different random zoom and rotation.
  • Figure 2: Overview of the GAN model for 256x256 input
  • Figure 3: Results on the Irregular Lines mask. Top to bottom: with and without the mask as an input. Left to right: raw image, masked image, predicted image, and predicted image in the masked region only.
  • Figure 4: Top to bottom: performance on the Irregular Lines, Single Big Blob, and Scattered Smaller Blobs masks. Left to right: raw image, masked image, predicted image, and predicted image in the masked region only.