Analyzing and Improving the Skin Tone Consistency and Bias in Implicit 3D Relightable Face Generators

Libing Zeng, Nima Khademi Kalantari

TL;DR

This work normalizes the SH coefficients by their DC term to eliminate the inherent magnitude bias, statistically aligns the coefficients in the other bands to alleviate the directional bias, and proposes a scaling strategy to match the distribution of illumination magnitude in the generated images with that of the training data.

Abstract

With the advances in generative adversarial networks (GANs) and neural rendering, 3D relightable face generation has received significant attention. Among the existing methods, a particularly successful technique uses an implicit lighting representation and generates relit images through the product of synthesized albedo and light-dependent shading images. While this approach produces high-quality results with intricate shading details, it often has difficulty producing relit images with consistent skin tones, particularly when the lighting condition is extracted from images of individuals with dark skin. Additionally, this technique is biased towards producing albedo images with lighter skin tones. Our main observation is that this problem is rooted in the biased spherical harmonics (SH) coefficients used during training. Following this observation, we conduct an analysis and demonstrate that the bias appears not only in band 0 (DC term), but also in the other bands of the estimated SH coefficients. We then propose a simple but effective strategy to mitigate the problem. Specifically, we normalize the SH coefficients by their DC term to eliminate the inherent magnitude bias, and statistically align the coefficients in the other bands to alleviate the directional bias. We also propose a scaling strategy to match the distribution of illumination magnitude in the generated images with that of the training data. Through extensive experiments, we demonstrate the effectiveness of our solution in increasing the skin tone consistency and mitigating bias.
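The two coefficient-level steps named in the abstract (DC normalization and statistical alignment of the remaining bands) can be illustrated with a minimal sketch. Several details here are assumptions rather than the paper's stated procedure: each lighting is taken to be a single-channel vector of 9 second-order SH coefficients with the DC term at index 0, and "statistical alignment" is interpreted as per-coefficient mean and standard-deviation matching from a source group to a target group. The function names `normalize_by_dc` and `align_higher_bands` are hypothetical.

```python
from statistics import mean, pstdev


def normalize_by_dc(sh):
    """Divide all 9 coefficients by the band-0 (DC) term, so the DC term becomes 1."""
    dc = sh[0]
    if dc == 0:
        raise ValueError("DC term is zero; cannot normalize")
    return [c / dc for c in sh]


def align_higher_bands(source, target):
    """Match per-coefficient mean/std of the source group to the target group.

    Assumption: alignment is a per-dimension shift and scale applied to
    indices 1..8 (bands 1 and 2); index 0 (the DC term) is left untouched.
    """
    aligned = [list(s) for s in source]
    for k in range(1, 9):
        src_col = [s[k] for s in source]
        tgt_col = [t[k] for t in target]
        mu_s, sd_s = mean(src_col), pstdev(src_col)
        mu_t, sd_t = mean(tgt_col), pstdev(tgt_col)
        # Guard against a constant source column (zero spread).
        scale = sd_t / sd_s if sd_s > 0 else 1.0
        for row, v in zip(aligned, src_col):
            row[k] = (v - mu_s) * scale + mu_t
    return aligned


# Toy usage with made-up numbers (not data from the paper).
dark_raw = [[0.4, 0.10, 0.02, -0.05, 0.01, 0.00, 0.03, -0.02, 0.01],
            [0.5, 0.15, 0.01, -0.02, 0.02, 0.01, 0.02, -0.01, 0.00]]
other_raw = [[0.9, 0.05, 0.10, 0.02, 0.00, 0.03, 0.01, 0.02, -0.01],
             [1.1, 0.02, 0.12, 0.04, 0.01, 0.02, 0.00, 0.01, -0.02]]
dark = [normalize_by_dc(s) for s in dark_raw]
other = [normalize_by_dc(s) for s in other_raw]
dark_aligned = align_higher_bands(dark, other)
```

Under these assumptions, normalization removes the overall magnitude difference between groups, while the shift-and-scale step moves the higher-band statistics of one group onto those of the other. The scaling strategy for illumination magnitude mentioned in the abstract is not covered by this sketch.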

Paper Structure

This paper contains 17 sections, 4 equations, 10 figures, 5 tables.

Figures (10)

  • Figure 1: On the top, we show two relit images produced by NeRFFaceLighting (Jiang et al.), using the lighting extracted from images of individuals with fair and dark skin tones (shown on the right). As seen, NeRFFaceLighting produces relit images with inconsistent skin tones. Additionally, when distilling the EG3D triplane, NeRFFaceLighting tends to produce albedo maps that are biased towards lighter skin colors. Our method mitigates this bias and improves the consistency of the skin tone in relit images. Note that even though we use the same latent vector to generate the results with EG3D, NeRFFaceLighting, and ours, there are variations in the images because the backbone EG3D network is fine-tuned separately in NeRFFaceLighting and ours.
  • Figure 2: We visualize the $2^\text{nd}$-order SH coefficients estimated using SfSNet (Sengupta et al., 2018) and DECA (Feng et al., 2021) from 400 images with different skin colors (100 in each category). We use t-SNE to visualize the coefficients in 2D. The coefficients extracted from images with dark skin form a distinct cluster in both cases.
  • Figure 3: We visualize the SH coefficients, estimated by SfSNet, in band 0 and in the other bands. We augment the one-dimensional coefficients in band 0 with an additional randomly filled dimension for better visualization. For the other bands, we use t-SNE to reduce the dimensions from eight to two. As seen, the bias is not limited to the magnitude of the lighting (band 0) and also appears in the higher-order SH coefficients.
  • Figure 4: On the left, we visualize the SH coefficients after normalization. On the right, we showcase the normalized SH coefficients with statistical alignment of the coefficients of dark to non-dark skin tones. This approach effectively mitigates bias in the estimated SH coefficients.
  • Figure 5: We compare different approaches by producing relit images using two target lightings. Our approach produces results with consistent skin tone for both lightings.
  • ...and 5 more figures