FFaceNeRF: Few-shot Face Editing in Neural Radiance Fields

Kwan Yun, Chaelin Kim, Hangyeul Shin, Junyong Noh

TL;DR

FFaceNeRF tackles flexible, 3D-aware, mask-based face editing by introducing a geometry adapter that adapts a model trained on a fixed segmentation layout to user-specified mask layouts using very few training samples. The approach uses Latent Mixing for Triplane Augmentation (LMTA) to diversify tri-plane inputs during training and an overlap-based optimization to handle small edited regions during inference. Empirical results show higher fidelity to target masks, better multi-view consistency, and more favorable perceptual judgments than baselines, and ablations confirm the importance of feature injection and LMTA. The method supports practical applications such as partial style transfer and can extend to related frameworks such as FFaceGAN, suggesting potential for rapid, high-fidelity 3D face editing in domains including personalized avatars and medical visualization.

Abstract

Recent 3D face editing methods using masks have produced high-quality edited images by leveraging Neural Radiance Fields (NeRF). Despite their impressive performance, existing methods often provide limited user control due to the use of pre-trained segmentation masks. To utilize masks with a desired layout, an extensive training dataset is required, which is challenging to gather. We present FFaceNeRF, a NeRF-based face editing technique that can overcome the challenge of limited user control due to the use of fixed mask layouts. Our method employs a geometry adapter with feature injection, allowing for effective manipulation of geometry attributes. Additionally, we adopt latent mixing for tri-plane augmentation, which enables training with a few samples. This facilitates rapid model adaptation to desired mask layouts, crucial for applications in fields like personalized medical imaging or creative face editing. Our comparative evaluations demonstrate that FFaceNeRF surpasses existing mask-based face editing methods in terms of flexibility, control, and generated image quality, paving the way for future advancements in customized and high-fidelity 3D face editing. The code is available on the project page: https://kwanyun.github.io/FFaceNeRF_page/
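
The latent mixing mentioned in the abstract can be pictured as style mixing over per-layer latent codes. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a StyleGAN-style generator with one latent vector per layer (as in EG3D-like backbones), and the names `mix_latents`, `random_latent`, `NUM_LAYERS`, `DIM`, and the choice `start_layer=8` are hypothetical. It only builds the mixed latents; feeding them to a generator to obtain augmented tri-planes is outside the sketch.

```python
import random


def random_latent(num_layers, dim, rng):
    """Sample a hypothetical per-layer latent code: num_layers vectors of size dim."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_layers)]


def mix_latents(w_src, w_rand, start_layer):
    """Keep layers [0, start_layer) from w_src and take the rest from w_rand.

    A larger start_layer mixes only later layers, so less of the source
    latent is replaced.
    """
    if len(w_src) != len(w_rand):
        raise ValueError("latents must have the same number of layers")
    if not 0 <= start_layer <= len(w_src):
        raise ValueError("start_layer is out of range")
    kept = [list(layer) for layer in w_src[:start_layer]]
    swapped = [list(layer) for layer in w_rand[start_layer:]]
    return kept + swapped


# Assumed sizes, chosen only for illustration.
NUM_LAYERS, DIM = 14, 8
rng = random.Random(0)

# One source latent (standing in for one of the few training samples) ...
w_src = random_latent(NUM_LAYERS, DIM, rng)

# ... expanded into several augmented latents by mixing in random later layers.
augmented = [
    mix_latents(w_src, random_latent(NUM_LAYERS, DIM, rng), start_layer=8)
    for _ in range(4)
]
```

Each entry of `augmented` shares its first eight layers with `w_src` and differs in the remaining ones, which is the sense in which a handful of samples can be expanded into a more varied training set.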

Paper Structure

This paper contains 24 sections, 4 equations, 14 figures, 3 tables.

Figures (14)

  • Figure 1: Results of FFaceNeRF. With few-shot training, our method can edit 3D-aware images from desired layouts.
  • Figure 2: Pretraining stage of FFaceNeRF following EG3D [chan2022efficient] and NeRFFaceEditing [jiang2022nerffaceediting] for disentangled representation.
  • Figure 3: Overview of FFaceNeRF. LMTA is conducted during the training of $\Phi_{geo}$. $\Phi_{geo}$ takes as input the concatenation of normalized tri-plane feature $\hat{F'}_{tri}$ (yellow box), view direction $v_d$ (white box), outputs of $\Psi_{geo}$, which are segmentation labels $Seg$ (blue box), and density $\sigma$ (red box). Density $\sigma$ is directly used from the output of $\Psi_{geo}$, without further training using $\Phi_{geo}$.
  • Figure 4: Examples of dataset with different segmentation layouts. Green boxes are close-up views of eye regions while red boxes are close-up views of nose regions.
  • Figure 5: Semantics-augmentation tradeoff: Mixing earlier layers changes both the semantics and the tri-plane features substantially (high L1, low mIoU), whereas mixing later layers changes both only slightly (low L1, high mIoU).
  • ...and 9 more figures
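
The Figure 5 caption refers to two quantities, an L1 distance and mIoU. How the paper computes them is not stated on this page, so the following is only a generic sketch of the standard definitions: `l1_distance` is the mean absolute difference between two flattened feature vectors, and `mean_iou` averages per-class intersection-over-union over flattened label maps. The function names and the toy inputs are hypothetical.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    if len(pred) != len(target):
        raise ValueError("label maps must have the same length")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in either label map")
    return sum(ious) / len(ious)


# Toy inputs: flattened features and flattened segmentation labels.
features_before = [0.0, 1.0, 2.0]
features_after = [0.5, 1.0, 1.0]
labels_before = [0, 0, 1, 1, 2, 2]
labels_after = [0, 1, 1, 1, 2, 0]

feature_change = l1_distance(features_before, features_after)
semantic_agreement = mean_iou(labels_before, labels_after, num_classes=3)
```

Read against the caption, a high `feature_change` with a low `semantic_agreement` corresponds to mixing earlier layers, and the opposite pattern corresponds to mixing later layers.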