NBAvatar: Neural Billboards Avatars with Realistic Hand-Face Interaction

David Svitov, Mahtab Dahaghin

Abstract

We present NBAvatar, a method for realistic rendering of head avatars that handles non-rigid deformations caused by hand-face interaction. We introduce a novel representation for animated avatars that combines trainable oriented planar primitives with neural rendering. This combination of explicit and implicit representations gives NBAvatar temporally consistent and pose-consistent geometry, while the neural rendering technique supplies fine-grained appearance details. In our experiments, we demonstrate that NBAvatar implicitly learns color transformations caused by hand-face interactions and surpasses existing approaches in novel-view and novel-pose rendering quality. Specifically, NBAvatar achieves up to 30% LPIPS reduction under high-resolution megapixel rendering compared to Gaussian-based avatar methods, while also improving PSNR and SSIM, and achieves higher structural similarity than the state-of-the-art hand-face interaction method InteractAvatar.

Paper Structure

This paper contains 19 sections, 8 equations, 10 figures, 7 tables.

Figures (10)

  • Figure 1: Novel view synthesis. We present a method for photo-realistic novel-view rendering of non-rigid hand-face interaction scenarios for human avatars. a) Our method achieves photo-realistic rendering quality for novel views and poses, surpassing the results of the state-of-the-art InteractAvatar [chen2025interactavatar]. b) The method enables cross-actor enactment by transferring hand and face poses across different subjects.
  • Figure 2: Method description. a) Given a multi-view input video, we fit the FLAME [li2017learning] and MANO [romero2022embodied] parametric models and reconstruct coarse face deformation with position-based dynamics (PBD) [muller2007position]. b) Then, for each polygon in the mesh, we anchor a Neural Billboard by computing its orientation as an offset relative to the polygon. We rasterize the Neural Billboards into a 6-channel image and a corresponding alpha map. c) Finally, we transform this rasterization to RGB with the U-Net renderer $R$ while regularizing the alpha map to fit the ground-truth silhouette.
  • Figure 3: Neural billboards rasterization. a) We initialize the alpha texture $T_i^{\alpha}$ with a Gaussian distribution and fill the neural texture $T_i^{\textrm{NT}}$ with the corresponding spectral coordinates on the mesh surface. During training, we optimize both textures with gradients from the rasterizer. b) We rasterize a screen point $x$ by accumulating the corresponding texture values along the ray. The resulting $I_{f}^{\textrm{NB}}$ is transformed to RGB by a trainable renderer $R$.
  • Figure 4: Qualitative comparison with InteractAvatar [chen2025interactavatar]. Despite achieving lower PSNR under their evaluation protocol, NBAvatar produces sharper details and more realistic hand-face deformations. InteractAvatar exhibits characteristic 3DGS artifacts, including blurry facial textures and protruding Gaussians along the avatar boundary.
  • Figure 5: Qualitative comparison on novel views. We compare our method with SplattingAvatar [shao2024splattingavatar] and GaussianAvatars [qian2024gaussianavatars] on the Decaf [shimada2023decaf] dataset. NBAvatar produces sharp, high-fidelity reconstructions of non-rigid facial deformations and dynamic hand appearance, accurately capturing facial details.
  • ...and 5 more figures
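
The rasterization step described in the Figure 3 caption can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "accumulating texture values along the ray" means standard front-to-back alpha compositing, and the function names, the texture size, and the `sigma` parameter are hypothetical choices made for illustration only.

```python
import math


def gaussian_alpha_texture(size, sigma=0.35):
    """Build a size x size alpha texture with a Gaussian falloff.

    Texel centres are mapped to [-1, 1] x [-1, 1]; sigma is a
    hypothetical width, not a value taken from the paper.
    """
    texture = []
    for row in range(size):
        v = (row + 0.5) / size * 2.0 - 1.0
        line = []
        for col in range(size):
            u = (col + 0.5) / size * 2.0 - 1.0
            line.append(math.exp(-(u * u + v * v) / (2.0 * sigma * sigma)))
        texture.append(line)
    return texture


def composite_along_ray(samples, channels=6):
    """Accumulate billboard samples hit by one ray, nearest first.

    samples: list of (depth, alpha, features) tuples, one per billboard
    the ray intersects, where features has `channels` entries.
    Returns the accumulated feature vector and the accumulated alpha.
    Assumption: front-to-back alpha compositing.
    """
    features = [0.0] * channels
    transmittance = 1.0  # fraction of the ray not yet absorbed
    for _, alpha, feat in sorted(samples, key=lambda s: s[0]):
        weight = transmittance * alpha
        for c in range(channels):
            features[c] += weight * feat[c]
        transmittance *= 1.0 - alpha
    return features, 1.0 - transmittance


if __name__ == "__main__":
    alpha_texture = gaussian_alpha_texture(5)
    print(alpha_texture[2][2])  # centre texel of the alpha texture
    # Two billboards on one ray; the nearer one (depth 1.0) is blended first.
    ray_samples = [(2.0, 0.5, [1.0] * 6), (1.0, 0.5, [0.0] * 6)]
    print(composite_along_ray(ray_samples))
```

Under this assumption, the per-pixel output is a 6-channel feature vector plus an alpha value, matching the 6-channel image and alpha map that the caption says are passed to the renderer $R$ and the silhouette regularizer.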