PrismAvatar: Real-time animated 3D neural head avatars on edge devices

Prashant Raina, Felix Taubner, Mathieu Tuli, Eu Wern Teh, Kevin Ferreira

TL;DR

PrismAvatar targets real-time neural head avatars on edge devices by coupling a FLAME-based 3DMM with a deformable NeRF defined over a rigged prism lattice. Training uses a hybrid mesh-volume representation; the deformable NeRF is then distilled into a rigged triangle mesh with neural textures for fast, rasterization-based inference. The approach runs at 60 fps in mobile browsers with low memory usage while delivering image quality comparable to state-of-the-art desktop avatar models. Because the lattice covers regions the 3DMM does not represent, such as hair and facial hair, the resulting avatars are practical for web and edge deployment.

Abstract

We present PrismAvatar: a 3D head avatar model which is designed specifically to enable real-time animation and rendering on resource-constrained edge devices, while still enjoying the benefits of neural volumetric rendering at training time. By integrating a rigged prism lattice with a 3D morphable head model, we use a hybrid rendering model to simultaneously reconstruct a mesh-based head and a deformable NeRF model for regions not represented by the 3DMM. We then distill the deformable NeRF into a rigged mesh and neural textures, which can be animated and rendered efficiently within the constraints of the traditional triangle rendering pipeline. In addition to running at 60 fps with low memory usage on mobile devices, we find that our trained models have comparable quality to state-of-the-art 3D avatar models on desktop devices.

Paper Structure

This paper contains 17 sections, 4 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Left: The volumetric field for hair in our hybrid model is defined over a prism lattice, constructed as described in Section \ref{subsec_lattice}. Right: At the time of export (Section \ref{subsec_model_export}), we prune the lattice to remove triangles that are invisible or occluded.
  • Figure 2: Illustration of different scenarios in ray intersection. Ray $R_1$ hits the FLAME mesh first, so its color is sampled from a learned texture. Ray $R_3$ intersects only triangles of the lattice, so its color is obtained by a volume rendering integral over the intersection points. Ray $R_2$ intersects the lattice before terminating on the FLAME mesh, so its color is obtained by interpolating the volume rendering integral before the last intersection with the color of the final intersection, using the accumulated opacity as the interpolation factor.
  • Figure 3: A sample of head avatars reconstructed using our method.
  • Figure 4: Using a prism lattice which covers the face allows us to reconstruct facial hair. Thick dark hair and thin blond hair are both reconstructed by our method. Top: The prism lattice for facial hair and an example of a reconstructed avatar. Bottom: Frames showing the deformation of the mustache in response to changes in the facial expression.
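
The three ray cases in the Figure 2 caption can be illustrated with a small sketch. This is not the authors' implementation: the function name `composite_ray`, the per-intersection `(alpha, rgb)` samples, and the background fallback for rays that never reach the mesh are all assumptions made for illustration. It uses standard front-to-back alpha compositing, which is one common way to discretize a volume rendering integral, and then blends in the mesh color weighted by the remaining transmittance (one minus the accumulated opacity).

```python
from typing import Optional, Sequence, Tuple

Color = Tuple[float, float, float]


def composite_ray(
    lattice_hits: Sequence[Tuple[float, Color]],
    mesh_color: Optional[Color] = None,
    background: Color = (0.0, 0.0, 0.0),
) -> Color:
    """Composite one ray (hypothetical helper, not from the paper).

    lattice_hits: front-to-back (alpha, rgb) samples at the lattice-triangle
        intersections that lie in front of the mesh hit, if any.
    mesh_color: texture color where the ray terminates on the head mesh,
        or None if the ray never hits the mesh.
    background: assumed fallback color for rays that miss the mesh.
    """
    transmittance = 1.0          # fraction of light still passing through
    accum = [0.0, 0.0, 0.0]      # opacity-weighted color accumulated so far

    for alpha, rgb in lattice_hits:
        weight = transmittance * alpha
        for c in range(3):
            accum[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha

    opacity = 1.0 - transmittance  # accumulated opacity of the lattice samples
    final = mesh_color if mesh_color is not None else background

    # Interpolate between the volume result and the terminating color,
    # using the accumulated opacity as the interpolation factor.
    r, g, b = (accum[c] + (1.0 - opacity) * final[c] for c in range(3))
    return (r, g, b)


# R1: the ray hits the mesh first, so the texture color passes through unchanged.
r1 = composite_ray([], mesh_color=(0.8, 0.6, 0.5))

# R2: one half-opaque red lattice sample in front of a blue mesh hit.
r2 = composite_ray([(0.5, (1.0, 0.0, 0.0))], mesh_color=(0.0, 0.0, 1.0))

# R3: lattice samples only; the remainder falls back to the background.
r3 = composite_ray([(0.5, (1.0, 0.0, 0.0)), (0.5, (0.0, 1.0, 0.0))])
```

In this sketch, case R1 is simply the degenerate case with no lattice samples, so a single code path covers all three scenarios.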