TAVA: Template-free Animatable Volumetric Actors

Ruilong Li, Julian Tanke, Minh Vo, Michael Zollhofer, Jurgen Gall, Angjoo Kanazawa, Christoph Lassner

TL;DR

TAVA introduces Template-free Animatable Volumetric Actors, a forward-skinning, canonical-space neural actor that animates from multi-view data without a fixed body template. It couples a canonical neural radiance field with a forward LBS-based deformation and a learned non-linear residual, enabling novel-pose animation and dense correspondences while supporting editing. The method achieves strong pose-generalization and competitive rendering quality against template-based baselines, and outperforms template-free methods on both human and animal subjects, with robust dense correspondence for content editing. By grounding deformation in a pose-independent canonical space and employing end-to-end training with ambient occlusion, TAVA offers a flexible, editable representation suitable for cross-species avatars and content-creation applications.
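The forward deformation summarized above can be illustrated with a minimal NumPy sketch of linear blend skinning plus an additive residual. This is not the authors' implementation: the function name `forward_lbs`, the array layouts, and the use of a plain array in place of the learned residual are assumptions made for illustration only.

```python
import numpy as np

def forward_lbs(x_c, weights, bone_transforms, residual=None):
    """Warp canonical points into posed space with linear blend skinning.

    x_c:             (N, 3) canonical-space points
    weights:         (N, B) skinning weights, each row summing to 1
    bone_transforms: (B, 4, 4) canonical-to-posed transforms, one per bone
    residual:        optional (N, 3) offset, a stand-in for a learned
                     pose-dependent non-linear term (assumption)
    """
    # Homogeneous coordinates so that 4x4 transforms can be applied directly.
    x_h = np.concatenate([x_c, np.ones((x_c.shape[0], 1))], axis=1)  # (N, 4)
    # Blend the per-bone transforms for every point: (N, 4, 4).
    blended = np.einsum("nb,bij->nij", weights, bone_transforms)
    # Apply each blended transform to its point and drop the homogeneous 1.
    x_d = np.einsum("nij,nj->ni", blended, x_h)[:, :3]
    if residual is not None:
        x_d = x_d + residual
    return x_d

# Two bones: one stays put, one translates by 2 along y.
transforms = np.stack([np.eye(4), np.eye(4)])
transforms[1, 1, 3] = 2.0
points = np.array([[0.0, 0.0, 0.0]])
half_half = np.array([[0.5, 0.5]])
posed = forward_lbs(points, half_half, transforms)
```

A point weighted equally between the two bones moves half of the second bone's translation, which is the blending behavior the skinning weights are meant to control.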

Abstract

Coordinate-based volumetric representations have the potential to generate photo-realistic virtual avatars from images. However, virtual avatars also need to be controllable, even into novel poses that may not have been observed. Traditional techniques, such as linear blend skinning (LBS), provide such a function; yet they usually require a hand-designed body template and 3D scan data, and offer only limited appearance models. On the other hand, neural representations have been shown to be powerful in representing visual details, but are underexplored for deforming dynamic articulated actors. In this paper, we propose TAVA, a method to create Template-free Animatable Volumetric Actors, based on neural representations. We rely solely on multi-view data and a tracked skeleton to create a volumetric model of an actor, which can be animated at test time given a novel pose. Since TAVA does not require a body template, it is applicable to humans as well as other creatures such as animals. Furthermore, TAVA is designed such that it can recover accurate dense correspondences, making it amenable to content-creation and editing tasks. Through extensive experiments, we demonstrate that the proposed method generalizes well to novel poses as well as unseen views and showcase basic editing capabilities.

Paper Structure (23 sections, 16 equations, 12 figures, 7 tables)

Figures (12)

  • Figure 1: Method Overview. Left: TAVA creates a virtual actor from multiple sparse video views as well as 3D poses. The same skeleton can later be used for animation. Center: TAVA uses this information to create a canonical shape and a pose-dependent skinning function and establishes correspondences across poses. The resulting model can be used for rendering and posing the virtual character as well as editing it. Right: the method can be directly used for other creatures as long as a 3D skeleton can be defined.
  • Figure 2: TAVA Overview. We use volumetric rendering techniques to create the actor representation. For each sampled point, we use an LBS-based non-linear deformation combined with a blending weight model, for which we identify the root in the canonical space. In this space, we use a color, density, and ambient occlusion model to parameterize the appearance.
  • Figure 3: Comparison with template-free methods on the Hare and Wolf subjects.
  • Figure 4: Rendering quality comparison with all baseline methods on the ZJU-Mocap Dataset. Note that Animatable-NeRF and NeuralBody rely on the SMPL body model, and the other approaches do not.
  • Figure 5: Rendering with Dense Correspondence. We show results of our novel-view rendering with dense correspondences. On the ZJU-Mocap dataset, correspondences across different subjects can also be built because they share the same canonical pose.
  • ...and 7 more figures
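
The Figure 2 caption mentions identifying a root in the canonical space: because the skinning is defined in the forward direction (canonical to posed), a point sampled in posed space has to be mapped back by solving for the canonical point that warps onto it. The sketch below shows one way this could look. The caption does not say which solver is used, so Newton's method with a finite-difference Jacobian is an assumption here, as are the toy distance-based weights (`soft_weights`) and all function names.

```python
import numpy as np

def soft_weights(x, centers):
    """Toy skinning weights (assumption): softmax over negative squared
    distance from the point to each bone center."""
    logits = -np.sum((x[None, :] - centers) ** 2, axis=1)
    e = np.exp(logits - logits.max())
    return e / e.sum()

def forward_warp(x_c, centers, bone_transforms):
    """Forward LBS for one canonical point with position-dependent weights."""
    w = soft_weights(x_c, centers)                          # (B,)
    blended = np.einsum("b,bij->ij", w, bone_transforms)    # (4, 4)
    return (blended @ np.append(x_c, 1.0))[:3]

def find_canonical(x_d, centers, bone_transforms, iters=20, eps=1e-5, tol=1e-10):
    """Solve forward_warp(x_c) = x_d for x_c with Newton's method."""
    x_c = np.array(x_d, dtype=float)  # initialize at the posed location
    for _ in range(iters):
        r = forward_warp(x_c, centers, bone_transforms) - x_d
        if np.linalg.norm(r) < tol:
            break
        # Central finite-difference Jacobian of the warp, one column per axis.
        J = np.stack(
            [
                (forward_warp(x_c + eps * e, centers, bone_transforms)
                 - forward_warp(x_c - eps * e, centers, bone_transforms)) / (2 * eps)
                for e in np.eye(3)
            ],
            axis=1,
        )
        x_c = x_c - np.linalg.solve(J, r)
    return x_c

# Two bones: the first is fixed, the second shifts by 0.2 along x.
centers = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
transforms = np.stack([np.eye(4), np.eye(4)])
transforms[1, 0, 3] = 0.2

x_true = np.array([0.3, 0.1, -0.2])
x_posed = forward_warp(x_true, centers, transforms)
x_recovered = find_canonical(x_posed, centers, transforms)
```

In this toy setup the warp is smooth and close to the identity, so the solve has a single root; with larger articulations a posed point may have several canonical candidates, which a real system would need to handle (for example, by trying multiple initializations).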