Correspondence-free online human motion retargeting

Rim Rekik, Mathieu Marsot, Anne-Hélène Olivier, Jean-Sébastien Franco, Stefanie Wuhrer

TL;DR

This work tackles correspondence-free online retargeting of dense 3D human motion onto arbitrary target shapes using unstructured 4D data. It fuses skeleton-based motion transfer with a geometry-aware deformation model, and learns long-term temporal context via a two-loss framework that combines $L_{motion}$ and $L_{geom}$. At inference, the pipeline runs online with a single forward pass per frame. The approach reports state-of-the-art results among correspondence-free methods on two test datasets, generalizes to unseen motions and body shapes, and can animate targets directly from raw 4D multi-view data. The results suggest applicability to avatar animation and 4D data augmentation, and the accompanying code is publicly available.

Abstract

We present a data-driven framework for unsupervised human motion retargeting that animates a target subject with the motion of a source subject. Our method is correspondence-free, requiring neither spatial correspondences between the source and target shapes nor temporal correspondences between different frames of the source motion. This makes it possible to animate a target shape with arbitrary sequences of humans in motion, possibly captured using 4D acquisition platforms or consumer devices. Our method unifies the advantages of two existing lines of work, namely skeletal motion retargeting, which leverages long-term temporal context, and surface-based retargeting, which preserves surface details, by combining a geometry-aware deformation model with a skeleton-aware motion transfer approach. The combined model thus takes long-term temporal context into account while preserving surface details. During inference, our method runs online, i.e., input can be processed serially, and retargeting is performed in a single forward pass per frame. Experiments show that including long-term temporal context during training improves the method's accuracy for skeletal motion and detail preservation. Furthermore, our method generalizes to unobserved motions and body shapes. We demonstrate that our method achieves state-of-the-art results on two test datasets and that it can be used to animate human models with the output of a multi-view acquisition platform. Code is available at https://gitlab.inria.fr/rrekikdi/human-motion-retargeting2023.
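
The online property described in the abstract (frames consumed serially, one pass per frame, context carried forward) can be illustrated with a minimal control-flow sketch. This is not the authors' implementation: `toy_motion_step` and `toy_deform` are hypothetical stand-ins for the skeletal retargeting stage and the deformation model, and they use only point-cloud centroids so that no point ordering or correspondence is needed. The smoothing factor `alpha` is likewise an assumption made for illustration.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Point = Tuple[float, float, float]
Cloud = List[Point]


def centroid(cloud: Cloud) -> Point:
    """Mean of a point cloud; independent of point order."""
    n = len(cloud)
    return (
        sum(p[0] for p in cloud) / n,
        sum(p[1] for p in cloud) / n,
        sum(p[2] for p in cloud) / n,
    )


def toy_motion_step(frame: Cloud, state: Optional[Point], alpha: float = 0.5) -> Point:
    """Hypothetical stand-in for the skeletal stage.

    Blends the current frame's centroid with the carried state, so the
    result depends on all frames seen so far (a crude temporal context).
    """
    c = centroid(frame)
    if state is None:
        return c
    return (
        alpha * c[0] + (1 - alpha) * state[0],
        alpha * c[1] + (1 - alpha) * state[1],
        alpha * c[2] + (1 - alpha) * state[2],
    )


def toy_deform(target: Cloud, state: Point) -> Cloud:
    """Hypothetical stand-in for the deformation stage.

    Translates the target so its centroid matches the state; the
    target's own geometry is left unchanged.
    """
    t = centroid(target)
    off = (state[0] - t[0], state[1] - t[1], state[2] - t[2])
    return [(p[0] + off[0], p[1] + off[1], p[2] + off[2]) for p in target]


def retarget_online(frames: Iterable[Cloud], target: Cloud) -> Iterator[Cloud]:
    """Process frames one at a time, yielding one output per input frame."""
    state: Optional[Point] = None
    for frame in frames:  # serial input; no look-ahead at future frames
        state = toy_motion_step(frame, state)
        yield toy_deform(target, state)


if __name__ == "__main__":
    source = [
        [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
        [(2.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
    ]
    target = [(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    for i, posed in enumerate(retarget_online(source, target)):
        print(i, posed)
```

Writing the loop as a generator keeps memory use constant in the sequence length and lets output frames be produced as soon as each input frame arrives, which is the practical meaning of "online" here. In the method itself, the two stand-in functions would be replaced by the learned models.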

Paper Structure (39 sections, 4 equations, 8 figures, 10 tables)

Figures (8)

  • Figure 2: Our method takes a source sequence of unstructured point clouds and a target point cloud as input, and outputs the target character performing the input motion. The method first retargets the source motion to the target character at the skeletal level (red part), and then adds the target's surface details to the resulting motion using a deformation model (green part).
  • Figure 3: Animating target shapes directly with untracked captured 4D data. We consider a walking motion (top) and a kicking motion (bottom), which are retargeted to a naked (left), a clothed (middle), and a CAD-generated (right) target shape.
  • Figure 4: Ablation of SMRM. Retargeting result on a challenging HipHop motion from a female to a male body shape. Quaternions are prone to twist; introducing $\mathcal{L}_{rot}$ improves the head and feet retargeting.
  • Figure 5: Ablation of 3D skeleton extraction with PointNet (middle) and PointFormer (right) on a challenging pose from the AMASS test set (left). Ground truth is shown in red, regressed results in blue.
  • Figure 6: Screenshot of the stimuli and questions presented to participants during the user study.
  • ...and 3 more figures