MotionWavelet: Human Motion Prediction via Wavelet Manifold Learning

Yuming Feng, Zhiyang Dou, Ling-Hao Chen, Yuan Liu, Tianyu Li, Jingbo Wang, Zeyu Cao, Wenping Wang, Taku Komura, Lingjie Liu

TL;DR

MotionWavelet is a human motion prediction framework that applies the wavelet transform to study human motion patterns in the spatial-frequency domain and uses Temporal Attention-Based Guidance to enhance prediction accuracy.

Abstract

Modeling the temporal characteristics and non-stationary dynamics of body movement plays a significant role in predicting future human motion. However, these features are difficult to capture because complex human motions involve subtle transitions. This paper introduces MotionWavelet, a human motion prediction framework that applies the wavelet transform to study human motion patterns in the spatial-frequency domain. In MotionWavelet, a Wavelet Diffusion Model (WDM) learns a Wavelet Manifold by applying the wavelet transform to the motion data, thereby encoding intricate spatial and temporal motion patterns. Once the Wavelet Manifold is built, WDM trains a diffusion model to generate human motions from wavelet latent vectors. In addition to the WDM, MotionWavelet presents a Wavelet Space Shaping Guidance mechanism that refines the denoising process to improve conformity with the manifold structure. WDM also develops Temporal Attention-Based Guidance to enhance prediction accuracy. Extensive experiments validate the effectiveness of MotionWavelet, demonstrating improved prediction accuracy and enhanced generalization across various benchmarks. Our code and models will be released upon acceptance.
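To make the idea of applying a wavelet transform to motion data concrete, here is a minimal sketch of a one-level Haar transform along the time axis of a pose sequence. It is an illustration only, not the authors' implementation: the choice of the Haar basis, the transform along time only, the function names `haar_dwt_time` and `haar_idwt_time`, and the `(frames, joints, 3)` array layout are all assumptions, since the abstract does not state which wavelet or which axes MotionWavelet uses.

```python
import numpy as np


def haar_dwt_time(motion):
    """One-level Haar DWT along axis 0 (time) of a (T, J, 3) array; T must be even.

    Hypothetical helper for illustration; not taken from the paper.
    """
    even, odd = motion[0::2], motion[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-frequency band: coarse trajectory
    detail = (even - odd) / np.sqrt(2.0)  # high-frequency band: frame-to-frame change
    return approx, detail


def haar_idwt_time(approx, detail):
    """Inverse of haar_dwt_time: interleave the reconstructed even and odd frames."""
    even = (approx + detail) / np.sqrt(2.0)
    odd = (approx - detail) / np.sqrt(2.0)
    out = np.empty((2 * approx.shape[0],) + approx.shape[1:], dtype=approx.dtype)
    out[0::2], out[1::2] = even, odd
    return out


# Assumed toy input: 8 frames, 17 joints, xyz coordinates (random, not real data).
rng = np.random.default_rng(0)
motion = rng.standard_normal((8, 17, 3))

approx, detail = haar_dwt_time(motion)
recon = haar_idwt_time(approx, detail)
```

Because the Haar transform is orthonormal, the two sub-bands together hold exactly the information in the original sequence, so a generative model operating on wavelet coefficients can, in principle, be mapped back to poses without loss.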

Paper Structure

This paper contains 23 sections, 8 equations, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Qualitative comparisons. The upper part shows predictions for Human3.6M [ionescu2013human3], and the bottom part for HumanEva-I [sigal2010humaneva]. The first row in each part shows the ground-truth motion. Predictions closer to the ground-truth motion are better.
  • Figure 2: More qualitative results of MotionWavelet, where the green-purple skeletons represent the observed motions, the blue-purple skeletons represent the GT motions, and the red-black skeletons represent the predicted motions. We visualize 10 predicted samples without overlay.
  • Figure 3: More qualitative results of MotionWavelet, where the green-purple skeletons represent the observed motions, and the red-black skeletons represent the predicted motions. We visualize 10 predicted samples. Our method produces high-fidelity and diverse motion prediction results.
  • Figure 4: Visualization of GT and predicted motion curves of the left wrist for "Walking Dog" and the right arm for "Discussion" in Human3.6M (vel. denotes velocity). The red curve represents the GT motion and the blue curve the MotionWavelet prediction. The purple curve represents HumanMAC, and the green curve represents DLow. MotionWavelet achieves better alignment with the ground-truth motion.
  • Figure 5: Visualizations showcasing the joint-level control motion prediction results of MotionWavelet. The green-purple skeletons represent the observed joint motions, while the red-black skeletons represent 10 end poses of the predicted motions. The controlled joints are highlighted in yellow for clarity. The GT motions are in blue-purple.
  • ...and 5 more figures