Human Pose Estimation in Trampoline Gymnastics: Improving Performance Using a New Synthetic Dataset

Léa Drolet-Roy, Victor Nogues, Sylvain Gaudet, Eve Charbonneau, Mickaël Begon, Lama Séoud

Abstract

Trampoline gymnastics involves extreme human poses and uncommon viewpoints, on which state-of-the-art pose estimation models tend to under-perform. We demonstrate that this problem can be addressed by fine-tuning a pose estimation model on a dataset of synthetic trampoline poses (STP). STP is generated from motion capture recordings of trampoline routines. We develop a pipeline to fit noisy motion capture data to a parametric human model, then generate realistic multi-view images. We use this data to fine-tune a ViTPose model, and test it on real multi-view trampoline images. The resulting model exhibits accuracy improvements in 2D, which translate to improved 3D triangulation. In 2D, we obtain state-of-the-art results on such challenging data, bridging the performance gap between common and extreme poses. In 3D, we reduce the MPJPE by 12.5 mm with our best model, which represents an improvement of 19.6% compared to the pretrained ViTPose model.
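For readers unfamiliar with the metric, the sketch below shows how MPJPE (mean per-joint position error) is commonly computed: the Euclidean distance between each predicted joint and its reference position, averaged over joints. It also works out what the two reported figures (12.5 mm and 19.6%) imply about the absolute errors, which the abstract does not state. The function names and the toy coordinates are illustrative assumptions, not the authors' code or data, and the derived values are approximate because they are computed from rounded numbers.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance between
    predicted and reference joints, in the same units as the inputs."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def implied_baseline(abs_reduction, rel_reduction):
    """Baseline error implied by an absolute reduction and the relative
    reduction it represents (hypothetical helper for this example)."""
    return abs_reduction / rel_reduction


# Toy 3D joints in millimetres (made-up values, not data from the paper).
gt = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (100.0, 200.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (100.0, 0.0, 12.0), (100.0, 200.0, 0.0)]
toy_error = mpjpe(pred, gt)  # per-joint errors are 5, 12 and 0 mm

# Absolute errors implied by the reported 12.5 mm / 19.6% reduction.
baseline = implied_baseline(12.5, 0.196)  # roughly 63.8 mm
fine_tuned = baseline - 12.5              # roughly 51.3 mm

print(f"toy MPJPE: {toy_error:.2f} mm")
print(f"implied baseline: {baseline:.1f} mm, fine-tuned: {fine_tuned:.1f} mm")
```

Under these assumptions, the reported improvement corresponds to a drop from roughly 63.8 mm to roughly 51.3 mm in 3D MPJPE.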

Paper Structure

This paper contains 17 sections, 9 figures, 1 table.

Figures (9)

  • Figure 1: Qualitative comparison of ViTPose-s without fine-tuning (Baseline) and with fine-tuning (Ours). Two of the 8 views are shown to illustrate the performance boost. The illustrated triangulations are obtained from all 8 views using Pose2Sim triangulation (Pagnon et al., 2022).
  • Figure 2: Markers-to-SMPL fitting pipeline. The illustrated frames are not consecutive in the full motion; they are shown as examples.
  • Figure 3: Distribution of distances between the observed markers and the synthetic ones placed on the SMPL avatar body. Marker labels correspond to anatomical landmarks.
  • Figure 4: Examples of synthetic images in STP (images are cropped for clarity)
  • Figure 5: Multi-view acquisition setup. A trampolinist in motion is represented as a skeleton at different positions.
  • ...and 4 more figures