ClimbingCap: Multi-Modal Dataset and Method for Rock Climbing in World Coordinate

Ming Yan, Xincheng Lin, Yuhua Luo, Shuqi Fan, Yudi Dai, Qixin Zhong, Lincai Zhong, Yuexin Ma, Lan Xu, Chenglu Wen, Siqi Shen, Cheng Wang

TL;DR

This work tackles the scarcity of global climbing motion data by introducing AscendMotion, a large-scale multi-modal dataset with synchronized RGB, LiDAR, and IMU/MoCap data from 22 climbers across 12 walls, including 344 minutes of labeled and 441 minutes of unlabeled sequences. It then presents ClimbingCap, a multimodal HMR framework that uses separate coordinate decoding, post-processing, and semi-supervised training to recover continuous 3D climbing motions in a world coordinate system. ClimbingCap uses RGB for camera-space poses and LiDAR for global translations, and enforces cross-coordinate consistency via the extrinsics. The authors report that ClimbingCap outperforms state-of-the-art methods on world-coordinate metrics, particularly in vertical climbing scenes, and generalizes well to CIMI4D. The dataset annotation pipeline combines automatic global optimization with manual verification using SMPL-based labeling tools, and relies on losses such as $\mathcal{L}_{GR}$ and $\mathcal{L}_{ST}$ to align pose and scene geometry. The code and data are released with the work to support further research in global climbing motion capture and analysis.

Abstract

Human Motion Recovery (HMR) research mainly focuses on ground-based motions such as running. Research on capturing climbing motion, an off-ground motion, is sparse. This is partly due to the limited availability of climbing motion datasets, especially large-scale and challenging 3D labeled datasets. To address the insufficiency of climbing motion datasets, we collect AscendMotion, a large-scale, well-annotated, and challenging climbing motion dataset. It consists of 412k RGB, LiDAR frames, and IMU measurements, including the challenging climbing motions of 22 skilled climbing coaches across 12 different rock walls. Capturing climbing motions is challenging because it requires precise recovery of not only the complex pose but also the global position of the climber. Although multiple global HMR methods have been proposed, they cannot faithfully capture climbing motions. To address the limitations of HMR methods for climbing, we propose ClimbingCap, a motion recovery method that reconstructs continuous 3D human climbing motion in a global coordinate system. One key insight is to use the RGB and LiDAR modalities to separately reconstruct motions in camera coordinates and global coordinates and to optimize them jointly. We demonstrate the quality of the AscendMotion dataset and present promising results from ClimbingCap. The AscendMotion dataset and source code are publicly released at http://www.lidarhumanmotion.net/climbingcap/
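
To make the relation between camera coordinates and global (world) coordinates concrete, the following is a minimal sketch of the standard rigid transform that extrinsics define, p_world = R · p_cam + t. It is not the authors' implementation. The function names `camera_to_world` and `recenter_on_root`, the choice of joint 0 as the root, and the idea of shifting the skeleton onto a LiDAR-derived root position are all illustrative assumptions.

```python
from typing import List, Sequence

Vec3 = Sequence[float]


def camera_to_world(
    joints_cam: Sequence[Vec3],
    rotation: Sequence[Vec3],
    translation: Vec3,
) -> List[List[float]]:
    """Apply p_world = R @ p_cam + t to every joint.

    `rotation` is a 3x3 camera-to-world rotation (row-major) and
    `translation` is the camera position in world coordinates.
    Both are hypothetical extrinsics used only for illustration.
    """
    joints_world = []
    for p in joints_cam:
        joints_world.append([
            sum(rotation[r][c] * p[c] for c in range(3)) + translation[r]
            for r in range(3)
        ])
    return joints_world


def recenter_on_root(
    joints_world: Sequence[Vec3],
    root_world: Vec3,
    root_index: int = 0,
) -> List[List[float]]:
    """Shift all joints so the root joint lands on `root_world`.

    Assumption: `root_world` stands in for a global translation
    estimated from another sensor (for example, LiDAR points).
    """
    offset = [root_world[i] - joints_world[root_index][i] for i in range(3)]
    return [[p[i] + offset[i] for i in range(3)] for p in joints_world]


# Example: a 90-degree rotation about the z-axis plus a translation.
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [1.0, 2.0, 3.0]
pose_cam = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]  # root joint and one other joint

pose_world = camera_to_world(pose_cam, R, t)
pose_global = recenter_on_root(pose_world, root_world=[10.0, 0.0, 5.0])
```

The split into two functions mirrors the idea of taking the articulated pose from one modality and the global position from another. A full method would optimize both jointly rather than overwrite the root position as done here.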

Paper Structure

This paper contains 17 sections, 3 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Overview. To address the challenging problem of global climbing motion recovery, we collect the AscendMotion dataset using LiDAR, an RGB camera, and an Inertial Measurement Unit (IMU) motion capture system, with accurate motion labels and global trajectories (the blue and orange human bodies on the right side of the figure represent labeled motions, and the orange curve represents the motion trajectory in world coordinates). We also propose ClimbingCap, a method for capturing global climbing motion in world coordinates. As shown in the left part of the figure, it uses both images and LiDAR point clouds to recover human motions.
  • Figure 2: Overview of ClimbingCap. The arrows indicate the three stages of the ClimbingCap framework: separate coordinate decoding (the green box), post-processing (the blue box), and semi-supervised training (the red box).
  • Figure 3: Dataset collection hardware system.
  • Figure 4: AscendMotion Annotation Pipeline. From left to right, the pipeline shows the three stages of dataset annotation: time-space synchronous input pre-processing (the blue box), multi-stage global optimization (the green box), and manual repair (the red box).
  • Figure 5: Qualitative evaluation on the AscendMotion and CIMI4D datasets. The left and right areas show results in camera coordinates and world coordinates, respectively. The red circles indicate obvious errors. The last row shows results on the CIMI4D dataset. ClimbingCap gives the best qualitative results in this comparison.