
Sparkle: A Robust and Versatile Representation for Point Cloud-based Human Motion Capture

Yiming Ren, Yujing Sun, Aoru Xue, Kwok-Yan Lam, Yuexin Ma

Abstract

Point cloud-based motion capture leverages rich spatial geometry and privacy-preserving sensing, but learning robust representations from noisy, unstructured point clouds remains challenging. Existing approaches face a difficult trade-off between point-based methods (geometrically detailed but noise-sensitive) and skeleton-based ones (robust but oversimplified). We address the fundamental challenge of constructing an effective representation for human motion capture that balances expressiveness and robustness. In this paper, we propose Sparkle, a structured representation unifying skeletal joints and surface anchors through explicit kinematic-geometric factorization. Our framework, SparkleMotion, learns this representation through hierarchical modules that embed geometric continuity and kinematic constraints. By explicitly disentangling internal kinematic structure from external surface geometry, SparkleMotion achieves state-of-the-art performance not only in accuracy but, crucially, in robustness and generalization under severe domain shifts, noise, and occlusion. Extensive experiments demonstrate the superiority of our method across diverse sensor types and challenging real-world scenarios.


Paper Structure

This paper contains 38 sections, 21 equations, 6 figures, and 12 tables.

Figures (6)

  • Figure 2: The pipeline of SparkleMotion. It can take point clouds of diverse patterns as input across different challenging scenarios, as shown on the left. SparkleMotion consists of three primary modules: the Point-aligned Skeleton Tracker and the Skeleton-guided Anchor Estimator, which together construct the Sparkle Representation, and the Sparkle-based SMPL Solver for motion reconstruction.
  • Figure 3: Qualitative comparisons. Because point clouds carry natural spatial information, we display both the point clouds and the global human mesh simultaneously to reflect accuracy. For multi-view MoCap, we present point clouds from three perspectives together with the fused result from all views. The red rounded rectangle marks where other methods fail.
  • Figure 4: The details of Point-aligned Skeleton Tracker.
  • Figure 5: The details of Skeleton-guided Anchor Estimator.
  • Figure 6: The pipeline of the multi-view SparkleMotion. Our method can take point clouds from any perspective as input to obtain optimized results, demonstrating the powerful scalability of Sparkle.
  • ...and 1 more figure
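The three-module pipeline described in the Figure 2 caption can be sketched as a simple data flow. The class and function names below mirror the module names from the paper, but every signature, parameter (e.g. `num_joints=24`, `anchors_per_joint=4`), and computation inside is a hypothetical placeholder, not the authors' implementation:

```python
import numpy as np

class PointAlignedSkeletonTracker:
    """Estimates skeletal joints (internal kinematic structure) from points."""
    def __init__(self, num_joints=24):
        self.num_joints = num_joints

    def __call__(self, points):
        # Placeholder: the real module is a learned network; here we just
        # tile the point-cloud centroid once per joint.
        centroid = points.mean(axis=0)
        return np.tile(centroid, (self.num_joints, 1))

class SkeletonGuidedAnchorEstimator:
    """Predicts surface anchors (external geometry) conditioned on joints."""
    def __init__(self, anchors_per_joint=4):
        self.anchors_per_joint = anchors_per_joint

    def __call__(self, points, joints):
        # Placeholder: offset each joint to produce a fixed set of anchors.
        offsets = np.linspace(-0.1, 0.1, self.anchors_per_joint)
        return np.concatenate([joints + o for o in offsets], axis=0)

class SparkleSMPLSolver:
    """Solves for SMPL pose/shape from the Sparkle representation."""
    def __call__(self, joints, anchors):
        # Placeholder: return zeros with the standard SMPL parameter sizes
        # (72-dim axis-angle pose, 10-dim shape).
        return {"pose": np.zeros(72), "shape": np.zeros(10)}

def sparkle_motion(points):
    """Run the three modules in sequence, mirroring Figure 2."""
    joints = PointAlignedSkeletonTracker()(points)             # kinematics
    anchors = SkeletonGuidedAnchorEstimator()(points, joints)  # geometry
    smpl = SparkleSMPLSolver()(joints, anchors)                # reconstruction
    return joints, anchors, smpl
```

The key structural point this sketch captures is the factorization the abstract emphasizes: joints (kinematics) are estimated first, anchors (surface geometry) are conditioned on them, and only then is the SMPL body solved from the combined Sparkle representation.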