
HumanReg: Self-supervised Non-rigid Registration of Human Point Cloud

Yifan Chen, Zhiyu Pan, Zhicheng Zhong, Wenxuan Guo, Jianjiang Feng, Jie Zhou

TL;DR

HumanReg tackles non-rigid registration of sparse outdoor human point clouds by jointly estimating per-point scene flow and body-part segmentation, guided by a body prior. The method is pretrained on a new synthetic dataset, HumanSyn4D, and then finetuned on real data with a self-supervised loss comprising Chamfer, smoothness, clustering, and part-rigid terms, so it can learn without dense ground-truth annotations. The key innovations are the part-rigid loss, which regularizes each body-part warp to be near-rigid, and a soft correspondence mechanism that leverages body-part-aware features. Empirically, HumanReg achieves state-of-the-art results on CAPE-512 and qualitative improvements on BasketballPlayer, with ablations confirming the benefit of synthetic pretraining and the proposed losses.
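
To make the Chamfer term of the self-supervised loss concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the use of squared distances, and the equal weighting of the two directions are assumptions made for illustration, and the exact formulation in the paper may differ.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumption: squared Euclidean distances, both directions weighted equally.
    """
    # Pairwise squared distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    # Mean nearest-neighbour distance from a to b, plus from b to a.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())


def chamfer_flow_loss(p: np.ndarray, flow: np.ndarray, q: np.ndarray) -> float:
    """Chamfer distance between the warped source (p + flow) and the target q."""
    return chamfer_distance(p + flow, q)
```

A flow that moves the source exactly onto the target gives a loss of zero, so this term needs no flow annotations. On its own it does not constrain how individual points move, which is presumably why it is combined with the smoothness, clustering, and part-rigid terms.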

Abstract

In this paper, we present a novel registration framework, HumanReg, that learns a non-rigid transformation between two human point clouds end-to-end. We introduce a body prior into the registration process to efficiently handle this type of point cloud. Unlike most existing supervised registration techniques, which require expensive point-wise flow annotations, HumanReg can be trained in a self-supervised manner thanks to a set of novel loss functions. To help our model converge better on real-world data, we also propose a pretraining strategy and a synthetic dataset (HumanSyn4D) consisting of dynamic, sparse human point clouds and their auto-generated ground-truth annotations. Our experiments show that HumanReg achieves state-of-the-art performance on the CAPE-512 dataset and shows qualitative improvements on another, more challenging real-world dataset. Furthermore, our ablation studies demonstrate the effectiveness of our synthetic dataset and novel loss functions. Our code and synthetic dataset are available at https://github.com/chenyifanthu/HumanReg.

Paper Structure (16 sections, 16 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: HumanReg overview. The proposed HumanReg framework takes a pair of human point clouds as input and simultaneously estimates the body-part segmentation of each point cloud and the scene flow between them. HumanReg can be pretrained on our synthetic dataset using ground-truth annotations, then adapted to unlabeled real-world data with our proposed self-supervised loss.
  • Figure 2: Training pipeline of our proposed method. Given the input human point clouds $\mathbf{P}$ and $\mathbf{Q}$, the 3D ResUNet backbone extracts per-point features, which are then processed by a segmentation head and a correspondence head (Sec. \ref{sec:model-architecture}). The two heads simultaneously output the body-part segmentation of each point cloud and the soft correspondence between them. Our model is first pretrained on the synthetic dataset with ground-truth labels and flow (Sec. \ref{sec:supervised-pretraining}). Then, when finetuning on real-world data, a set of self-supervised loss functions (Sec. \ref{sec:self-supervised-loss}) is applied to the estimates of both heads.
  • Figure 3: Rigid fitting for body part. We assume that the warp field of each body part is close to a rigid transformation. This assumption is used to design our part-rigid loss and refine flow during test time.
  • Figure 4: A snapshot of HumanSyn4D. Top Left: Top view of the synthetic scene, showing the simulated LiDAR and the boundaries of the field (green lines). Bottom Left: Scanned human point clouds in one frame. Top Right: Ground-truth template mesh vertices. Bottom Right: Ground-truth human pose of each person.
  • Figure 5: Histogram of numbers of points in HumanSyn4D and BasketballPlayer.
  • ...and 2 more figures
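
The rigid-fitting idea in Figure 3 can be sketched as follows. This is only an illustrative sketch: it uses the standard Kabsch (SVD-based) least-squares fit and hard per-point part labels, both of which are assumptions. The paper's part-rigid loss may instead weight points by soft segmentation scores, and the names `fit_rigid` and `part_rigid_residual` are hypothetical.

```python
import numpy as np


def fit_rigid(src: np.ndarray, dst: np.ndarray):
    """Least-squares rigid transform (R, t) such that dst ~= src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    # 3x3 cross-covariance of the centred point sets.
    h = (src - src_c).T @ (dst - dst_c)
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = dst_c - r @ src_c
    return r, t


def part_rigid_residual(points: np.ndarray, flow: np.ndarray,
                        labels: np.ndarray) -> float:
    """Mean squared deviation of the flow from the best rigid motion per part.

    Assumption: `labels` holds one hard body-part index per point.
    """
    total = 0.0
    for k in np.unique(labels):
        mask = labels == k
        if mask.sum() < 3:  # too few points for a stable rigid fit
            continue
        src = points[mask]
        r, t = fit_rigid(src, src + flow[mask])
        rigid_flow = src @ r.T + t - src
        total += float(np.sum((flow[mask] - rigid_flow) ** 2))
    return total / len(points)
```

The residual is zero when every part moves rigidly and grows as the flow within a part deviates from a single rotation and translation. Replacing `flow[mask]` with `rigid_flow` is one simple way to refine the flow at test time under the same assumption.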