SelfGeo: Self-supervised and Geodesic-consistent Estimation of Keypoints on Deformable Shapes
Mohammad Zohaib, Luca Cosmo, Alessio Del Bue
TL;DR
SelfGeo tackles unsupervised 3D keypoint estimation on deformable shapes by learning persistent, semantically meaningful keypoints from unlabelled point cloud (PCD) sequences. It introduces two complementary loss families: a Shape loss with reconstruction, coverage, and surface terms, and a Deformation loss with geodesic-distance preservation and temporal smoothing, which together let keypoints move with deformations while staying on the surface. The method uses a PointNet++ backbone to predict per-point keypoint distributions and reconstructs the shape, achieving superior performance on CAPE, ITOP, and Deforming Things 4D, even under noisy or downsampled data. The approach yields stable, interpretable keypoints suitable for skeleton-like representations in AR/VR and robotics without requiring ground-truth annotations. Geodesic estimation noise and shape symmetry remain challenging areas for refinement.
Abstract
Unsupervised 3D keypoint estimation from Point Cloud Data (PCD) is a complex task, and it is even more challenging when the object shape is deforming, because keypoints should be semantically and geometrically consistent across all the 3D frames: each keypoint should be anchored to a specific part of the deforming shape irrespective of intrinsic and extrinsic motion. This paper presents "SelfGeo", a self-supervised method that computes persistent 3D keypoints of non-rigid objects from arbitrary PCDs without the need for human annotations. The gist of SelfGeo is to estimate keypoints between frames that respect invariant properties of deforming bodies. Our main contribution is to enforce that keypoints deform along with the shape while keeping the geodesic distances among them constant. This principle is then propagated to the design of a set of losses whose minimization lets repeatable keypoints emerge at specific semantic locations of the non-rigid shape. We show experimentally that the use of geodesic distances has a clear advantage in challenging dynamic scenes and with different classes of deforming shapes (humans and animals). Code and data are available at: https://github.com/IIT-PAVIS/SelfGeo
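
To make the geodesic-consistency principle concrete, the following is a minimal sketch and not the authors' implementation. It assumes geodesic distances can be approximated by shortest paths on a k-nearest-neighbour graph of the point cloud (the abstract does not say how they are computed), and it treats keypoints as hard point indices rather than the soft, network-predicted keypoints the method learns. All function names (`knn_graph`, `geodesic_from`, `keypoint_geodesics`, `geodesic_consistency_loss`) are hypothetical.

```python
import heapq
import math


def knn_graph(points, k):
    """Undirected graph linking each point to its k nearest neighbours,
    with Euclidean edge lengths (assumed stand-in for the surface)."""
    graph = [dict() for _ in points]
    for i, p in enumerate(points):
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )[:k]
        for d, j in nearest:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_from(graph, source):
    """Dijkstra shortest-path lengths from `source`; a graph-based
    approximation of geodesic distance."""
    dist = [math.inf] * len(graph)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def keypoint_geodesics(points, keypoint_ids, k=2):
    """Pairwise approximate geodesic distances between keypoints of one frame."""
    graph = knn_graph(points, k)
    rows = [geodesic_from(graph, s) for s in keypoint_ids]
    return [[row[t] for t in keypoint_ids] for row in rows]


def geodesic_consistency_loss(frame_a, ids_a, frame_b, ids_b, k=2):
    """Mean absolute change in keypoint-to-keypoint geodesic distance
    between two frames; zero when the distances are preserved."""
    ga = keypoint_geodesics(frame_a, ids_a, k)
    gb = keypoint_geodesics(frame_b, ids_b, k)
    m = len(ids_a)
    pairs = [(i, j) for i in range(m) for j in range(i + 1, m)]
    return sum(abs(ga[i][j] - gb[i][j]) for i, j in pairs) / len(pairs)


# Toy "limb": a straight chain of points, then the same chain bent by 90 degrees.
straight = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)]
bent = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (2, 2, 0)]

# Keypoints anchored at both ends in both frames: geodesics are preserved.
loss_anchored = geodesic_consistency_loss(straight, [0, 4], bent, [0, 4])
# Second keypoint drifts one point along the chain in the bent frame.
loss_drifted = geodesic_consistency_loss(straight, [0, 4], bent, [0, 3])
```

In this toy case the Euclidean distance between the chain ends shrinks when the chain bends, while the graph geodesic does not, so `loss_anchored` is zero and `loss_drifted` is positive. This illustrates why a geodesic term can keep a keypoint anchored to the same part of a deforming shape where a Euclidean term would not. A trainable version would need a differentiable formulation over predicted keypoint positions, which this sketch does not attempt.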
