PointNorm-Net: Self-Supervised Normal Prediction of 3D Point Clouds via Multi-Modal Distribution Estimation
Jie Zhang, Minghui Nie, Changqing Zou, Jian Liu, Ligang Liu, Junjie Cao
TL;DR
The paper tackles surface normal estimation on real-world 3D point clouds without ground-truth annotations, addressing the domain gap between synthetic training data and real scans. It introduces PointNorm-Net, a self-supervised framework that uses a three-stage multi-modal distribution estimation strategy to identify the major mode of candidate normals, enabling robust normal prediction even at sharp features. The method combines a patch-based normal predictor with a candidate-consensus training objective and custom losses, and it generalizes better than state-of-the-art conventional and deep learning approaches on real Kinect, LiDAR, and TLS datasets while remaining efficient at inference. Its ground-truth sampling theory and multi-sample consensus paradigm offer a general approach that can be integrated with optimization-based or learning-based normal estimation and extended to other self-supervised point-cloud tasks.
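To make the idea of picking the major mode of candidate normals concrete, here is a minimal sketch in Python. It is not the paper's three-stage procedure, whose details are not given in this summary. It assumes a simple voting scheme instead: each candidate counts how many other candidates lie within an angular threshold, treating n and -n as the same unoriented normal, and the supporters of the best-voted candidate are averaged. The function name `major_mode_normal` and the parameter `angle_thresh_deg` are hypothetical.

```python
import numpy as np

def major_mode_normal(candidates, angle_thresh_deg=15.0):
    """Pick the dominant direction among candidate normals by consensus voting.

    Illustrative assumption only, not the authors' algorithm.
    candidates: (K, 3) array of nonzero vectors; n and -n are treated as equal.
    Returns (unit normal, number of supporting candidates).
    """
    c = np.asarray(candidates, dtype=float)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)  # normalize each row

    # Two candidates agree if their unsigned angle is below the threshold.
    cos_thresh = np.cos(np.deg2rad(angle_thresh_deg))
    agree = np.abs(c @ c.T) >= cos_thresh  # (K, K) boolean matrix

    # The candidate with the most supporters marks the major mode.
    votes = agree.sum(axis=1)
    best = int(np.argmax(votes))

    # Flip supporters to the same side as the winner, then average them.
    supporters = c[agree[best]]
    signs = np.sign(supporters @ c[best])
    mean = (supporters * signs[:, None]).sum(axis=0)
    return mean / np.linalg.norm(mean), int(votes[best])
```

With candidates drawn from two surfaces meeting at a sharp edge, the larger cluster wins rather than the average of both, which is the behavior a mode-seeking strategy is meant to provide near sharp features.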
Abstract
Although supervised deep normal estimators have recently shown impressive results on synthetic benchmarks, their performance deteriorates significantly in real-world scenarios due to the domain gap between synthetic and real data. Building high-quality real training data to boost those supervised methods is not trivial because point-wise annotation of normals for varying-scale real-world 3D scenes is a tedious and expensive task. This paper introduces PointNorm-Net, the first self-supervised deep learning framework to tackle this challenge. The key novelty of PointNorm-Net is a three-stage multi-modal normal distribution estimation paradigm that can be integrated into either deep or traditional optimization-based normal estimation frameworks. Extensive experiments show that our method achieves superior generalization and outperforms state-of-the-art conventional and deep learning approaches across three real-world datasets whose characteristics differ markedly from those of the synthetic training data.
