PointCNN++: Performant Convolution on Native Points
Lihan Li, Haofeng Zhong, Rui Bu, Mingchao Sun, Wenzheng Chen, Baoquan Chen, Yangyan Li
TL;DR
PointCNN++ introduces a native-point convolution that centers receptive fields on original high-precision points and uses a local voxelization only for kernel mapping, thereby preserving geometric fidelity while achieving high efficiency. The method reframes convolution as a Matrix-Vector Multiplication and Reduction (MVMR) problem and implements highly optimized GPU kernels (MVMR and VVOR) with zero intermediate memory, enabling substantial memory savings and faster training than voxel-based approaches. Empirical results show sub-voxel registration gains and strong memory/latency advantages across multiple GPUs, with KITTI and 3DMatch benchmarks demonstrating state-of-the-art or competitive performance when used as a backbone. The work also provides extensive supplementary material, including Triton implementations and cross-GPU scalability analyses, and promises open-source release to accelerate adoption.
Abstract
Existing convolutional learning methods for 3D point cloud data are divided into two paradigms: point-based methods that preserve geometric precision but often face performance challenges, and voxel-based methods that achieve high efficiency through quantization at the cost of geometric fidelity. This loss of precision is a critical bottleneck for tasks such as point cloud registration. We propose PointCNN++, a novel architectural design that fundamentally mitigates this precision-performance trade-off. It $\textbf{generalizes sparse convolution from voxels to points}$, treating voxel-based convolution as a specialized, degraded case of our more general point-based convolution. First, we introduce a point-centric convolution where the receptive field is centered on the original, high-precision point coordinates. Second, to make this high-fidelity operation performant, we design a computational strategy that operates $\textbf{natively}$ on points. We formulate the convolution on native points as a Matrix-Vector Multiplication and Reduction (MVMR) problem, for which we develop a dedicated, highly optimized GPU kernel. Experiments demonstrate that PointCNN++ $\textbf{uses an order of magnitude less memory and is several times faster}$ than representative point-based methods. Furthermore, when used as a simple replacement for the voxel-based backbones it generalizes, it $\textbf{significantly improves point cloud registration accuracy while being both more memory-efficient and faster}$. PointCNN++ shows that preserving geometric detail and achieving high performance are not mutually exclusive, paving the way for a new class of 3D learning with high fidelity and efficiency. Our code will be open sourced.
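To make the MVMR formulation concrete, the following is a minimal NumPy sketch, not the authors' GPU kernel. It assumes a brute-force neighbor search, a cubic kernel whose cells are selected by rounding each neighbor's offset from the exact center point (one possible reading of "local voxelization for kernel mapping"), and weights stored as one matrix per kernel cell. The names `build_triples` and `mvmr`, and the parameters `voxel_size` and `kernel_size`, are hypothetical and chosen only for illustration.

```python
import numpy as np

def build_triples(points, voxel_size, kernel_size=3):
    """Brute-force kernel mapping (illustrative only).

    Returns (out_idx, in_idx, kernel_idx): for every center point i and every
    input point j inside its receptive field, the kernel cell k that j hits.
    """
    half = kernel_size // 2
    # offsets[i, j] = points[j] - points[i]; centers stay at full precision.
    offsets = points[None, :, :] - points[:, None, :]
    # Local voxelization: quantize only the relative offset, not the points.
    cells = np.rint(offsets / voxel_size).astype(np.int64)
    inside = np.all(np.abs(cells) <= half, axis=-1)
    out_idx, in_idx = np.nonzero(inside)
    c = cells[out_idx, in_idx] + half  # shift each axis into [0, kernel_size)
    kernel_idx = (c[:, 0] * kernel_size + c[:, 1]) * kernel_size + c[:, 2]
    return out_idx, in_idx, kernel_idx

def mvmr(features, weights, out_idx, in_idx, kernel_idx, num_out):
    """out[i] = sum over triples (i, j, k) of weights[k] @ features[j].

    features: (N, C_in); weights: (kernel_size**3, C_out, C_in).
    """
    # One matrix-vector product per triple, shape (T, C_out).
    products = np.einsum("toc,tc->to", weights[kernel_idx], features[in_idx])
    out = np.zeros((num_out, weights.shape[1]), dtype=products.dtype)
    # Reduction: accumulate the products that share an output index.
    np.add.at(out, out_idx, products)
    return out
```

Unlike the zero-intermediate-memory kernels described in the TL;DR, this sketch materializes the gathered weights and per-triple products, so it illustrates the arithmetic only and says nothing about the memory behavior of the actual implementation.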
