Hierarchical Direction Perception via Atomic Dot-Product Operators for Rotation-Invariant Point Clouds Learning
Chenyu Hu, Xiaotong Li, Hao Zhu, Biao Hou
TL;DR
This work tackles the sensitivity of point-cloud representations to arbitrary 3D rotations by introducing DiPVNet, a direction-aware framework built on atomic dot-product operators. It jointly learns local directional cues via Learnable Local Dot-Product (L2DP) and global directional structure through the Direction-Aware Spherical Fourier Transform (DASFT), with cross-attention fusing invariant and canonical-projected equivariant features. The approach provides rotation invariance and adaptive directional perception across multiple scales, validated by state-of-the-art results on rotation-robust classification and segmentation benchmarks. Its combination of local invariants and global spectral cues offers a practical pathway to robust 3D perception under challenging pose variations, with potential applicability beyond point clouds to other geometric data modalities.
Abstract
Point cloud processing has become a cornerstone technology in many 3D vision tasks. However, arbitrary rotations introduce variations in point cloud orientations, posing a long-standing challenge for effective representation learning. The core of this issue is the disruption of the point cloud's intrinsic directional characteristics caused by rotational perturbations. Recent methods attempt to implicitly model rotational equivariance and invariance, preserving directional information and propagating it into deep semantic spaces. Yet, they often fall short of fully exploiting the multiscale directional nature of point clouds to enhance feature representations. To address this, we propose the Direction-Perceptive Vector Network (DiPVNet). At its core is an atomic dot-product operator that simultaneously encodes directional selectivity and rotation invariance, endowing the network with both rotational symmetry modeling and adaptive directional perception. At the local level, we introduce a Learnable Local Dot-Product (L2DP) Operator, which enables interactions between a center point and its neighbors to adaptively capture the non-uniform local structures of point clouds. At the global level, we leverage generalized harmonic analysis to prove that the dot product between point clouds and spherical sampling vectors is equivalent to a direction-aware spherical Fourier transform (DASFT). This leads to the construction of a global directional response spectrum for modeling holistic directional structures. We rigorously prove the rotation invariance of both operators. Extensive experiments on challenging scenarios involving noise and large-angle rotations demonstrate that DiPVNet achieves state-of-the-art performance on point cloud classification and segmentation tasks. Our code is available at https://github.com/wxszreal0/DiPVNet.
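The role of the dot product as the atomic operator can be illustrated with a small numerical check. The sketch below is not the authors' L2DP or DASFT implementation. It only verifies two elementary identities that such operators can build on: pairwise dot products between center-to-neighbor offsets are unchanged when the whole neighborhood is rotated, and rotating a point before taking its dot product with a sampling direction is the same as rotating that direction by the inverse rotation. All names here (rotation_matrix, local_gram, the random test data) are hypothetical and chosen for illustration.

```python
import math
import random


def rotation_matrix(axis, angle):
    """Rodrigues' formula: 3x3 rotation about a unit axis by angle (radians)."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]


def apply(R, v):
    """Matrix-vector product R @ v for a 3x3 matrix and a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(R):
    return [[R[j][i] for j in range(3)] for i in range(3)]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def local_gram(center, neighbors):
    """Pairwise dot products of neighbor offsets measured from the center point."""
    offsets = [[n[i] - center[i] for i in range(3)] for n in neighbors]
    return [[dot(u, v) for v in offsets] for u in offsets]


# Hypothetical toy data: one center point with 8 neighbors.
rng = random.Random(0)
center = [rng.uniform(-1, 1) for _ in range(3)]
neighbors = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]

# An arbitrary large-angle rotation about a normalized axis.
axis = [1.0, 2.0, 2.0]
norm = math.sqrt(dot(axis, axis))
axis = [a / norm for a in axis]
R = rotation_matrix(axis, 2.5)

# Local check: the Gram matrix of offsets is the same before and after rotation.
gram_before = local_gram(center, neighbors)
gram_after = local_gram(apply(R, center), [apply(R, n) for n in neighbors])
gram_error = max(
    abs(a - b)
    for row_a, row_b in zip(gram_before, gram_after)
    for a, b in zip(row_a, row_b)
)

# Global check: <R p, d> equals <p, R^T d> for a sampling direction d,
# so rotating the cloud only moves where each response sits on the sphere.
direction = [0.0, 0.0, 1.0]
R_inv = transpose(R)
shift_error = max(
    abs(dot(apply(R, p), direction) - dot(p, apply(R_inv, direction)))
    for p in neighbors
)

print(f"local Gram error:  {gram_error:.2e}")
print(f"global shift error: {shift_error:.2e}")
```

Both errors should sit at floating-point round-off level. How the paper turns these identities into learnable local operators and a global response spectrum is described by the authors' proofs and code, not by this sketch.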
