Fourier Decomposition for Explicit Representation of 3D Point Cloud Attributes
Donghyun Kim, Hyunah Ko, Chanyoung Kim, Seong Jae Hwang
TL;DR
This paper addresses the modeling of colored 3D point clouds by introducing a Fourier-based encoding that disentangles color and geometry: a 3D DFT is applied to a voxelized grid with occupancy $\pi$, yielding an amplitude $\mathcal{A}$ and a phase $\mathcal{P}$. Operating in the spectral domain gives the method a large receptive field and lets color and geometry be learned independently, which supports point cloud classification, style transfer, and a simple data augmentation strategy based on amplitude swapping. The authors evaluate the approach on the DensePoint dataset, reporting state-of-the-art classification results (overall accuracy, OA, up to $98.43\%$) and qualitative style-transfer results, supported by ablations. The work contributes a geometry-color disentanglement framework for colored point clouds and demonstrates it on multiple 3D vision tasks, with potential applications beyond recognition.
Abstract
While 3D point clouds are widely utilized across various vision applications, their irregular and sparse nature makes them challenging to handle. In response, numerous encoding approaches have been proposed to capture the rich semantic information of point clouds. Yet a critical limitation persists: a lack of consideration for colored point clouds, which are more capable 3D representations because they contain two distinct attributes, color and geometry. Existing methods handle these attributes separately on a per-point basis, which leads to a limited receptive field and a restricted ability to capture relationships across multiple points. To address this, we pioneer a point cloud encoding methodology that leverages 3D Fourier decomposition to disentangle color and geometric features while extending the receptive field through spectral-domain operations. Our analysis confirms that this encoding approach effectively separates feature components: the amplitude uniquely captures color attributes and the phase encodes geometric structure, thereby enabling independent learning and utilization of both attributes. Furthermore, the spectral-domain properties of these components naturally aggregate local features while taking information from multiple points into account. We validate our point cloud encoding approach on point cloud classification and style transfer tasks, achieving state-of-the-art results on the DensePoint dataset, with further improvements from a proposed amplitude-based data augmentation strategy.
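The amplitude/phase split and the amplitude-swapping idea described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`voxelize`, `decompose`, `recompose`, `amplitude_swap`), the grid size, the per-voxel color averaging, and the per-channel DFT are all assumptions made for illustration. The abstract does not specify how the voxel grid is built or how occupancy enters the transform.

```python
import numpy as np


def voxelize(points, colors, grid=16):
    """Scatter colored points into a dense voxel grid (hypothetical helper).

    points: (N, 3) coordinates assumed to lie in [0, 1).
    colors: (N, 3) RGB values assumed to lie in [0, 1].
    Returns a (grid, grid, grid, 3) mean-color volume and a boolean
    (grid, grid, grid) occupancy volume.
    """
    idx = np.clip((points * grid).astype(int), 0, grid - 1)
    ix, iy, iz = idx[:, 0], idx[:, 1], idx[:, 2]
    color_sum = np.zeros((grid, grid, grid, 3))
    count = np.zeros((grid, grid, grid))
    # np.add.at accumulates correctly when several points share a voxel.
    np.add.at(color_sum, (ix, iy, iz), colors)
    np.add.at(count, (ix, iy, iz), 1.0)
    occupancy = count > 0
    volume = np.zeros_like(color_sum)
    volume[occupancy] = color_sum[occupancy] / count[occupancy][:, None]
    return volume, occupancy


def decompose(volume):
    """3D DFT over the spatial axes, applied to each color channel."""
    spectrum = np.fft.fftn(volume, axes=(0, 1, 2))
    return np.abs(spectrum), np.angle(spectrum)


def recompose(amplitude, phase):
    """Inverse of decompose: rebuild the spatial volume from A and P."""
    spectrum = amplitude * np.exp(1j * phase)
    return np.fft.ifftn(spectrum, axes=(0, 1, 2)).real


def amplitude_swap(volume_a, volume_b):
    """Combine the phase of volume_a with the amplitude of volume_b.

    Assumption: this mirrors the amplitude-swapping augmentation only
    in spirit; the paper's exact procedure may differ.
    """
    _, phase_a = decompose(volume_a)
    amp_b, _ = decompose(volume_b)
    return recompose(amp_b, phase_a)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vol_a, _ = voxelize(rng.random((500, 3)), rng.random((500, 3)))
    vol_b, _ = voxelize(rng.random((500, 3)), rng.random((500, 3)))
    mixed = amplitude_swap(vol_a, vol_b)
    print(mixed.shape)
```

Because every DFT coefficient is a sum over all voxels, any operation applied to the amplitude or phase touches the whole grid at once, which is one way to read the "large receptive field" claim. Under the paper's reported finding that amplitude carries color and phase carries geometry, `amplitude_swap(vol_a, vol_b)` would be expected to keep the shape of `vol_a` while borrowing appearance from `vol_b`; the sketch does not verify that claim.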
