FrePolad: Frequency-Rectified Point Latent Diffusion for Point Cloud Generation
Chenliang Zhou, Fangcheng Zhong, Param Hanji, Zhilin Guo, Kyle Fogarty, Alejandro Sztrajman, Hongyun Gao, Cengiz Oztireli
TL;DR
FrePolad targets point cloud generation that is high-quality, diverse, flexible in cardinality, and computationally efficient. It combines a variational autoencoder (VAE) with a denoising diffusion probabilistic model (DDPM) that serves as the prior over the VAE's latent space, and it introduces a frequency-rectification mechanism based on spherical harmonics to preserve high-frequency details during training. A continuous normalizing flow decoder and a two-stage training regime allow point clouds of arbitrary size to be generated, because each shape is modeled as a distribution over points conditioned on a latent shape code. Empirical results on ShapeNet show state-of-the-art quality and diversity with favorable computational efficiency, and ablations confirm that both frequency rectification and latent diffusion contribute materially to performance.
Abstract
We propose FrePolad: frequency-rectified point latent diffusion, a point cloud generation pipeline integrating a variational autoencoder (VAE) with a denoising diffusion probabilistic model (DDPM) for the latent distribution. FrePolad simultaneously achieves high quality, diversity, and flexibility in point cloud cardinality for generation tasks while maintaining high computational efficiency. The improvement in generation quality and diversity is achieved through (1) a novel frequency rectification via spherical harmonics designed to retain high-frequency content while learning the point cloud distribution; and (2) a latent DDPM to learn the regularized yet complex latent distribution. In addition, FrePolad supports variable point cloud cardinality by formulating the sampling of points as conditional distributions over a latent shape distribution. Finally, the low-dimensional latent space encoded by the VAE contributes to FrePolad's fast and scalable sampling. Our quantitative and qualitative results demonstrate FrePolad's state-of-the-art performance in terms of quality, diversity, and computational efficiency. Project page: https://chenliang-zhou.github.io/FrePolad/.
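The variable-cardinality idea in the abstract can be illustrated with a toy sketch: draw one latent shape code, then draw as many points as desired from a distribution conditioned on that code. This is not the authors' implementation. The standard Gaussian prior stands in for the latent DDPM, a single affine transform of Gaussian noise stands in for the continuous normalizing flow decoder, and the names `sample_latent`, `decode_points`, and `generate`, along with the way the shift and scale are read from `z`, are hypothetical choices made only for illustration.

```python
import math
import random


def sample_latent(dim, rng):
    """Stand-in prior: z ~ N(0, I). (Assumption: FrePolad uses a latent DDPM here.)"""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def decode_points(z, n_points, rng):
    """Toy conditional decoder: draw n_points i.i.d. 3D points given the shape code z.

    Each point is an affine transform of Gaussian noise, a trivial stand-in
    for a conditional normalizing flow. The mapping from z to shift and scale
    is hypothetical.
    """
    shift = [math.tanh(z[i]) for i in range(3)]
    scale = [math.exp(0.5 * math.tanh(z[3 + i])) for i in range(3)]
    points = []
    for _ in range(n_points):
        eps = [rng.gauss(0.0, 1.0) for _ in range(3)]
        points.append(tuple(shift[i] + scale[i] * eps[i] for i in range(3)))
    return points


def generate(n_points, z=None, latent_dim=6, seed=0):
    """Sample a shape code (unless one is given), then sample n_points from it."""
    rng = random.Random(seed)
    if z is None:
        z = sample_latent(latent_dim, rng)
    return z, decode_points(z, n_points, rng)


# The same shape code yields point clouds of different cardinality.
z, coarse = generate(256)
_, dense = generate(2048, z=z, seed=1)
```

Because the points are sampled independently given `z`, the number of points is a free choice at sampling time rather than a property fixed by the model.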
