PointInfinity: Resolution-Invariant Point Diffusion Models
Zixuan Huang, Justin Johnson, Shoubhik Debnath, James M. Rehg, Chao-Yuan Wu
TL;DR
PointInfinity is a resolution-invariant diffusion framework for RGB-D point clouds that trains at low resolution yet generates high-resolution outputs at test time. The core idea is a two-stream transformer that pairs a fixed-size latent surface representation with a variable-sized data stream, which keeps training efficient and allows high-resolution generation without upsampling modules. Scaling the test-time resolution beyond the training resolution improves surface fidelity, an effect the authors link to classifier-free guidance: both trade variability for fidelity during inference. On CO3D, the method achieves state-of-the-art results with outputs of up to 131k points while keeping compute and memory costs favorable.
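To make the two-stream idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the latent stream "reads" from the point tokens and "writes" back to them via single-head cross-attention; the function names, the sizes `D` and `M`, and the random weights are all hypothetical. The point it illustrates is that every weight matrix has a shape independent of the number of points, so the same parameters can process 64 or 1024 points.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, context, wq, wk, wv):
    # Single-head cross-attention: each query token attends over all context tokens.
    q, k, v = queries @ wq, context @ wk, context @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return attn @ v

rng = np.random.default_rng(0)
D = 16  # token width (hypothetical)
M = 8   # number of latent tokens, fixed regardless of resolution (hypothetical)

# All parameters are (D, D) or (M, D): none depends on the number of points.
read_w = [rng.normal(scale=D ** -0.5, size=(D, D)) for _ in range(3)]
write_w = [rng.normal(scale=D ** -0.5, size=(D, D)) for _ in range(3)]
latent_init = rng.normal(size=(M, D))

def two_stream_block(point_tokens):
    # Read: the fixed-size latent gathers information from N point tokens.
    z = latent_init + cross_attend(latent_init, point_tokens, *read_w)
    # (A real model would process z further here; omitted in this sketch.)
    # Write: each point token is updated from the fixed-size latent.
    x = point_tokens + cross_attend(point_tokens, z, *write_w)
    return z, x

for n in (64, 1024):  # a low and a higher resolution, same weights
    z, x = two_stream_block(rng.normal(size=(n, D)))
    print(n, z.shape, x.shape)
```

In this sketch the latent keeps shape `(M, D)` while the data stream follows the input size, which is the property that allows the training and test resolutions to differ.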
Abstract
We present PointInfinity, an efficient family of point cloud diffusion models. Our core idea is to use a transformer-based architecture with a fixed-size, resolution-invariant latent representation. This enables efficient training with low-resolution point clouds, while allowing high-resolution point clouds to be generated during inference. More importantly, we show that scaling the test-time resolution beyond the training resolution improves the fidelity of generated point clouds and surfaces. We analyze this phenomenon and draw a link to classifier-free guidance commonly used in diffusion models, demonstrating that both allow trading off fidelity and variability during inference. Experiments on CO3D show that PointInfinity can efficiently generate high-resolution point clouds (up to 131k points, 31 times more than Point-E) with state-of-the-art quality.
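For readers unfamiliar with classifier-free guidance, its commonly used form is shown below. The notation is the standard one from the guidance literature and is an assumption here, not taken from this abstract: $\epsilon_\theta$ is the denoiser, $x_t$ the noisy sample at step $t$, $c$ the conditioning signal, $\varnothing$ the null condition, and $w \ge 0$ the guidance scale.

```latex
\hat{\epsilon}_\theta(x_t, c) = (1 + w)\,\epsilon_\theta(x_t, c) - w\,\epsilon_\theta(x_t, \varnothing)
```

Setting $w = 0$ recovers the plain conditional prediction; increasing $w$ typically raises fidelity to the condition at the cost of sample variability, which is the trade-off the abstract compares to test-time resolution scaling.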
