MFM-Point: Multi-scale Flow Matching for Point Cloud Generation
Petr Molodyk, Jaemoo Choi, David W. Romero, Ming-Yu Liu, Yongxin Chen
TL;DR
MFM-Point introduces a scalable multi-scale Flow Matching (FM) framework for point cloud generation that extends point-based methods to high-resolution and multi-category tasks. By designing geometry-preserving downsampling and distribution-aligned upsampling, and by enforcing cross-stage alignment with a principled FM objective, the method achieves best-in-class performance among point-based models and results competitive with representation-based approaches. The two-stage coarse-to-fine generation uses an independent flow model at each resolution, enabling efficient training and fast inference. Extensive ablations demonstrate the importance of the geometry-aware operators and the transition-time boundaries, while conditional generation experiments highlight the framework’s flexibility. Overall, MFM-Point improves the scalability and fidelity of point cloud synthesis, with practical relevance to 3D modeling and robotics.
Abstract
In recent years, point cloud generation has gained significant attention in 3D generative modeling. Among existing approaches, point-based methods generate point clouds directly, without relying on other representations such as latent features, meshes, or voxels. These methods offer low training cost and algorithmic simplicity, but often underperform representation-based approaches. In this paper, we propose MFM-Point, a multi-scale Flow Matching framework for point cloud generation that substantially improves the scalability and performance of point-based methods while preserving their simplicity and efficiency. Our multi-scale generation algorithm adopts a coarse-to-fine paradigm, enhancing generation quality and scalability without incurring additional training or inference overhead. A key challenge in developing such a multi-scale framework lies in preserving the geometric structure of unordered point clouds while ensuring smooth and consistent distributional transitions across resolutions. To address this, we introduce a structured downsampling and upsampling strategy that preserves geometry and maintains alignment between coarse and fine resolutions. Our experimental results demonstrate that MFM-Point achieves best-in-class performance among point-based methods and is competitive with the best representation-based methods. In particular, MFM-Point demonstrates strong results in multi-category and high-resolution generation tasks.
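To make the coarse-to-fine idea concrete, the following is a minimal sketch of two-stage sampling with one velocity field per resolution, under stated assumptions. It is not the paper's implementation: the function names, the Euler integrator, the switch time `t_switch`, and in particular the upsampler (repeat each coarse point and add Gaussian jitter) are hypothetical stand-ins, since the abstract does not specify the actual downsampling and upsampling operators.

```python
import numpy as np

def euler_integrate(velocity, x, t_start, t_end, steps):
    """Integrate dx/dt = velocity(x, t) from t_start to t_end with fixed Euler steps."""
    dt = (t_end - t_start) / steps
    t = t_start
    for _ in range(steps):
        x = x + dt * velocity(x, t)
        t += dt
    return x

def upsample_with_jitter(coarse, factor, noise_scale, rng):
    """Hypothetical upsampler (an assumption, not the paper's operator):
    repeat every coarse point `factor` times and add Gaussian jitter."""
    fine = np.repeat(coarse, factor, axis=0)
    return fine + noise_scale * rng.standard_normal(fine.shape)

def sample_two_stage(v_coarse, v_fine, n_coarse, factor, t_switch, steps, rng):
    """Coarse stage on [0, t_switch], then upsample, then fine stage on [t_switch, 1]."""
    x = rng.standard_normal((n_coarse, 3))            # coarse noise cloud
    x = euler_integrate(v_coarse, x, 0.0, t_switch, steps)
    # Jitter scale is an illustrative choice, not taken from the paper.
    x = upsample_with_jitter(x, factor, noise_scale=1.0 - t_switch, rng=rng)
    return euler_integrate(v_fine, x, t_switch, 1.0, steps)

# Toy usage: placeholder velocity fields that contract points toward the origin.
rng = np.random.default_rng(0)
toy_velocity = lambda x, t: -x
cloud = sample_two_stage(toy_velocity, toy_velocity, n_coarse=8, factor=4,
                         t_switch=0.5, steps=10, rng=rng)
```

In a real system, `v_coarse` and `v_fine` would be separately trained networks. The point of the sketch is only that each stage integrates over its own time interval and that the upsampling step is where the two resolutions must be aligned.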
