Not-So-Optimal Transport Flows for 3D Point Cloud Generation
Ka-Hei Hui, Chao Liu, Xiaohui Zeng, Chi-Wing Fu, Arash Vahdat
TL;DR
This work tackles scalable, permutation-invariant generation of 3D point clouds by analyzing equivariant OT flows and proposing a not-so-optimal transport flow matching (NSOT) approach. NSOT precomputes OT offline between dense supersets $X_0, X_1 \in \mathbb{R}^{M\times 3}$, yielding a bijection $\Pi$ between them. During training it subsamples matched pairs from these supersets online and applies a hybrid coupling that perturbs the noise to ease learning. Empirically, NSOT matches or surpasses diffusion-based and OT-flow baselines on ShapeNet for unconditional generation and shape completion, particularly at low inference budgets, giving better sample quality with fewer steps. The method offers a scalable alternative for large point clouds and points to future work on broader invariances, such as rotations, and on extensions to richer point-cloud attributes and resolutions.
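The online part of this pipeline (subsampling plus hybrid coupling) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the blending weight `beta`, and the variance-preserving blend `sqrt(1 - beta) * x0 + sqrt(beta) * eps` are assumptions made here, and the dense arrays are assumed to be already row-aligned by the precomputed bijection.

```python
import numpy as np


def sample_training_pair(x0_dense, x1_dense, n_points, beta, rng):
    """Draw one (noise, shape) training pair from precomputed supersets.

    x0_dense, x1_dense: (M, 3) arrays assumed to be row-aligned, i.e. row i
    of the noise superset is the precomputed OT match of row i of the shape.
    beta in [0, 1] is a hypothetical blending weight: 0 keeps the
    precomputed coupling, 1 gives a fully independent coupling.
    """
    m = x0_dense.shape[0]
    # Online subsampling: the same indices are used on both sides, so the
    # subsampled points keep their precomputed matches.
    idx = rng.choice(m, size=n_points, replace=False)
    x0, x1 = x0_dense[idx], x1_dense[idx]
    # Hybrid coupling (assumed form): perturb the matched noise with fresh
    # Gaussian noise that is independent of the shape.
    eps = rng.standard_normal(x0.shape)
    x0_hybrid = np.sqrt(1.0 - beta) * x0 + np.sqrt(beta) * eps
    return x0_hybrid, x1


def flow_matching_target(x0, x1, t):
    """Linear interpolant and its constant velocity, as in flow matching."""
    x_t = (1.0 - t) * x0 + t * x1
    velocity = x1 - x0
    return x_t, velocity
```

With `beta = 0` the pair is exactly the subsampled precomputed coupling, and with `beta = 1` the noise no longer depends on the shape, so intermediate values trade trajectory straightness against how hard the target flow is to learn.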
Abstract
Learning generative models of 3D point clouds is one of the fundamental problems in 3D generative learning. One of the key properties of point clouds is their permutation invariance, i.e., changing the order of points in a point cloud does not change the shape they represent. In this paper, we analyze the recently proposed equivariant OT flows that learn permutation-invariant generative models for point-based molecular data, and we show that these models scale poorly on large point clouds. We also observe that learning (equivariant) OT flows is generally challenging, since straightening flow trajectories makes the learned flow model complex at the beginning of the trajectory. To remedy these issues, we propose not-so-optimal transport flow models that obtain an approximate OT through an offline OT precomputation, enabling an efficient construction of OT pairs for training. During training, we can additionally construct a hybrid coupling by combining our approximate OT and independent coupling to make the target flow models easier to learn. In an extensive empirical study, we show that our proposed model outperforms prior diffusion- and flow-based approaches on a wide range of unconditional generation and shape completion tasks on the ShapeNet benchmark.
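To make the offline OT precomputation concrete, the sketch below pairs a dense shape point set with an equally sized Gaussian noise set by solving an assignment problem once, ahead of training. It is an illustration under stated assumptions, not the paper's procedure: the function name is hypothetical, and SciPy's exact `linear_sum_assignment` solver stands in for whatever OT solver is used in practice (its cubic cost makes it suitable only for moderate superset sizes).

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def precompute_ot_pairing(x1_dense, rng):
    """Pair a dense shape point set with Gaussian noise via exact assignment.

    x1_dense: (M, 3) array of shape points. Returns (x0_aligned, x1_dense),
    where row i of x0_aligned is the noise point matched to shape point i.
    The exact solver used here is an assumption made for illustration.
    """
    m = x1_dense.shape[0]
    x0 = rng.standard_normal((m, 3))
    # Squared Euclidean cost between every noise point and every shape point.
    diff = x0[:, None, :] - x1_dense[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)  # shape (M, M)
    # rows[k] (noise index) is matched to cols[k] (shape index).
    rows, cols = linear_sum_assignment(cost)
    # Reorder the noise so that matched points share the same row index.
    x0_aligned = np.empty_like(x0)
    x0_aligned[cols] = x0[rows]
    return x0_aligned, x1_dense
```

Because the result is stored in row-aligned form, training code can later draw any subset of row indices and obtain matched pairs without solving an OT problem per batch.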
