Not-So-Optimal Transport Flows for 3D Point Cloud Generation

Ka-Hei Hui, Chao Liu, Xiaohui Zeng, Chi-Wing Fu, Arash Vahdat

TL;DR

This work tackles the challenge of scalable, permutation-invariant generation of 3D point clouds by scrutinizing equivariant OT flows and proposing a not-so-optimal transport flow matching (NSOT) approach. NSOT uses offline precomputation of OT between dense supersets $X_0, X_1 \in \mathbb{R}^{M\times 3}$ with a bijection $\Pi$, followed by online subsampling and a hybrid coupling that perturbs the noise to ease learning. Empirically, NSOT matches or surpasses diffusion-based and OT-flow baselines on ShapeNet for unconditional generation and shape completion, particularly at low inference budgets, demonstrating improved scalability and sample quality with fewer steps. The method offers a scalable alternative for large point clouds and points to future work on broader invariances, such as rotations, and extensions to richer point-cloud attributes and resolutions.

Abstract

Learning generative models of 3D point clouds is one of the fundamental problems in 3D generative learning. One of the key properties of point clouds is their permutation invariance, i.e., changing the order of points in a point cloud does not change the shape they represent. In this paper, we analyze the recently proposed equivariant OT flows that learn permutation-invariant generative models for point-based molecular data, and we show that these models scale poorly on large point clouds. We also observe that learning (equivariant) OT flows is generally challenging, since straightening flow trajectories makes the learned flow model complex at the beginning of the trajectory. To remedy these issues, we propose not-so-optimal transport flow models that obtain an approximate OT by an offline OT precomputation, enabling an efficient construction of OT pairs for training. During training, we can additionally construct a hybrid coupling by combining our approximate OT and independent coupling to make the target flow models easier to learn. In an extensive empirical study, we show that our proposed model outperforms prior diffusion- and flow-based approaches across unconditional generation and shape completion tasks on the ShapeNet benchmark.
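
As a rough illustration of the pipeline described above (offline OT between dense supersets, online subsampling, then a hybrid coupling that perturbs the noise), here is a minimal Python sketch. It is not the authors' implementation. The function names, the use of `scipy.optimize.linear_sum_assignment` as the OT solver, the squared Euclidean cost, and the blend $\sqrt{1-\beta}\,x_0 + \sqrt{\beta}\,\epsilon$ are all assumptions made for illustration; the source only states that the noise is perturbed and that $\beta = 0.2$ is one setting used.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def precompute_superset_ot(noise_superset, data_superset):
    """Offline step (hypothetical helper): OT as a bijection between two (M, 3) sets.

    Returns `perm` such that noise_superset[i] is coupled with
    data_superset[perm[i]]. Squared Euclidean cost is an assumption.
    """
    diff = noise_superset[:, None, :] - data_superset[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)          # (M, M) pairwise cost
    rows, cols = linear_sum_assignment(cost)   # exact assignment solver
    perm = np.empty(len(rows), dtype=np.int64)
    perm[rows] = cols
    return perm


def sample_training_pair(noise_superset, data_superset, perm, n_points, beta, rng):
    """Online step (hypothetical helper): subsample the precomputed coupling.

    The blend below is an assumed form of the hybrid coupling: beta = 0 keeps
    the subsampled OT pair, beta = 1 gives fully independent noise.
    """
    idx = rng.choice(noise_superset.shape[0], size=n_points, replace=False)
    x0 = noise_superset[idx]                   # subsampled noise points
    x1 = data_superset[perm[idx]]              # their OT partners on the shape
    eps = rng.standard_normal(x0.shape)        # fresh independent noise
    x0_hybrid = np.sqrt(1.0 - beta) * x0 + np.sqrt(beta) * eps
    return x0_hybrid, x1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = 256                                    # superset size (toy value)
    noise = rng.standard_normal((M, 3))
    shape = rng.uniform(-1.0, 1.0, size=(M, 3))  # stand-in for dense surface points
    perm = precompute_superset_ot(noise, shape)  # done once, offline
    x0, x1 = sample_training_pair(noise, shape, perm, n_points=64, beta=0.2, rng=rng)
    print(x0.shape, x1.shape)
```

The point of the split is that the expensive assignment is solved once per shape, while each training step only indexes into the stored permutation and adds a small perturbation.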

Paper Structure

This paper contains 26 sections, 3 theorems, 18 equations, 17 figures, 3 tables.

Key Result

Proposition 1

Let $(X_1, \cdots, X_n)$ be independently and identically distributed (IID) real $d$-dimensional random variables following a probability distribution $p(X)$, i.e., $X_i \sim p(X), X \in \mathbb{R}^d$. Let $Y$ be an additional random variable that is a uniform random sample of these variables

Figures (17)

  • Figure 1: Different coupling types between Gaussian noise (left) and point clouds (right), where coupled noise and surface points share the same color: (a) Independent Coupling randomly maps noise samples to point clouds. (b) Minibatch OT computes an OT map within batches of noise samples and point clouds. (c) Equivariant OT follows a similar minibatch OT procedure but additionally aligns points via permutation. (d) Our approach precomputes dense OT on data and noise supersets, then subsamples it to couple point clouds with slightly perturbed noise. Note that only (c) and (d) can produce high-quality OT.
  • Figure 2: In the OT flow model, the vector field ${\mathbf{v}}_t({\mathbf{x}}_0)$ admits a large change in its output with a small perturbation of ${\mathbf{x}}_0$ at $t{=}0$.
  • Figure 3: Comparison of OT Approximation Methods. Left: Average OT distance across batch sizes. Minibatch OT (blue) fails to reduce distances much compared to independent coupling (red dash). Equivariant OT (orange) significantly reduces distance values. Our OT approximation is on par with Equivariant OT. Right: Computational time for OT across batch sizes. Minibatch OT (blue) maintains a reasonable computational time ($\sim$1 second) with batch size $B=256$. Equivariant OT (orange) grows quadratically starting from 2.2 seconds with $B=1$.
  • Figure 4: Analysis of trajectory straightness using different couplings to obtain training pairs. Left: We plot the square norm of the difference between successive vector fields, i.e., $|| v_{\theta, t+1}(x_{t+1}) - v_{\theta, t}(x_t) ||$, as a measure of trajectory curvature. Right (a-c): Trajectory samples obtained by models trained with (a) independent coupling, (b) our OT approximation, and (c) our hybrid coupling with $\beta = 0.2$. Note that we subsample the point cloud to 30 points for a better trajectory visualization.
  • Figure 5: We show the Jacobian Frobenius norm of different trained ${\mathbf{v}}_{\theta, t}$ over different time intervals, which measures model complexity as in Dockhorn et al. (2022).
  • ...and 12 more figures
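
The curvature measure in the Figure 4 caption, $|| v_{\theta, t+1}(x_{t+1}) - v_{\theta, t}(x_t) ||$, can be sketched as follows. This is only an illustrative sketch: the Euler integrator, the function name, and the toy vector fields are assumptions, and a trained model $v_\theta$ would take the place of the `vector_field` callable. The caption calls the quantity a square norm while the formula shows a plain norm; the code follows the formula as written.

```python
import numpy as np


def successive_velocity_differences(vector_field, x0, n_steps):
    """Euler-integrate dx/dt = v(x, t) over [0, 1] with n_steps steps.

    Returns an array of length n_steps - 1 holding, for each pair of
    consecutive steps, the norm of the change in velocity. A perfectly
    straight trajectory gives all zeros.
    """
    dt = 1.0 / n_steps
    x = np.asarray(x0, dtype=float)
    v_prev = vector_field(x, 0.0)
    diffs = []
    for k in range(1, n_steps):
        x = x + dt * v_prev                    # Euler step to time k * dt
        v_next = vector_field(x, k * dt)       # velocity at the new state
        diffs.append(np.linalg.norm(v_next - v_prev))
        v_prev = v_next
    return np.array(diffs)


def constant_field(x, t):
    """Toy field with straight trajectories (hypothetical stand-in for a model)."""
    return np.ones_like(x)


def rotating_field(x, t):
    """Toy field that rotates points about the z-axis, so trajectories curve."""
    return np.stack([-x[:, 1], x[:, 0], np.zeros(len(x))], axis=1)


if __name__ == "__main__":
    points = np.ones((30, 3))                  # 30 points, as in the caption's visualization
    print(successive_velocity_differences(constant_field, points, 10))
    print(successive_velocity_differences(rotating_field, points, 10))
```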

Theorems & Definitions (3)

  • Proposition 1
  • Proposition 2
  • Theorem 1