Matching the Optimal Denoiser in Point Cloud Diffusion with (Improved) Rotational Alignment
Ameya Daigavane, YuQing Xie, Bodhi P. Vani, Saeed Saremi, Joseph Kleinhenz, Tess Smidt
TL;DR
This work addresses orientation ambiguity in diffusion-based generation of 3D point clouds by linking the optimal denoiser under rotational augmentation to a matrix Fisher distribution on $SO(3)$. The authors show that alignment to the orientation mode $R^*(y,x)$ is the zeroth-order small-noise approximation of the optimal conditional denoiser, and they derive higher-order corrections $D^*_1$ and $D^*_2$ via Laplace expansions with no extra computational cost. Empirical results indicate that rotational alignment is a robust and often sufficient approximation for the noise regimes used in training, while the proposed higher-order estimators can modestly reduce bias at larger noise levels. The result is a principled framework for understanding and improving rotationally invariant diffusion on 3D point clouds, with practical implications for molecular conformation modeling and related geometric data tasks.
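To make the link to the matrix Fisher distribution concrete, here is a short sketch for a single data sample. The conventions are assumptions chosen for illustration and are not taken from the paper: point clouds are matrices $x, y \in \mathbb{R}^{N \times 3}$ with one point per row, a rotation $R$ acts as $x R^\top$, and the noisy sample is $y = x R^\top + \sigma \varepsilon$ with $R$ uniform on $SO(3)$ and $\varepsilon$ standard Gaussian. The symbol $F$ is introduced here only for this sketch.

$$
p(R \mid y, x) \;\propto\; \exp\!\left(-\frac{\lVert y - x R^\top \rVert_F^2}{2\sigma^2}\right) \;\propto\; \exp\!\left(\operatorname{tr}\!\left(F^\top R\right)\right), \qquad F = \frac{y^\top x}{\sigma^2}.
$$

The second proportionality holds because $\lVert y \rVert_F^2$ and $\lVert x R^\top \rVert_F^2 = \lVert x \rVert_F^2$ do not depend on $R$, leaving only the cross term $\operatorname{tr}(y^\top x R^\top)/\sigma^2 = \operatorname{tr}(F^\top R)$. This is the density of a matrix Fisher distribution with parameter $F$. Under these assumptions the conditional denoiser and its mode-based approximation are

$$
D^*(y; x) = x \, \mathbb{E}\!\left[R \mid y, x\right]^\top, \qquad R^*(y, x) = \arg\max_{R \in SO(3)} \operatorname{tr}\!\left(F^\top R\right), \qquad D^*(y; x) \approx x \, R^*(y, x)^\top \ \text{ as } \sigma \to 0.
$$

As $\sigma \to 0$ the entries of $F$ grow, the distribution concentrates around $R^*$, and the expectation is well approximated by its mode, which is the rotation that alignment computes.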
Abstract
Diffusion models are a popular class of generative models trained to reverse a noising process starting from a target data distribution. Training a diffusion model consists of learning how to denoise noisy samples at different noise levels. When training diffusion models for point clouds such as molecules and proteins, there is often no canonical orientation that can be assigned. To capture this symmetry, the true data samples are typically augmented by transforming them with random rotations sampled uniformly over $SO(3)$. The denoised predictions are then commonly rotationally aligned to the ground truth samples via the Kabsch-Umeyama algorithm before computing the loss. However, the effect of this alignment step has not been well studied. Here, we show that the optimal denoiser can be expressed in terms of a matrix Fisher distribution over $SO(3)$. Alignment corresponds to taking the mode of this distribution, and turns out to be the zeroth-order approximation for small noise levels, explaining its effectiveness. We build on this perspective to derive better approximators to the optimal denoiser in the limit of small noise. Our experiments highlight that alignment is often a "good enough" approximation for the noise levels that matter most for training diffusion models.
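The alignment step described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function names `kabsch_rotation` and `aligned_target` are hypothetical, point clouds are assumed to be `(N, 3)` arrays with one point per row, and both clouds are assumed to be already centered, so only the rotation part of Kabsch-Umeyama is computed.

```python
import numpy as np


def kabsch_rotation(x, y):
    """Return the rotation R in SO(3) that minimizes ||y - x @ R.T||_F.

    x, y: (N, 3) arrays, assumed to be centered already (an assumption
    of this sketch; translation is not handled here).
    """
    # 3x3 cross-covariance between the two clouds.
    a = y.T @ x
    u, _, vt = np.linalg.svd(a)
    # Flip the last singular direction if needed so that det(R) = +1,
    # i.e. the result is a proper rotation rather than a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    return u @ np.diag([1.0, 1.0, d]) @ vt


def aligned_target(x, y):
    """Rotate the clean sample x onto the noisy sample y (hypothetical helper)."""
    return x @ kabsch_rotation(x, y).T
```

In the small-noise limit, `aligned_target(x, y)` plays the role of the zeroth-order approximation to the optimal conditional denoiser: it rotates the clean sample by the mode of the rotation posterior instead of averaging over the full distribution.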
