Inverting Data Transformations via Diffusion Sampling
Jinwoo Kim, Sékou-Oumar Kaba, Jiyun Park, Seunghoon Hong, Siamak Ravanbakhsh
TL;DR
This paper introduces Transformation-Inverting Energy Diffusion (TIED), a diffusion-based sampler that operates on Lie groups to invert unknown data transformations given a data-space energy prior. By using a forward diffusion on the group and a reverse-time SDE with a trivialized Lie-algebra score, TIED stays on the manifold and uses energy gradients to estimate the required scores. The method enables test-time equivariance for pretrained models and, without any training, improves over baselines on image homographies and Lie-point symmetries of PDEs. The approach extends probabilistic inversion to general transformation groups and offers a practical, training-free path to more robust predictions under input transformations.
Abstract
We study the problem of transformation inversion on general Lie groups: a datum is transformed by an unknown group element, and the goal is to recover an inverse transformation that maps it back to the original data distribution. Such unknown transformations arise widely in machine learning and scientific modeling, where they can significantly distort observations. We take a probabilistic view and model the posterior over transformations as a Boltzmann distribution defined by an energy function on data space. To sample from this posterior, we introduce a diffusion process on Lie groups that keeps all updates on-manifold and only requires computations in the associated Lie algebra. Our method, Transformation-Inverting Energy Diffusion (TIED), relies on a new trivialized target-score identity that enables efficient score-based sampling of the transformation posterior. As a key application, we focus on test-time equivariance, where the objective is to improve the robustness of pretrained neural networks to input transformations. Experiments on image homographies and PDE symmetries demonstrate that TIED can restore transformed inputs to the training distribution at test time, showing improved performance over strong canonicalization and sampling baselines. Code is available at https://github.com/jw9730/tied.
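To make the probabilistic view concrete, the following is a minimal sketch of sampling a Boltzmann posterior over transformations, p(g | y) ∝ exp(-E(g · y)), while keeping every update on the group. It is not TIED itself: it uses plain unadjusted Langevin dynamics on SO(2) rather than the forward diffusion and reverse-time SDE described above, and the Gaussian energy, the constants `MU` and `SIGMA`, and the function names are all illustrative assumptions. What it shares with the setting of the paper is that the energy is defined on data space, its gradient is pulled back to the Lie algebra, and each step is applied through the exponential map so the iterate remains a rotation.

```python
import numpy as np

# Generator of the Lie algebra so(2); exp(phi * A) is a rotation by angle phi.
A = np.array([[0.0, -1.0], [1.0, 0.0]])


def rot(phi):
    """Exponential map so(2) -> SO(2): returns exp(phi * A)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s], [s, c]])


# Toy data-space energy (an assumption): isotropic Gaussian prior around MU.
MU = np.array([1.0, 0.0])
SIGMA = 0.1


def energy(x):
    return 0.5 * np.sum((x - MU) ** 2) / SIGMA**2


def energy_grad(x):
    return (x - MU) / SIGMA**2


def sample_rotation(y, n_steps=200, step=0.005, seed=0):
    """Unadjusted Langevin sampling of p(g | y) ∝ exp(-E(g @ y)) on SO(2)."""
    rng = np.random.default_rng(seed)
    g = np.eye(2)
    for _ in range(n_steps):
        x = g @ y
        # Lie-algebra gradient: derivative of E(exp(t * A) @ x) at t = 0,
        # computed from the data-space gradient of the energy.
        grad_alg = energy_grad(x) @ (A @ x)
        # Langevin step taken in the Lie algebra (a single angle for so(2)).
        phi = -step * grad_alg + np.sqrt(2.0 * step) * rng.standard_normal()
        # Map back to the group, so g is always an exact rotation matrix.
        g = rot(phi) @ g
    return g


# A point near the prior mode, rotated by an angle the sampler does not know.
y = rot(2.0) @ MU
g = sample_rotation(y)
```

Here `g @ y` is one posterior sample of the restored datum. Because updates are composed as group elements rather than added to matrix entries, `g` never needs to be projected back onto SO(2).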
