Free-form Flows: Make Any Architecture a Normalizing Flow
Felix Draxler, Peter Sorrenson, Lea Zimmermann, Armand Rousselot, Ullrich Köthe
TL;DR
Free-form Flows (FFF) remove the architectural invertibility constraint of traditional normalizing flows: any dimension-preserving network is trained with a fast maximum-likelihood surrogate plus a reconstruction loss, so the architecture can be chosen for its inductive bias rather than for invertibility. The core idea is a gradient estimator for the log-determinant of the Jacobian. It combines a trace trick with an inverse-function argument (the Jacobian of the inverse is the inverse of the Jacobian), so exact Jacobian determinants are replaced by efficient vector-Jacobian and Jacobian-vector products. Theoretical results show that, when the reconstruction loss is small, minimizing the FFF objective yields the same global minima as exact maximum-likelihood training, and that the relaxed objective upper-bounds a spread KL divergence between data and model. Empirically, FFF matches or surpasses invertible-flow baselines on simulation-based inference (SBI) and molecule-generation benchmarks, while offering much faster sampling and easier adaptation to domain-specific architectures such as $E(n)$-equivariant networks for QM9. The framework thereby extends likelihood-based generative modeling to a wider range of scientific problems by shifting the focus from strict invertibility to task-tailored inductive biases and efficient gradient estimation.
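The gradient estimator described above can be checked numerically in a toy setting. The sketch below is not the authors' implementation. It assumes a hypothetical linear encoder $f(x) = Wx$ and linear decoder $g(z) = Vz$, so that both Jacobians are explicit matrices, and it assumes the surrogate takes the form $v^\top J_f \,\mathrm{stop\_gradient}(J_g v)$ with random sign vectors $v$. All function and variable names are illustrative.

```python
import itertools

import numpy as np


def sign_probes(dim):
    """All 2**dim vectors with entries in {-1, +1}; their average v v^T is exactly I."""
    return np.array(list(itertools.product([-1.0, 1.0], repeat=dim)))


def exact_logdet_grad(W):
    """Gradient of log|det W| with respect to W, which is inv(W).T."""
    return np.linalg.inv(W).T


def surrogate_grad(V, probes):
    """Average gradient with respect to W of v^T W stop_gradient(V v).

    With u = V v held constant, d/dW (v^T W u) is the outer product v u^T.
    """
    return np.mean([np.outer(v, V @ v) for v in probes], axis=0)


# Hypothetical linear "encoder" f(x) = W x; its Jacobian is W everywhere.
W = np.array([[2.0, 0.5, 0.0],
              [0.1, 1.5, 0.3],
              [0.0, 0.2, 1.0]])
probes = sign_probes(3)

# Exact decoder g(z) = inv(W) z: the surrogate gradient matches the true one.
V_exact = np.linalg.inv(W)
exact = exact_logdet_grad(W)
estimate = surrogate_grad(V_exact, probes)
print("max abs error, exact decoder:", np.abs(estimate - exact).max())

# Slightly wrong decoder: the estimate tracks V.T instead of inv(W).T.
V_off = V_exact + 0.05 * np.eye(3)
biased = surrogate_grad(V_off, probes)
print("max abs error, perturbed decoder:", np.abs(biased - exact).max())
```

In a nonlinear network the matrices are never formed: $v^\top J_f$ would come from a vector-Jacobian product and $J_g v$ from a Jacobian-vector product, and $v$ would be sampled rather than enumerated. The perturbed-decoder case illustrates why the estimator is only as good as the learned inverse, which is the role the reconstruction loss plays in this toy setting.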
Abstract
Normalizing Flows are generative models that directly maximize the likelihood. Previously, the design of normalizing flows was largely constrained by the need for analytical invertibility. We overcome this constraint by a training procedure that uses an efficient estimator for the gradient of the change of variables formula. This enables any dimension-preserving neural network to serve as a generative model through maximum likelihood training. Our approach allows placing the emphasis on tailoring inductive biases precisely to the task at hand. Specifically, we achieve excellent results in molecule generation benchmarks utilizing $E(n)$-equivariant networks. Moreover, our method is competitive in an inverse problem benchmark, while employing off-the-shelf ResNet architectures.
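For reference, the change of variables formula mentioned in the abstract, and the gradient that has to be estimated, can be written as follows. The notation is an assumption made here for illustration: $f_\theta$ is the network mapping data to latent space, $p_Z$ is the latent density, and $J_\theta(x)$ is the Jacobian of $f_\theta$ at $x$.

$$
\log p_\theta(x) = \log p_Z\big(f_\theta(x)\big) + \log\left|\det J_\theta(x)\right|, \qquad J_\theta(x) = \frac{\partial f_\theta(x)}{\partial x}
$$

$$
\frac{\partial}{\partial \theta_i} \log\left|\det J_\theta(x)\right| = \operatorname{tr}\!\left(J_\theta(x)^{-1}\, \frac{\partial J_\theta(x)}{\partial \theta_i}\right)
$$

The second line is Jacobi's formula. The expensive parts are the inverse Jacobian and the trace, which is where an efficient estimator is needed.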
