TRADE: Transfer of Distributions between External Conditions with Normalizing Flows
Stefan Wahl, Armand Rousselot, Felix Draxler, Henrik Schopmans, Ullrich Köthe
TL;DR
TRADE introduces a boundary-value PDE framework for learning conditional distributions $p_\theta(x|c)$ across external parameters by anchoring at a reference condition $c_0$ with $p_\theta(x|c_0)=p(x|c_0)$ and propagating the solution using the gradient constraint $\frac{\mathrm{d}}{\mathrm{d}c}p_\theta(x|c)=\frac{\mathrm{d}}{\mathrm{d}c}p(x|c)$. The key insight is to express $\frac{\mathrm{d}}{\mathrm{d}c}\log p(x|c)$ in terms of the unnormalized density $q(x|c)$ as $\frac{\mathrm{d}}{\mathrm{d}c}\log p(x|c)=\frac{\mathrm{d}}{\mathrm{d}c}\log q(x|c)-\text{E}_{p(x|c)}\big[\frac{\mathrm{d}}{\mathrm{d}c}\log q(x|c)\big]$, enabling a tractable, physics-informed loss that combines a boundary term with a gradient-consistency term. TRADE supports data-free training and energy-free variants, uses self-normalized importance sampling to estimate necessary expectations, and can discretize or continuously sample the conditioning space. Empirically, it outperforms baselines on multidimensional wells, tempered Bayesian inference, alanine dipeptide temperature transfer, and scalar-field lattice models, demonstrating robust generalization across a wide range of external parameters and problem domains. The approach promises practical impact for simulations and inference tasks where sampling across parameter regimes is expensive or infeasible, providing a flexible, stable alternative to energy-based or heavily restricted architectures.
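The identity for the derivative of the log-density follows in a few lines from the definition of the normalization constant. The sketch below assumes that differentiation with respect to $c$ and integration over $x$ can be exchanged; the symbol $Z(c)$ is introduced here only for illustration.

$$
p(x|c)=\frac{q(x|c)}{Z(c)},\qquad Z(c)=\int q(x|c)\,\mathrm{d}x
$$

$$
\frac{\mathrm{d}}{\mathrm{d}c}\log Z(c)
=\frac{1}{Z(c)}\int \frac{\mathrm{d}}{\mathrm{d}c}q(x|c)\,\mathrm{d}x
=\int \frac{q(x|c)}{Z(c)}\,\frac{\mathrm{d}}{\mathrm{d}c}\log q(x|c)\,\mathrm{d}x
=\text{E}_{p(x|c)}\Big[\frac{\mathrm{d}}{\mathrm{d}c}\log q(x|c)\Big]
$$

$$
\frac{\mathrm{d}}{\mathrm{d}c}\log p(x|c)
=\frac{\mathrm{d}}{\mathrm{d}c}\log q(x|c)-\frac{\mathrm{d}}{\mathrm{d}c}\log Z(c)
=\frac{\mathrm{d}}{\mathrm{d}c}\log q(x|c)-\text{E}_{p(x|c)}\Big[\frac{\mathrm{d}}{\mathrm{d}c}\log q(x|c)\Big]
$$

The normalization constant itself never has to be evaluated; only its logarithmic derivative appears, and that is an expectation under $p(x|c)$.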
Abstract
Modeling distributions that depend on external control parameters is a common scenario in diverse applications like molecular simulations, where system properties like temperature affect molecular configurations. Despite the relevance of these applications, existing solutions are unsatisfactory as they require severely restricted model architectures or rely on energy-based training, which is prone to instability. We introduce TRADE, which overcomes these limitations by formulating the learning process as a boundary value problem. By initially training the model for a specific condition using either i.i.d. samples or backward KL training, we establish a boundary distribution. We then propagate this information across other conditions using the gradient of the unnormalized density with respect to the external parameter. This formulation, akin to the principles of physics-informed neural networks, allows us to efficiently learn parameter-dependent distributions without restrictive assumptions. Experimentally, we demonstrate that TRADE achieves excellent results in a wide range of applications, ranging from Bayesian inference and molecular simulations to physical lattice models.
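The propagation step depends on the derivative of the log-density with respect to the condition, which can be written as the derivative of $\log q(x|c)$ minus its expectation under $p(x|c)$. The following minimal sketch checks that relation numerically on a toy problem and is not the authors' implementation. It assumes a one-dimensional harmonic energy with $q(x|c)=\exp(-c\,x^2/2)$, where $c$ plays the role of an inverse temperature, and it estimates the expectation with self-normalized importance sampling from a fixed Gaussian proposal. All function names, the proposal, and the sample size are hypothetical choices made for illustration.

```python
import math
import random


def log_q(x, c):
    """Toy unnormalized log-density: energy x^2 / 2 at inverse temperature c."""
    return -0.5 * c * x * x


def dlogq_dc(x, c):
    """Analytic derivative of log q(x|c) with respect to c (constant in c here)."""
    return -0.5 * x * x


def snis_expectation(c, n_samples=100_000, prop_std=1.5, seed=0):
    """Self-normalized importance sampling estimate of E_p[d/dc log q(x|c)].

    Assumption: the proposal is a zero-mean Gaussian that is wider than the
    target, so the importance weights stay bounded.
    """
    rng = random.Random(seed)
    log_norm = math.log(prop_std * math.sqrt(2.0 * math.pi))
    numerator = 0.0
    denominator = 0.0
    for _ in range(n_samples):
        z = rng.gauss(0.0, prop_std)
        log_prop = -0.5 * (z / prop_std) ** 2 - log_norm
        # Unnormalized weight q(z|c) / proposal(z); the unknown normalizer
        # of q cancels in the ratio numerator / denominator.
        w = math.exp(log_q(z, c) - log_prop)
        numerator += w * dlogq_dc(z, c)
        denominator += w
    return numerator / denominator


def dlogp_dc_estimate(x, c, expectation):
    """d/dc log p(x|c) = d/dc log q(x|c) - E_p[d/dc log q(x|c)]."""
    return dlogq_dc(x, c) - expectation


def dlogp_dc_exact(x, c):
    """Closed form for this toy case: p(x|c) is Gaussian with variance 1 / c."""
    return -0.5 * x * x + 0.5 / c


c = 2.0
expectation = snis_expectation(c)
points = [-1.0, 0.0, 0.5, 2.0]
estimates = [dlogp_dc_estimate(x, c, expectation) for x in points]
exact = [dlogp_dc_exact(x, c) for x in points]
max_abs_error = max(abs(a - b) for a, b in zip(estimates, exact))
print("SNIS expectation:", expectation)
print("max abs error vs. closed form:", max_abs_error)
```

In this toy case the target derivative is known in closed form, so the estimate can be compared against it directly. In a learned model, the same quantity would serve as the regression target for the derivative of the model's log-density with respect to $c$.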
