FALCON: Few-step Accurate Likelihoods for Continuous Flows

Danyal Rehman, Tara Akhound-Sadegh, Artem Gazizov, Yoshua Bengio, Alexander Tong

TL;DR

FALCON addresses the long-standing challenge of Boltzmann sampling by combining few-step invertible flows with accurate, tractable likelihoods. An invertibility-based training objective and a discrete-time flow map enable fast sampling while preserving the likelihood estimates needed for self-normalized importance sampling (SNIS) corrections. Empirically, FALCON outperforms state-of-the-art continuous flows in scalability and discrete normalizing flows in sample quality, and is two orders of magnitude faster than an equivalently performing continuous normalizing flow (CNF). The approach uses a diffusion-transformer backbone with efficient Jacobian operations and careful inference scheduling to balance accuracy and efficiency. This makes Boltzmann Generators more practical for problems at the scale of molecular dynamics (MD) and lays a foundation for other scientific computing applications that require fast, reliable likelihoods.

Abstract

Scalable sampling of molecular states in thermodynamic equilibrium is a long-standing challenge in statistical physics. Boltzmann Generators tackle this problem by pairing a generative model, capable of exact likelihood computation, with importance sampling to obtain consistent samples under the target distribution. Current Boltzmann Generators primarily use continuous normalizing flows (CNFs) trained with flow matching for efficient training of powerful models. However, likelihood calculation for these models is extremely costly, requiring thousands of function evaluations per sample, severely limiting their adoption. In this work, we propose Few-step Accurate Likelihoods for Continuous Flows (FALCON), a method which allows for few-step sampling with a likelihood accurate enough for importance sampling applications by introducing a hybrid training objective that encourages invertibility. We show FALCON outperforms state-of-the-art normalizing flow models for molecular Boltzmann sampling and is two orders of magnitude faster than the equivalently performing CNF model.
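The importance-sampling step described in the abstract can be illustrated with a minimal sketch of self-normalized importance sampling (SNIS). The example below is a toy, not the paper's pipeline: the double-well `energy` function, the Gaussian proposal standing in for a trained flow, and the helper names `snis_weights` and `effective_sample_size` are all assumptions made for illustration.

```python
import numpy as np

def snis_weights(log_p_unnorm, log_q):
    """Self-normalized importance weights from log-densities.

    log_p_unnorm: unnormalized target log-density, e.g. -E(x) for a Boltzmann target.
    log_q: exact proposal log-density of the same samples.
    """
    log_w = log_p_unnorm - log_q
    log_w = log_w - np.max(log_w)  # subtract the max for numerical stability
    w = np.exp(log_w)
    return w / np.sum(w)

def effective_sample_size(weights):
    """Kish effective sample size for normalized weights."""
    return 1.0 / np.sum(weights ** 2)

def energy(x):
    # Hypothetical 1D double-well energy, used only for this toy example.
    return (x ** 2 - 1.0) ** 2

rng = np.random.default_rng(0)
sigma = 1.5  # width of the stand-in Gaussian proposal (assumption)
x = rng.normal(0.0, sigma, size=20000)

# Proposal log-density (a generative model with tractable likelihood would supply this).
log_q = -0.5 * (x / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
log_p_unnorm = -energy(x)

w = snis_weights(log_p_unnorm, log_q)
snis_mean_x2 = np.sum(w * x ** 2)  # SNIS estimate of E[x^2] under the target
ess = effective_sample_size(w)
```

The estimator only needs the target up to a normalizing constant, which is why an accurate proposal log-likelihood `log_q` is the critical ingredient: errors in `log_q` translate directly into biased weights.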

Paper Structure

This paper contains 67 sections, 5 theorems, 28 equations, 18 figures, 6 tables, 2 algorithms.

Key Result

Proposition 1

Let $u^\star_\theta$ be a minimizer of Eq. eq:average_velocity with respect to some $v$. Also, define the Jacobian of $X$ as $\mathbf{J}_{X} = \frac{\partial X}{\partial x_s}$, and the discrete flow map: Then, for sufficiently smooth $u^\star_\theta$ and $v$ and for any $(s, t) \in [0, 1]^2$,
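To illustrate why the Jacobian of a flow map matters for likelihoods, the following sketch applies the standard change-of-variables formula to a single-step map. It assumes a map of the form $X = x_s + (t - s)\,u(x_s)$ with a hypothetical linear velocity $u(x) = Ax$, chosen so that the log-determinant has a closed form. The form of the map, the matrix `A`, and all function names are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def standard_normal_logpdf(x):
    """Log-density of a standard Gaussian, evaluated row-wise."""
    d = x.shape[-1]
    return -0.5 * np.sum(x ** 2, axis=-1) - 0.5 * d * np.log(2.0 * np.pi)

def one_step_map(x_s, A, s, t):
    """Assumed one-step map X = x_s + (t - s) * u(x_s) with u(x) = A x."""
    return x_s + (t - s) * (x_s @ A.T)

def one_step_logdet(A, s, t):
    """log|det J_X| for the linear toy map, where J_X = I + (t - s) A."""
    J = np.eye(A.shape[0]) + (t - s) * A
    _, logabsdet = np.linalg.slogdet(J)
    return logabsdet

def pushed_logpdf(x_s, A, s, t):
    """Log-density of X under the pushforward of a standard Gaussian.

    Change of variables: log q(X) = log p(x_s) - log|det J_X|.
    """
    return standard_normal_logpdf(x_s) - one_step_logdet(A, s, t)

rng = np.random.default_rng(0)
A = np.array([[0.3, -0.2], [0.1, 0.4]])  # hypothetical velocity matrix
x0 = rng.normal(size=(5, 2))             # samples from the base distribution
x1 = one_step_map(x0, A, 0.0, 1.0)       # one-step samples
log_q = pushed_logpdf(x0, A, 0.0, 1.0)   # their exact log-likelihoods
```

For a neural velocity field the Jacobian is no longer constant and would have to be computed per sample, for example with automatic differentiation; the formula is only valid when the map is invertible.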

Figures (18)

  • Figure 1: The flow map learns from biased data; SNIS re-weights the generated samples to be consistent with the Boltzmann distribution, with equality approached in the infinite-sample limit under mild regularity conditions.
  • Figure 2: Performance-inference time comparison between NFs and CNFs for $10^4$ dipeptide samples.
  • Figure 3: True MD energy distribution with the best FALCON unweighted and re-sampled proposals for alanine dipeptide (left), tri-alanine (center left), alanine tetrapeptide (center right), and hexa-alanine (right).
  • Figure 4: Performance with additional samples.
  • Figure 5: Improved proposal and re-weighted sample energies with increased steps for alanine dipeptide.
  • ...and 13 more figures

Theorems & Definitions (7)

  • Proposition 1
  • Proposition 2
  • Proposition 2
  • Theorem 1: Picard-Lindelöf
  • proof
  • Proposition 2
  • proof