A Kinetic-Energy Perspective of Flow Matching

Ziyun Li, Huancheng Hu, Soon Hoe Lim, Xuyu Li, Fei Gao, Enmao Diao, Zezhen Ding, Michalis Vazirgiannis, Henrik Bostrom

TL;DR

This paper introduces Kinetic Path Energy (KPE), a per-sample diagnostic that integrates the squared velocity along a flow-based generation trajectory to quantify kinetic effort. It reports two robust correspondences: higher KPE aligns with stronger semantic fidelity, and high-KPE trajectories end in low-density regions of the data manifold. It also derives a theoretical energy–density relation under posterior dominance. The paper then reveals a paradox: the closed-form empirical flow matching (EFM) solution can reach substantially higher peak energy yet memorize training data, because of a terminal-energy blow-up governed by a 1/(1−t) factor. To address this, the authors propose Kinetic Trajectory Shaping (KTS), a training-free two-phase inference method that boosts early motion and enforces a soft landing late in the trajectory, reducing memorization while improving generation quality on CelebA and ImageNet-256. The work identifies a Goldilocks principle for kinetic energy in flow-based generation and highlights trajectory-level diagnostics as a lens for understanding and controlling generative dynamics.

Abstract

Flow-based generative models can be viewed through a physics lens: sampling transports a particle from noise to data by integrating a time-varying velocity field, and each sample corresponds to a trajectory with its own dynamical effort. Motivated by classical mechanics, we introduce Kinetic Path Energy (KPE), an action-like, per-sample diagnostic that measures the accumulated kinetic effort along an Ordinary Differential Equation (ODE) trajectory. Empirically, KPE exhibits two robust correspondences: (i) higher KPE predicts stronger semantic fidelity; (ii) high-KPE trajectories terminate on low-density manifold frontiers. We further provide theoretical guarantees linking trajectory energy to data density. Paradoxically, this correlation is non-monotonic. At sufficiently high energy, generation can degenerate into memorization. Leveraging the closed form of empirical flow matching, we show that extreme energies drive trajectories toward near-copies of training examples. This yields a Goldilocks principle and motivates Kinetic Trajectory Shaping (KTS), a training-free two-phase inference strategy that boosts early motion and enforces a late-time soft landing, reducing memorization and improving generation quality across benchmark tasks.
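
The KPE idea described above can be illustrated with a minimal sketch. It assumes KPE is the time integral of $\|v(x_t,t)\|^2$ over $t\in[0,1]$ (the paper may include a factor such as $1/2$), uses a plain Euler solver, and relies on a hypothetical toy velocity field `toy_velocity` that transports every point to a fixed `target`. None of these names come from the paper.

```python
import numpy as np

def kinetic_path_energy(velocity, x0, num_steps=100):
    """Euler-integrate dx/dt = velocity(x, t) on [0, 1] and accumulate
    the squared speed along each trajectory (left Riemann sum).

    Assumed normalization: KPE = integral of ||v||^2 dt, no 1/2 factor.
    """
    x = np.array(x0, dtype=float)           # shape (n_samples, dim)
    dt = 1.0 / num_steps
    kpe = np.zeros(x.shape[0])
    for i in range(num_steps):
        t = i * dt                           # t stays strictly below 1
        v = velocity(x, t)
        kpe += np.sum(v ** 2, axis=1) * dt   # instantaneous power times dt
        x = x + dt * v                       # Euler step
    return x, kpe

# Hypothetical toy field: straight-line transport to a single target point.
# The 1/(1 - t) factor is offset by the shrinking distance, so the speed
# stays constant along each path.
target = np.array([3.0, 4.0])

def toy_velocity(x, t):
    return (target - x) / (1.0 - t)

x0 = np.array([[0.0, 0.0], [3.0, 0.0]])
x1, kpe = kinetic_path_energy(toy_velocity, x0, num_steps=50)
# x1 is approximately [[3, 4], [3, 4]]; kpe is approximately [25, 16],
# i.e. the squared distance each particle travels in unit time.
```

In this toy case the sample that starts farther from the target accumulates more energy. With a learned velocity field, the same loop would simply call the model in place of `toy_velocity`.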

Paper Structure (78 sections, 13 theorems, 156 equations, 17 figures, 4 tables, 1 algorithm)


Key Result

Lemma 4.1

The closed-form empirical flow matching velocity admits a score-based decomposition with coefficients $\alpha(t)=\frac{\dot\gamma(t)\sigma_t^2}{\gamma(t)(1-\gamma(t))}$ and $\beta(t)=\frac{\dot\gamma(t)}{\gamma(t)}$, where $\sigma_t^2=(1-\gamma(t))^2$ (proof in Appendix subsec:velocity-score).
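
A form of this decomposition consistent with the stated $\alpha(t)$ and $\beta(t)$ can be derived as follows. The derivation assumes the interpolant $x_t=\gamma(t)\,x_1+(1-\gamma(t))\,x_0$ with $x_0\sim\mathcal{N}(0,I)$, which matches $\sigma_t=1-\gamma(t)$, and writes $p_t$ for the marginal density of $x_t$ (for empirical flow matching, a Gaussian mixture centered at the scaled training points). The symbols $v$, $x_0$, $x_1$, and $p_t$ are notational assumptions, so this is a sketch under those assumptions rather than the paper's own statement.

```latex
% Conditional-expectation form of the marginal velocity
v(x,t) = \mathbb{E}\big[\dot\gamma(t)\,x_1 - \dot\gamma(t)\,x_0 \,\big|\, x_t = x\big]

% Tweedie-type identities for x_t = \gamma x_1 + \sigma_t x_0
\mathbb{E}[x_0 \mid x_t = x] = -\sigma_t \nabla_x \log p_t(x), \qquad
\mathbb{E}[x_1 \mid x_t = x] = \frac{x + \sigma_t^2 \nabla_x \log p_t(x)}{\gamma(t)}

% Substitute and collect terms, using \sigma_t + \gamma(t) = 1
v(x,t) = \frac{\dot\gamma(t)}{\gamma(t)}\,x
       + \dot\gamma(t)\,\sigma_t\,\frac{\sigma_t + \gamma(t)}{\gamma(t)}\,\nabla_x \log p_t(x)
       = \beta(t)\,x + \frac{\dot\gamma(t)\,\sigma_t}{\gamma(t)}\,\nabla_x \log p_t(x)

% Since \sigma_t = 1 - \gamma(t), the score coefficient equals the stated \alpha(t)
\frac{\dot\gamma(t)\,\sigma_t}{\gamma(t)}
  = \frac{\dot\gamma(t)\,\sigma_t^2}{\gamma(t)\,(1-\gamma(t))} = \alpha(t)
\quad\Longrightarrow\quad
v(x,t) = \alpha(t)\,\nabla_x \log p_t(x) + \beta(t)\,x
```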

Figures (17)

  • Figure 1: High-energy samples show clearer semantic cues. Paired samples from the same class on ImageNet-256 (CFG=4.0): top is high-energy (high KPE), bottom is low-energy (low KPE). High-energy samples exhibit more salient, class-specific attributes.
  • Figure 2: KPE correlates with semantic strength and discriminability across CFG scales. Box plots of (a) CLIP score and (b) CLIP margin for low/mid/high KPE (0--33%, 33--67%, 67--100%) at CFG 1.0/1.5/4.0. Both metrics increase with KPE (medians labeled).
  • Figure 3: Inverse KPE--density relation on 2D synthetic datasets. Each row corresponds to one distribution (dense_sparse, multiscale_clusters, sandwich). Columns (left$\to$right): training data distribution, FM generations, KPE vs. density strata, instantaneous power $\|v(t)\|^2$ over time, cumulative KPE. Across datasets, trajectories ending in low-density regions accumulate higher KPE (Mann-Whitney U (MWU) test $p<10^{-3}$); details in Appendix \ref{app:synthetic-kpe-density-descriptions}.
  • Figure 4: High-KPE samples lie in low-density regions. (a) On CIFAR-10 at 150 steps, the $\log(\text{density})$ surface (left) is anti-aligned with KPE (right): high density corresponds to low energy. (b) The top 10% KPE samples (overlaid) cluster in low-density areas, consistent with Theorem \ref{prop:kpe-density}.
  • Figure 5: Strong negative correlation between KPE and training density. Scatter plot of KPE versus training log-density on CIFAR-10 ($N=150$ steps, $n=2{,}000$ samples). Left: $k$-NN; right: KDE. Each point represents one generated sample; red line shows linear regression fit. Spearman correlations are strongly negative ($k$-NN: $\rho=-0.65$; KDE: $\rho=-0.64$), indicating a strong monotonic inverse relationship.
  • ...and 12 more figures
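
The KPE-versus-density comparison summarized in the captions above can be outlined in a few lines. This is a sketch under assumptions: the $k$-NN log-density is taken up to an additive constant as $-d\log r_k(x)$, where $r_k$ is the distance to the $k$-th nearest training point, and Spearman's $\rho$ is computed as the Pearson correlation of ranks (valid when there are no ties). The value of `k` and the function names are hypothetical; the paper's exact estimator settings are not given in this summary.

```python
import numpy as np

def knn_log_density(queries, train, k=5):
    """k-NN log-density estimate, up to an additive constant:
    log p(x) ~ -d * log r_k(x), with r_k the k-th nearest training distance."""
    d = train.shape[1]
    # Pairwise Euclidean distances, shape (n_queries, n_train)
    dists = np.linalg.norm(queries[:, None, :] - train[None, :, :], axis=2)
    r_k = np.sort(dists, axis=1)[:, k - 1]
    return -d * np.log(r_k)

def spearman_rho(a, b):
    """Spearman rank correlation as the Pearson correlation of ranks.
    Assumes no ties in a or b."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return np.corrcoef(rank_a, rank_b)[0, 1]

# Toy usage with synthetic stand-ins (not the paper's data).
rng = np.random.default_rng(0)
train = rng.normal(size=(500, 2))            # stand-in training set
samples = rng.normal(size=(200, 2)) * 1.5    # stand-in generated samples
log_density = knn_log_density(samples, train, k=5)
# With per-sample KPE values in an array `kpe`, the relation would be
# checked via spearman_rho(kpe, log_density).
```

A negative `spearman_rho` between per-sample KPE and `log_density` would correspond to the inverse relation reported in Figure 5.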

Theorems & Definitions (22)

  • Lemma 4.1: Score-Based Velocity Decomposition
  • Theorem 4.2: Energy-Density Relation
  • Remark 4.3: Explicit Constants and Integrated Form
  • Lemma 5.1: Terminal energy blow-up, informal version
  • Proposition 5.2: Extreme kinetic energy, informal version
  • Theorem 2.1: Instantaneous Energy vs. Mixture Density
  • Lemma 2.2: Closed-form optimal velocity
  • proof
  • Lemma 2.3: Local Gaussian approximation with quantitative constants
  • proof
  • ...and 12 more