Normalizing flow neural networks by JKO scheme

Chen Xu, Xiuyuan Cheng, Yao Xie

TL;DR

This work tackles the memory- and computation-heavy training of continuous normalizing flows by proposing JKO-iFlow, an invertible neural ODE flow that unfolds the Wasserstein gradient flow using the JKO scheme. Each residual block implements a JKO step, enabling block-wise training and avoiding SDE sampling or score matching, while an adaptive time reparameterization and trajectory refinement balance computational cost with accuracy. The method yields competitive likelihoods and samples on synthetic and real data, including high-dimensional tabular and image-like datasets, with significantly reduced memory footprints. By establishing invertible block-wise updates tied to the Fokker–Planck equation, JKO-iFlow offers a scalable, likelihood-based alternative to diffusion models and traditional CNFs, with promising directions for conditional generation and integration with varied backbone architectures.
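
For orientation, the JKO step that each block is said to implement can be written in its standard textbook form. This is a sketch, not a quotation from the paper: the step size $h$, the energy functional $\mathcal{F}$, and the equilibrium density $q$ (assumed here to be the standard normal) are notation introduced for illustration.

```latex
% One JKO step: a proximal (implicit Euler) update in Wasserstein-2 space.
p_{k+1} \;=\; \arg\min_{p} \; \mathcal{F}(p) \;+\; \frac{1}{2h}\, W_2^2(p_k, p),
\qquad
\mathcal{F}(p) \;=\; \mathrm{KL}(p \,\|\, q).
```

With this choice of $\mathcal{F}$, letting $h \to 0$ recovers the Fokker–Planck equation whose equilibrium is $q$, which is the link to the Wasserstein gradient flow mentioned above.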

Abstract

Normalizing flow is a class of deep generative models for efficient sampling and likelihood estimation, which achieves attractive performance, particularly in high dimensions. The flow is often implemented using a sequence of invertible residual blocks. Existing works adopt special network architectures and regularization of flow trajectories. In this paper, we develop a neural ODE flow network called JKO-iFlow, inspired by the Jordan-Kinderlehrer-Otto (JKO) scheme, which unfolds the discrete-time dynamics of the Wasserstein gradient flow. The proposed method stacks residual blocks one after another, allowing efficient block-wise training of the residual blocks, avoiding sampling SDE trajectories and score matching or variational learning, thus reducing the memory load and difficulty in end-to-end training. We also develop adaptive time reparameterization of the flow network with a progressive refinement of the induced trajectory in probability space to improve the model accuracy further. Experiments with synthetic and real data show that the proposed JKO-iFlow network achieves competitive performance compared with existing flow and diffusion models at a significantly reduced computational and memory cost.
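
The block-wise idea can be illustrated with a toy case that has a closed form. The sketch below is not the paper's implementation: it assumes 1D Gaussian data $N(0, s^2)$, a standard normal target, and blocks restricted to linear maps $T(x) = a x$ in place of neural ODE blocks. The names `jko_linear_step` and `train_blockwise` are hypothetical. Each block is fitted once, given only the output of the blocks before it, by minimizing the KL term plus the scaled squared transport cost.

```python
import math


def jko_linear_step(s, h):
    """One JKO step from N(0, s^2) toward N(0, 1), restricted to T(x) = a * x.

    Minimizes over a > 0 the objective
        KL(N(0, a^2 s^2) || N(0, 1)) + (1 / (2h)) * E|x - a x|^2
      = 0.5 * (a^2 s^2 - 1 - log(a^2 s^2)) + (1 - a)^2 * s^2 / (2h).
    Setting the derivative to zero and multiplying by a gives the quadratic
        s^2 (1 + 1/h) a^2 - (s^2 / h) a - 1 = 0,
    whose positive root is the unique minimizer (the objective is convex in a).
    """
    s2 = s * s
    quad = s2 * (1.0 + 1.0 / h)   # coefficient of a^2
    lin = -s2 / h                 # coefficient of a
    return (-lin + math.sqrt(lin * lin + 4.0 * quad)) / (2.0 * quad)


def train_blockwise(s0, h, num_blocks):
    """Fit blocks one at a time; earlier blocks stay frozen (toy assumption)."""
    scales, stds = [], [s0]
    s = s0
    for _ in range(num_blocks):
        a = jko_linear_step(s, h)  # "train" the next block on the current density
        scales.append(a)
        s = a * s                  # push the density through the new block
        stds.append(s)
    return scales, stds


scales, stds = train_blockwise(s0=2.0, h=0.5, num_blocks=60)

# The composed flow is invertible because every block scale is positive.
total_scale = math.prod(scales)
x = 1.7
z = total_scale * x            # forward pass through all blocks
x_back = z / total_scale       # inverse pass
```

Only the current block's parameter is optimized at each stage, so nothing from earlier blocks needs to be held for backpropagation. That is the memory argument made in the abstract, reduced here to a scalar per block.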

Paper Structure

This paper contains 35 sections, 2 theorems, 31 equations, 34 figures, 6 tables, 1 algorithm.

Key Result

Proposition 3.1

Given $p_{k}$, up to a constant $c$ independent of ${\bf f}(x, t)$ on $t \in I_{k+1}$,

Figures (34)

  • Figure 1: JKO-iFlow
  • Figure 2: usual CNF
  • Figure 4: Diagram illustrating trajectory reparameterization and refinement. Top: the original trajectory under three blocks via Algorithm 1. Bottom: the trajectory under six blocks after reparameterization and refinement, which renders the $W_2$ movements more even.
  • Figure 5: True data; JKO-iFlow ($\tau$: 2.79e-4, MMD-c: 2.73e-4, NLL: 2.64)
  • Figure 6: FFJORD (MMD-c: 3.88e-4, NLL: 2.95)
  • ...and 29 more figures

Theorems & Definitions (4)

  • Proposition 3.1
  • Lemma A.1
  • Proof of Lemma A.1
  • Proof of Proposition 3.1