Parameterized Wasserstein Gradient Flow

Yijie Jin, Shu Liu, Hao Wu, Xiaojing Ye, Haomin Zhou

TL;DR

This work introduces Parameterized Wasserstein Gradient Flow (PWGF), a scalable framework for solving high-dimensional Wasserstein gradient flows by parameterizing push-forward maps with neural networks and evolving a finite-dimensional parameter set. Central to PWGF is a new pullback Wasserstein metric that defines a tractable gradient flow on the parameter space: $\dot{\theta} = -\widehat{G}(\theta)^{\dagger} \nabla_{\theta} F(\theta)$, where $F(\theta)=\mathcal{F}(\rho_\theta)$ and $\rho_\theta = T_\theta\#\rho_{\text{ref}}$. The paper provides rigorous Wasserstein-distance error bounds between the PWGF approximation and the true WGF solution, develops a forward-Euler numerical scheme with a MINRES solver for the linear system, and demonstrates accuracy and computational efficiency on the Fokker–Planck equation, porous medium equation, and an aggregation model. The results show that the approach yields accurate density evolution and fast sampling in high dimensions, without spatial discretization or expensive nonconvex optimization, making it a practical tool for high-dimensional density evolution and related statistical applications.

Abstract

We develop a fast and scalable numerical approach to solve Wasserstein gradient flows (WGFs), particularly suitable for high-dimensional cases. Our approach is to use general reduced-order models, like deep neural networks, to parameterize the push-forward maps such that they can push a simple reference density to the one solving the given WGF. The new dynamical system is called parameterized WGF (PWGF), and it is defined on the finite-dimensional parameter space equipped with a pullback Wasserstein metric. Our numerical scheme can approximate the solutions of WGFs for general energy functionals effectively, without requiring spatial discretization or nonconvex optimization procedures, thus avoiding some limitations of classical numerical methods and more recent deep-learning-based approaches. A comprehensive analysis of the approximation errors measured by Wasserstein distance is also provided in this work. Numerical experiments show promising computational efficiency and verified accuracy on various WGF examples using our approach.
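
The parameter-space flow $\dot{\theta} = -\widehat{G}(\theta)^{\dagger} \nabla_{\theta} F(\theta)$ and its forward-Euler discretization can be illustrated on a toy problem where everything is available in closed form. The sketch below is not the paper's implementation and rests on several assumptions: the push-forward map is a one-dimensional affine map $T_\theta(z) = \mu + \sigma z$ rather than a neural network; the reference density is a standard Gaussian; the energy is the Fokker–Planck free energy with potential $V(x) = x^2/2$ and diffusion coefficient $D = 1$; the metric is taken to be $\widehat{G}(\theta) = \mathbb{E}_z[\partial_\theta T_\theta(z)^\top \partial_\theta T_\theta(z)]$, estimated by Monte Carlo; and a direct $2 \times 2$ solve stands in for MINRES. All function names are hypothetical.

```python
import math
import random

D = 1.0  # diffusion coefficient (assumed value for this toy problem)


def grad_F(theta):
    """Gradient of F(theta) = (mu^2 + sigma^2)/2 - D*log(sigma) + const.

    This is the Fokker-Planck free energy of N(mu, sigma^2) for V(x) = x^2/2.
    """
    mu, sigma = theta
    return [mu, sigma - D / sigma]


def metric(samples):
    """Monte Carlo estimate of the assumed metric E[J^T J], J = dT/dtheta = (1, z)."""
    n = len(samples)
    g12 = sum(samples) / n
    g22 = sum(z * z for z in samples) / n
    return [[1.0, g12], [g12, g22]]


def solve_2x2(G, b):
    """Direct solve of G v = b (stand-in for MINRES on this tiny system)."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [
        (G[1][1] * b[0] - G[0][1] * b[1]) / det,
        (G[0][0] * b[1] - G[1][0] * b[0]) / det,
    ]


def pwgf_forward_euler(theta0, h, n_steps, n_samples=2000, seed=0):
    """Forward-Euler steps theta <- theta - h * G^{-1} grad F(theta)."""
    rng = random.Random(seed)
    theta = list(theta0)
    for _ in range(n_steps):
        # Fresh reference samples at every step (an assumption of this sketch).
        samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        v = solve_2x2(metric(samples), grad_F(theta))
        theta = [theta[0] - h * v[0], theta[1] - h * v[1]]
    return theta


def exact_solution(theta0, t):
    """Mean and standard deviation of the exact Gaussian Fokker-Planck solution."""
    mu0, sigma0 = theta0
    var = D + (sigma0 ** 2 - D) * math.exp(-2.0 * t)
    return [mu0 * math.exp(-t), math.sqrt(var)]


if __name__ == "__main__":
    theta0 = [2.0, 0.5]
    print("PWGF (forward Euler):", pwgf_forward_euler(theta0, h=0.01, n_steps=100))
    print("Exact at t = 1:      ", exact_solution(theta0, 1.0))
```

In this affine Gaussian setting the exact metric is the identity, so the parameter flow reproduces the true mean and variance dynamics and the only errors come from the time step and the Monte Carlo estimate of the metric. With a neural-network map neither property holds in general, which is where the paper's error analysis and iterative linear solver become relevant.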

Paper Structure

This paper contains 15 sections, 4 theorems, 64 equations, 8 figures, and 1 algorithm.

Key Result

Theorem 3.2

Suppose the potential $V$ can be decomposed as $V = U + \phi$, where $\nabla^2 U \succeq K I$ for some $K > 0$ and $\phi \in L^{\infty}$. Denote $\operatorname{osc}(\phi) := \sup \phi - \inf \phi < \infty$. Then the following logarithmic Sobolev inequality holds for any $\rho \in \mathcal{P}(M)$. Assume $\rho$ solves the Fokker-Planck equation with initial value $\rho(0, \cdot) = \rho_0$; then

Figures (8)

  • Figure 1: Sample plots of the computed $\rho_{\theta}$ at different times $t$ for the Fokker-Planck equation with the Styblinski-Tang function as the potential $V(x)$.
  • Figure 2: Decay of the KL divergence along the PWGF solution for the Fokker-Planck equation with the Styblinski-Tang potential in the $30$-dimensional example.
  • Figure 3: Sample plots of the computed $\rho_{\theta}$ at different times $t$ for the porous medium equation with a Dirac delta function as the initial condition, for $d=2$ and $l=2.4$. The figures are plotted with $5000$ samples. In the level curves, darker colors correspond to smaller values to emphasize the support.
  • Figure 4: Sample plots of the computed $\rho_{\theta}$ at different times $t$ for the porous medium equation with a Dirac delta function as the initial condition, for $d=5$ and $l=3$. The figures are plotted with $5000$ samples. In the level curves, darker colors correspond to smaller values to emphasize the support.
  • Figure 5: Sample plots of the computed $\rho_{\theta}$ at different times $t$ for the porous medium equation with a Dirac delta function as the initial condition, for $d=15$. The figures are plotted with $5000$ samples. In the level curves, darker colors correspond to smaller values to emphasize the support.
  • ...and 3 more figures

Theorems & Definitions (10)

  • Example 3.1: Fokker-Planck equation
  • Theorem 3.2: Holley–Stroock perturbation (Holley and Stroock, 1987)
  • Example 3.3: Porous medium equation
  • Example 3.4: Aggregation model
  • Proposition 3.5
  • Definition 3.6
  • Theorem 3.7
  • Proof 1
  • Theorem 3.8
  • Proof 2