Interaction-Force Transport Gradient Flows

Egor Gladin, Pavel Dvurechensky, Alexander Mielke, Jia-Jie Zhu

TL;DR

The spherical IFT gradient flow is proved to enjoy the best of both worlds: it converges globally at an exponential rate for both the MMD and the KL energy.

Abstract

This paper presents a new gradient flow dissipation geometry over non-negative and probability measures. It is motivated by a principled construction that combines unbalanced optimal transport with interaction forces modeled by reproducing kernels. Using a precise connection between the Hellinger geometry and the maximum mean discrepancy (MMD), we propose the interaction-force transport (IFT) gradient flows and their spherical variant via an infimal convolution of the Wasserstein and spherical MMD tensors. We then develop a particle-based optimization algorithm based on the JKO-splitting scheme of the mass-preserving spherical IFT gradient flows. Finally, we provide both theoretical global exponential convergence guarantees and improved empirical simulation results for applying the IFT gradient flows to the sampling task of MMD minimization. Furthermore, we prove that the spherical IFT gradient flow enjoys the best of both worlds by providing global exponential convergence guarantees for both the MMD and KL energies.
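
As a minimal illustration of the MMD energy that the abstract refers to, the sketch below computes the squared MMD between two weighted particle sets. The Gaussian kernel, the bandwidth `sigma`, and the function names `gaussian_kernel` and `mmd_squared` are assumptions made for this example; they are not taken from the paper, and this is not the paper's algorithm.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2)) (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(xs, ws, ys, vs, sigma=1.0):
    """Squared MMD between the weighted empirical measures
    mu = sum_i ws[i] * delta_{xs[i]} and nu = sum_j vs[j] * delta_{ys[j]}."""

    def cross(ps, a, qs, b):
        # Double sum of a_i * b_j * k(p_i, q_j) over all particle pairs.
        return sum(
            ai * bj * gaussian_kernel(p, q, sigma)
            for p, ai in zip(ps, a)
            for q, bj in zip(qs, b)
        )

    return cross(xs, ws, xs, ws) - 2.0 * cross(xs, ws, ys, vs) + cross(ys, vs, ys, vs)


# Hypothetical data: two particles per measure, uniform weights.
particles = [(0.0, 0.0), (1.0, 0.0)]
targets = [(2.0, 1.0), (3.0, 1.0)]
uniform = [0.5, 0.5]
energy = mmd_squared(particles, uniform, targets, uniform)
```

Keeping the weights as explicit arguments lets the same function handle particles whose mass changes over time, which is the setting the figure captions describe (weighted and vanishing particles).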

Paper Structure (13 sections, 6 theorems, 48 equations, 6 figures, 1 algorithm)

Key Result

Corollary 3.1

Suppose $\int k_\sigma(x,\cdot )\ \mathrm{d} \mu =1$ and the kernel-weighted measure converges to the Dirac measure, $k_\sigma(x, \cdot)\ \mathrm{d} \mu \to\ \mathrm{d} \delta_x$, as the bandwidth $\sigma\to 0$. Then, the $\mathrm{IFT}$ gradient flow equation (eq:ikw-gfe-unreg) tends towards the WFR gra
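
The hypothesis of the corollary can be checked numerically in a simple case. The sketch below assumes, purely for illustration, that $\mu$ is the Lebesgue measure on the real line and that $k_\sigma$ is the normalized Gaussian density, so that $\int k_\sigma(x,\cdot)\ \mathrm{d}\mu = 1$. It integrates a test function against $k_\sigma(x,\cdot)\ \mathrm{d}\mu$ and compares the result with the point value at $x$. The names `normalized_gaussian` and `kernel_smooth`, the test function, and the quadrature settings are hypothetical choices, not part of the paper.

```python
import math


def normalized_gaussian(x, y, sigma):
    """Gaussian density in y centered at x; integrates to 1 over the real line."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def kernel_smooth(f, x, sigma, half_width=8.0, n=2000):
    """Trapezoidal approximation of the integral of k_sigma(x, y) * f(y) dy
    over [x - half_width * sigma, x + half_width * sigma]."""
    lo = x - half_width * sigma
    h = 2.0 * half_width * sigma / n
    total = 0.0
    for i in range(n + 1):
        y = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0  # endpoints get half weight
        total += weight * normalized_gaussian(x, y, sigma) * f(y)
    return total * h


# Gap between the smoothed value and the point value f(0) for shrinking bandwidths.
errors = [abs(kernel_smooth(math.cos, 0.0, s) - math.cos(0.0)) for s in (1.0, 0.5, 0.1)]
```

For this test function the gap shrinks as the bandwidth decreases, which is the behavior the Dirac-limit assumption describes.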

Figures (6)

  • Figure 1: (Left) Wasserstein flow of the MMD energy (Arbel et al., 2019). Some particles get stuck at points away from the target. (Right) $\mathrm{IFT}$ gradient flow (this paper) of the MMD energy. Particle mass is teleported close to the target, avoiding local minima. Hollow circles indicate particles with zero mass. The red dots are the initial particles, and the green dots are the target distribution. See the numerical example section for more details.
  • Figure 2: Illustration of the $\mathrm{IFT}$ gradient flow. Atoms are subject to both the transport (Kantorovich) potential and the interaction (repulsive) force from other atoms.
  • Figure 3: Mean loss and standard deviation computed over 50 runs.
  • Figure 4: Trajectory of a randomly selected subsample produced by different algorithms in the Gaussian target experiment. Color intensity indicates points' weights. The hollow dots indicate the particles that have already vanished.
  • Figure 5: Trajectory of a randomly selected subsample produced by different algorithms in the Gaussian mixture experiment. Color intensity indicates points' weights. The hollow dots indicate the particles that have already vanished.
  • ...and 1 more figure

Theorems & Definitions (13)

  • Corollary 3.1
  • Proposition 3.2: Spherical MMD and spherical $\mathrm{IFT}$ gradient flow equations
  • Theorem 3.3
  • Theorem 3.4
  • Theorem 3.5: Global exponential convergence of the $\mathrm{IFT}$ flow of the MMD energy
  • Proposition 3.6: Exponential convergence of the S$\mathrm{IFT}$ gradient flow of the KL divergence energy
  • Proof of Proposition (prop:spherical-ours-gfe)
  • Proof of Theorem (prop:loj-ours)
  • Proof of Corollary (cor:kernel-approx-ours)
  • Proof of Theorem (thm:spherical-MMD-MMD-gfe)
  • ...and 3 more