Families of costs with zero and nonnegative MTW tensor in optimal transport and the c-divergences

Du Nguyen

TL;DR

The work develops a bridge between information geometry and optimal transport by studying $\mathsf{c}$-divergences from costs of the form $\mathsf{c}(x,\bar{x})=\mathsf{u}(x^{\mathfrak{t}}\bar{x})$, deriving an explicit MTW cross-curvature formula under the Kim–McCann metric and showing the zero-MTW case reduces to a linear ODE with Lambert/inverse-hyperbolic solutions. It extends the analysis to the sphere and hyperboloid models via Gauss–Codazzi, obtaining new families of strictly regular costs and a structured $\mathsf{c}$-divergence geometry, including dualistic connections and $\mathsf{c}$-Crouzeix identities. The paper then develops practical applications, notably a hyperbolic mirror sampling approach for the multivariate $t$-distribution and a local divergences framework on probability simplices and latent spaces, illustrating how non-classical costs can enhance sampling and representation in high dimensions. Overall, the results enrich both OT regularity theory and information-geometric divergences, with potential impact on hyperbolic embeddings, sampling algorithms, and latent-space regularization in machine learning.

Abstract

We study the information geometry of $\mathsf{c}$-divergences from families of costs of the form $\mathsf{c}(x, \bar{x}) =\mathsf{u}(x^{\mathfrak{t}}\bar{x})$ through the optimal transport point of view. Here, $\mathsf{u}$ is a scalar function with inverse $\mathsf{s}$, and $x^{\mathfrak{t}}\bar{x}$ is a nondegenerate bilinear pairing of vectors $x, \bar{x}$ belonging to an open subset of $\mathbb{R}^n$. We compute explicitly the MTW tensor (or cross curvature) for the optimal transport problem on $\mathbb{R}^n$ with this cost. The condition that the MTW tensor vanishes on null vectors under the Kim-McCann metric is a fourth-order nonlinear ODE, which can be reduced to a linear ODE of the form $\mathsf{s}^{(2)} - S\mathsf{s}^{(1)} + P\mathsf{s} = 0$ with constant coefficients $P$ and $S$. The resulting inverse functions include Lambert and generalized inverse hyperbolic/trigonometric functions. The squared Euclidean and $\log$-type costs are equivalent to instances of these solutions. The optimal map may be written explicitly in terms of the potential function. For cost functions of a similar form on a hyperboloid model of the hyperbolic space and on the unit sphere, we also express this tensor in terms of algebraic expressions in derivatives of $\mathsf{s}$ using the Gauss-Codazzi equation, obtaining new families of strictly regular costs for these manifolds, including new families of power function costs. We express the divergence geometry of the $\mathsf{c}$-divergence in terms of the Kim-McCann metric, including a $\mathsf{c}$-Crouzeix identity and a formula for the primal connection. We analyze the $\sinh$-type hyperbolic cost, providing examples of $\mathsf{c}$-convex functions, which are used to construct a new local form of the $\alpha$-divergences on probability simplices. We apply the optimal maps to sample the multivariate $t$-distribution.
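To make the linear ODE $\mathsf{s}^{(2)} - S\mathsf{s}^{(1)} + P\mathsf{s} = 0$ concrete, here is a minimal Python sketch of its closed-form solutions via the characteristic equation $\lambda^2 - S\lambda + P = 0$, whose roots have sum $S$ and product $P$. The function names, the initial conditions `s0` and `ds0`, and the finite-difference check are illustrative assumptions, not taken from the paper.

```python
import cmath
import math


def characteristic_roots(S, P):
    """Roots of lambda^2 - S*lambda + P = 0 (their sum is S, their product is P)."""
    disc = cmath.sqrt(S * S - 4.0 * P)
    return (S + disc) / 2.0, (S - disc) / 2.0


def solve_s(S, P, s0, ds0):
    """Return a callable s(u) with s'' - S s' + P s = 0, s(0) = s0, s'(0) = ds0.

    Hypothetical helper: the initial conditions are chosen for illustration only.
    """
    l1, l2 = characteristic_roots(S, P)
    if abs(l1 - l2) < 1e-12:
        # Repeated root: s(u) = (a + b u) exp(l1 u)
        a, b = s0, ds0 - l1 * s0
        return lambda u: ((a + b * u) * cmath.exp(l1 * u)).real
    # Distinct (real or complex-conjugate) roots: s(u) = a exp(l1 u) + b exp(l2 u)
    a = (ds0 - l2 * s0) / (l1 - l2)
    b = s0 - a
    return lambda u: (a * cmath.exp(l1 * u) + b * cmath.exp(l2 * u)).real


def residual(s, S, P, u, h=1e-4):
    """Central finite-difference estimate of s'' - S s' + P s at u."""
    d1 = (s(u + h) - s(u - h)) / (2.0 * h)
    d2 = (s(u + h) - 2.0 * s(u) + s(u - h)) / (h * h)
    return d2 - S * d1 + P * s(u)
```

With `s0 = 0` and `ds0 = 1`, the choice `S = 0, P = -1` gives $\sinh u$, `S = 0, P = 1` gives $\sin u$, and the repeated-root case `S = 2, P = 1` gives $u e^{u}$, whose inverse is the Lambert $W$ function. That this matches how the Lambert and inverse hyperbolic/trigonometric families arise in the paper is an inference from the abstract, not a quoted result.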

Paper Structure

This paper contains 19 sections, 16 theorems, 135 equations, 2 figures, 3 tables.

Key Result

Theorem 1

For $\mathrm{q}=(x, \bar{x})\in \mathcal{N}$ and for a vector $\mathring{\omega}=(\omega, \bar{\omega})\in \mathbb{R}^n\times\mathbb{R}^n$, set $u = \mathsf{u}(x^{\mathfrak{t}}\bar{x})$ and $s_i = \mathsf{s}_i(u)$. Then the cross-curvature (MTW-tensor) of the cost $\mathsf{c}: (x, \bar{x}) \mapsto \mathsf{u}(x^{\mathfrak{t}}\bar{x})$ ... Thus, if $\mathring{\omega}$ is a null vector ($\langle \mathring{\omega}, \mathring{\omega}\rangle = 0$) ...

Figures (2)

  • Figure 1: Optimal map and $\mathsf{c}$-conjugate functions under the hyperbolic cost $-\frac{1}{r}\operatorname{arcsinh}(rx^{\mathsf{T}}\bar{x})$. Top: hyperbolic conjugates for different $r$ together with the Legendre conjugate. Bottom: conjugates and optimal map for different $r$. The ranges of $\bar{x}$ correspond to the portion of the $x$-grid inside the domain of $\mathbf{T}$.
  • Figure 2: Transporting $t$-distributions using hyperbolic costs and quadratic potentials $\phi(x)=x^2$, $|x| < r^{-\frac{1}{2}}$. The graphs of $e^{-W(y)}$ are on the left and those of $e^{-V_{adj}(x)}$ are on the right. For the smaller $r=0.5$, the transported distribution is close to a uniform density, while for $r=1$ it has two peaks near the boundary points. For $\nu=10$, the densities at the two ends of $[-10,10]$ map close to zero in the transported distribution on $[-r^{-\frac{1}{2}}, r^{-\frac{1}{2}}]$, while for the fatter-tailed $\nu=3$, even the points $\pm 50$ map further from zero in the transported distribution.
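
The hyperbolic cost in Figure 1 can be evaluated directly. The following minimal Python sketch assumes the pairing is the ordinary Euclidean dot product and uses hypothetical function names. It computes the cost and its scalar inverse $\mathsf{s}(u) = -\frac{1}{r}\sinh(ru)$, and provides a numerical check that $\mathsf{s}$ satisfies $\mathsf{s}^{(2)} - r^2\mathsf{s} = 0$. Reading this as the linear ODE of the abstract with $S = 0$ and $P = -r^2$ is our own computation, not a value quoted from the paper.

```python
import math


def hyperbolic_cost(x, xbar, r):
    """c(x, xbar) = -(1/r) * arcsinh(r * <x, xbar>), with the Euclidean dot product."""
    t = sum(a * b for a, b in zip(x, xbar))
    return -math.asinh(r * t) / r


def s_inverse(u, r):
    """Inverse of u(t) = -(1/r) * arcsinh(r t), namely s(u) = -(1/r) * sinh(r u)."""
    return -math.sinh(r * u) / r


def ode_residual(u, r, h=1e-4):
    """Central finite-difference estimate of s''(u) - r^2 * s(u)."""
    d2 = (s_inverse(u + h, r) - 2.0 * s_inverse(u, r) + s_inverse(u - h, r)) / (h * h)
    return d2 - r * r * s_inverse(u, r)
```

Applying `s_inverse` to the cost recovers the pairing $x^{\mathsf{T}}\bar{x}$, and as $r \to 0$ the cost tends to the bilinear cost $-x^{\mathsf{T}}\bar{x}$, which is consistent with the Legendre conjugate being plotted alongside the hyperbolic conjugates.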

Theorems & Definitions (30)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Proposition 5
  • Proposition 6
  • proof
  • proof
  • proof
  • Proposition 7
  • ...and 20 more