On the Existence of Optimal Transport Gradient for Learning Generative Models
Antoine Houdard, Arthur Leclaire, Nicolas Papadakis, Julien Rabin
TL;DR
The paper examines whether the gradient of the optimal transport cost with respect to the parameters of a generative model exists, and shows that the standard envelope-based gradient can fail in unregularized OT frameworks. It shows that entropic regularization restores differentiability, provides an explicit gradient expression via the $c,\lambda$-transform and Kantorovich potentials, and proves that $W_c^\lambda(\theta)$ is $C^1$ under mild conditions. To make the approach practical, it specializes to a semi-discrete setting with discrete data and derives a tractable algorithm that updates the generator parameters using a stochastic gradient computed from $\psi^{c,\lambda}$ and data samples. Numerical experiments on synthetic examples and MNIST illustrate the method’s stability and its capacity to learn complex generative mappings, with a trade-off between smoothing and fidelity controlled by the regularization parameter $\lambda$.
Abstract
The use of optimal transport cost for learning generative models has become popular with Wasserstein Generative Adversarial Networks (WGAN). Training of WGAN relies on a theoretical background: the calculation of the gradient of the optimal transport cost with respect to the generative model parameters. We first demonstrate that such a gradient may not be defined, which can result in numerical instabilities during gradient-based optimization. We address this issue by stating a valid differentiation theorem in the case of entropy-regularized transport and specify conditions under which existence is ensured. By exploiting the discrete nature of empirical data, we formulate the gradient in a semi-discrete setting and propose an algorithm for the optimization of the generative model parameters. Finally, we illustrate numerically the advantage of the proposed framework.
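To make the semi-discrete setting concrete, here is a minimal sketch of a smoothed transform $\psi^{c,\lambda}$ and its gradient with respect to a generated point. It is not the paper's implementation. It assumes a quadratic cost $c(x, y) = \tfrac{1}{2}\|x - y\|^2$, data points `y` with weights `nu`, and the common log-sum-exp form $\psi^{c,\lambda}(x) = -\lambda \log \sum_j \nu_j \exp\big((\psi_j - c(x, y_j))/\lambda\big)$. The paper's sign and normalization conventions may differ, and all function and variable names are hypothetical.

```python
import numpy as np

def _log_weights(x, y, psi, nu, lam):
    # Assumed cost (illustrative choice): c(x, y_j) = 0.5 * ||x - y_j||^2.
    cost = 0.5 * np.sum((x[None, :] - y) ** 2, axis=1)
    return (psi - cost) / lam + np.log(nu)

def smoothed_c_transform(x, y, psi, nu, lam):
    """Soft minimum over data points, written as a stable log-sum-exp."""
    a = _log_weights(x, y, psi, nu, lam)
    m = a.max()  # shift to avoid overflow in exp
    return -lam * (m + np.log(np.sum(np.exp(a - m))))

def grad_smoothed_c_transform(x, y, psi, nu, lam):
    """Gradient in x: softmax-weighted average of grad_x c(x, y_j) = x - y_j."""
    a = _log_weights(x, y, psi, nu, lam)
    w = np.exp(a - a.max())
    w /= w.sum()       # softmax weights over the data points
    return x - w @ y   # equals sum_j w_j * (x - y_j) because the weights sum to 1

# Toy data: 5 data points in the plane with uniform weights (all values arbitrary).
rng = np.random.default_rng(0)
y = rng.normal(size=(5, 2))
psi = rng.normal(size=5)
nu = np.full(5, 0.2)
x = rng.normal(size=2)
lam = 0.1

g = grad_smoothed_c_transform(x, y, psi, nu, lam)
```

Because the softmax weights vary smoothly with `x` for any `lam > 0`, this gradient is defined everywhere, whereas the hard minimum obtained as `lam` tends to 0 is not differentiable where two data points tie. In a training loop, one would presumably back-propagate a quantity of this kind through the generator at sampled latent codes to obtain a stochastic gradient for the parameters; the exact statement and conditions are those given in the paper.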
