Nash Equilibria, Regularization and Computation in Optimal Transport-Based Distributionally Robust Optimization

Soroosh Shafiee, Liviu Aolaritei, Florian Dörfler, Daniel Kuhn

TL;DR

The paper studies distributionally robust optimization (DRO) with optimal transport-based ambiguity sets through the lens of Nash equilibria. It shows that robustification induces both higher-order variation and Lipschitz regularization even when the transportation cost is not (a power of) a metric. It establishes conditions for the existence and computability of Nash equilibria between the decision-maker and nature, and shows that under discrete reference distributions the dual problem often reduces to a finite convex program from which least favorable distributions can be constructed. By connecting the c-transform to the classical Pasch-Hausdorff and Moreau envelopes, it provides dual perspectives and algorithmic paths for solving nonconvex DRO problems with gradient-based methods. Numerical experiments on distributionally robust support vector machines and distributionally robust portfolio optimization illustrate deceptive adversarial samples and the out-of-sample performance of the robust decisions. Overall, the work unifies regularization and robustification in OT-based DRO.
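
For orientation, the envelope connection can be written out as follows. This is a sketch based on the standard textbook definitions of the two envelopes, not the paper's exact statements; the norm $\|\cdot\|$, the parameter $\mu$, and the restriction to $\lambda > 0$ are assumptions made here, and $\ell_c(\theta, \lambda, \hat{z}) = \sup_{z} \ell(\theta, z) - \lambda c(z, \hat{z})$ denotes the c-transform.

$$
\begin{aligned}
c(z,\hat{z}) = \|z - \hat{z}\| &: \quad \ell_c(\theta, \lambda, \hat{z}) = -\inf_{z} \big\{ -\ell(\theta, z) + \lambda \|z - \hat{z}\| \big\}, \\
c(z,\hat{z}) = \|z - \hat{z}\|_2^2 &: \quad \ell_c(\theta, \lambda, \hat{z}) = -\inf_{z} \big\{ -\ell(\theta, z) + \tfrac{1}{2\mu} \|z - \hat{z}\|_2^2 \big\}, \qquad \mu = \tfrac{1}{2\lambda}.
\end{aligned}
$$

The first infimum is the Pasch-Hausdorff envelope of $-\ell(\theta, \cdot)$ with parameter $\lambda$, that is, its largest $\lambda$-Lipschitz minorant whenever it is finite. The second infimum is the Moreau envelope of $-\ell(\theta, \cdot)$ with parameter $\mu$.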

Abstract

We study optimal transport-based distributionally robust optimization problems where a fictitious adversary, often envisioned as nature, can choose the distribution of the uncertain problem parameters by reshaping a prescribed reference distribution at a finite transportation cost. In this framework, we show that robustification is intimately related to various forms of variation and Lipschitz regularization even if the transportation cost function fails to be (some power of) a metric. We also derive conditions for the existence and the computability of a Nash equilibrium between the decision-maker and nature, and we demonstrate numerically that nature's Nash strategy can be viewed as a distribution that is supported on remarkably deceptive adversarial samples. Finally, we identify practically relevant classes of optimal transport-based distributionally robust optimization problems that can be addressed with efficient gradient descent algorithms even if the loss function or the transportation cost function are nonconvex (but not both at the same time).

Paper Structure

This paper contains 17 sections, 23 theorems, 109 equations, and 3 figures.

Key Result

Proposition 1

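If the continuity assumption holds, then for any $\theta \in \Theta$ and $\varepsilon > 0$ we have

$$\sup_{\mathbb{Q} \in \mathbb{B}_\varepsilon(\hat{\mathbb{P}})} \mathbb{E}_{Z \sim \mathbb{Q}}\big[\ell(\theta, Z)\big] = \inf_{\lambda \geq 0} \; \lambda \varepsilon + \mathbb{E}_{\hat{Z} \sim \hat{\mathbb{P}}}\big[\ell_{c}(\theta, \lambda, \hat{Z})\big],$$

where $\hat{\mathbb{P}}$ is the reference distribution, $\mathbb{B}_\varepsilon(\hat{\mathbb{P}})$ is the optimal transport ambiguity set of radius $\varepsilon$ around $\hat{\mathbb{P}}$, and $\ell_{c}(\theta, \lambda, \hat{z}) = \sup_{z \in {\mathcal{Z}}:\,c(z,\hat{z})<\infty} \; \ell(\theta, z) - \lambda c(z, \hat{z})$.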
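
As a rough illustration of how a dual of this kind can be evaluated for a discrete reference distribution, the sketch below assumes the standard strong-duality form $\inf_{\lambda \geq 0} \lambda \varepsilon + \frac{1}{N} \sum_{i} \ell_c(\lambda, \hat{z}_i)$ and a toy instance that is not taken from the paper: scalar $z$, loss $\ell(z) = z^2$ (no dependence on $\theta$), and cost $c(z, \hat{z}) = (z - \hat{z})^2$. For this instance the c-transform has the closed form $\lambda \hat{z}^2 / (\lambda - 1)$ for $\lambda > 1$. All function names, the sample values, and the golden-section search are illustrative choices.

```python
import math

def c_transform(lam, z_hat):
    """sup_z z**2 - lam * (z - z_hat)**2 in closed form (toy instance, lam > 1)."""
    if lam <= 1.0:
        raise ValueError("the closed form used here requires lam > 1")
    return lam * z_hat ** 2 / (lam - 1.0)

def dual_objective(lam, eps, samples):
    """lam * eps plus the sample average of the c-transform."""
    return lam * eps + sum(c_transform(lam, z) for z in samples) / len(samples)

def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def solve_dual(eps, samples, lam_max=1e3):
    """Return the minimizing multiplier and the dual optimal value."""
    lam = golden_section_min(lambda l: dual_objective(l, eps, samples),
                             1.0 + 1e-9, lam_max)
    return lam, dual_objective(lam, eps, samples)

def least_favorable_atoms(lam, samples):
    """Maximizers of the inner sup: each sample moves to lam * z_hat / (lam - 1)."""
    return [lam * z / (lam - 1.0) for z in samples]

# Hypothetical data: three equally weighted atoms and a transport budget eps.
samples = [1.0, -2.0, 3.0]
eps = 0.5

lam_star, worst_case = solve_dual(eps, samples)
atoms = least_favorable_atoms(lam_star, samples)
transport_cost = sum((z - zh) ** 2 for z, zh in zip(atoms, samples)) / len(samples)

# Closed-form check for this toy instance: (sqrt(second moment) + sqrt(eps))**2.
m2 = sum(z ** 2 for z in samples) / len(samples)
closed_form = (math.sqrt(m2) + math.sqrt(eps)) ** 2
print(worst_case, closed_form, transport_cost)
```

In this toy instance the perturbed atoms use up the entire transport budget, and the dual value matches the closed form, which is one way to sanity-check a dual solver before applying it to losses without a closed-form c-transform.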

Figures (3)

  • Figure 1: Different feature distributions in a distributionally robust support vector machine problem: Empirical distribution (left), a least favorable distribution obtained by perturbing a single sample in ${\mathcal{J}}_+$ (center), and the least favorable distribution obtained by perturbing all samples in ${\mathcal{J}}_+$ (right), using the 1-norm transportation cost (top), the 2-norm transportation cost (middle), and the $\infty$-norm transportation cost (bottom).
  • Figure 2: Comparison of worst-case and least favorable distributions for different values of $\varepsilon$. The number underneath each adversarial example indicates its probability mass as a percentage of $1/J$.
  • Figure 3: (Left) Average out-of-sample performance of the distributionally robust log-optimal portfolios as a function of $\varepsilon$. (Right) Distribution of the out-of-sample performance for $\varepsilon=0$ and $\varepsilon =10^{-2}$.

Theorems & Definitions (52)

  • Proposition 1: Strong duality
  • Example 1: Local Mahalanobis transportation cost
  • Example 2: Discrete metric
  • Lemma 1: Weak compactness of optimal transport ambiguity sets
  • proof
  • Lemma 2: Continuity properties of the expected loss
  • proof
  • Theorem 1: Minimax theorem
  • proof
  • Definition 1: Slater point
  • ...and 42 more