Decision-dependent distributionally robust standard quadratic optimization with Wasserstein ambiguity

Immanuel M. Bomze, Daniel de Vicente, Abdel Lisser, Heng Zhang

TL;DR

This paper studies distributionally robust StQPs under Wasserstein ambiguity and shows that they are equivalent to a suitably modified deterministic StQP instance.

Abstract

The standard quadratic optimization problem (StQP) consists of minimizing a quadratic form over the standard simplex. Without assuming convexity or concavity of the quadratic form, the StQP is NP-hard. This problem has many interesting applications, ranging from portfolio optimization to machine learning. Sometimes the data matrix is uncertain, but some information about its distribution can be inferred, e.g. a distance to a reference distribution (typically, the empirical distribution after sampling). In distributionally robust optimization, the goal is to hedge against the worst case over all distributions in an ambiguity set defined by the above-mentioned distance. In this paper we focus on distributionally robust StQPs under Wasserstein distance and show equivalence to a suitably modified deterministic instance of an StQP. This blends well with recent findings for other approaches to StQPs under uncertainty. We also address out-of-sample performance guarantees. Carefully designed experiments complement and illustrate the approach.
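To make the problem statement concrete, the following minimal sketch evaluates a small deterministic StQP, $\min \{ {\mathbf x}^\top Q {\mathbf x} : {\mathbf x} \ge 0, \sum_i x_i = 1 \}$, by brute-force search over a rational grid on the simplex. The matrix `Q`, the grid resolution `k`, and all function names are illustrative assumptions and do not come from the paper; the grid search is only a sanity check for tiny instances, not the authors' method.

```python
def quad_form(Q, x):
    """Return x^T Q x for a square matrix Q given as nested lists."""
    n = len(x)
    return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))


def simplex_grid(n, k):
    """Yield all points of the standard simplex in R^n whose
    coordinates are multiples of 1/k."""
    def compositions(remaining, dims):
        if dims == 1:
            yield (remaining,)
            return
        for first in range(remaining + 1):
            for rest in compositions(remaining - first, dims - 1):
                yield (first,) + rest

    for counts in compositions(k, n):
        yield tuple(c / k for c in counts)


def stqp_grid_min(Q, k=20):
    """Approximate the StQP minimum by exhaustive search on the grid."""
    best = min(simplex_grid(len(Q), k), key=lambda x: quad_form(Q, x))
    return best, quad_form(Q, best)


# Hypothetical indefinite matrix (its determinant is negative), so the
# quadratic form is neither convex nor concave.
Q = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 2.0],
     [2.0, 2.0, 3.0]]

x_best, value = stqp_grid_min(Q, k=20)
print(x_best, value)
```

For this particular `Q` the minimizer can be checked by hand: with $t = x_3$, the objective is bounded below by $0.5 + 3t - 0.5t^2$, which is smallest at $t = 0$, giving the point $(0.5, 0.5, 0)$ with value $0.5$. The grid grows combinatorially in the dimension, which is consistent with the NP-hardness noted above.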

Paper Structure (17 sections, 20 theorems, 120 equations, 2 figures)

Key Result

Lemma 2.2

Let $\left\lVert\cdot\right\rVert = \left\lVert\cdot\right\rVert_2$ be the Euclidean norm and $p=2$. Let ${\mathbf z} \in \mathbb R^m$ be an estimator of the true mean ${\bm \mu}_{\rm true}$ and set For the 2-Wasserstein distance between the Dirac distribution $\delta_{{\mathbf z}}$ and the empirical distribution $\widehat{\mathbb P}_N$ we have by the familiar formula $\operatorname{MSE} = \ope
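The lemma statement is cut off here, so the following is only a numerical sketch under a stated assumption: that the "familiar formula" is the standard decomposition of mean squared error into variance plus squared bias. Because the only coupling between a Dirac distribution and any other distribution is the product coupling, the squared 2-Wasserstein distance between $\delta_{{\mathbf z}}$ and $\widehat{\mathbb P}_N$ is the average squared Euclidean distance from ${\mathbf z}$ to the samples. The sample data, the point `z`, and the helper names are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def w2_squared_dirac_to_empirical(z, samples):
    """W_2^2 between the Dirac mass at z and the empirical distribution
    of the samples: the mean squared distance from z to the samples."""
    return sum(squared_distance(z, s) for s in samples) / len(samples)


def sample_mean(samples):
    """Coordinate-wise mean of the samples."""
    n = len(samples)
    return [sum(s[j] for s in samples) / n for j in range(len(samples[0]))]


def total_variance(samples):
    """Mean squared distance of the samples to their own mean (1/N scaling)."""
    mean = sample_mean(samples)
    return sum(squared_distance(s, mean) for s in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(50)]
z = [0.3, -0.2, 1.0]  # hypothetical estimator of the mean

lhs = w2_squared_dirac_to_empirical(z, samples)
rhs = total_variance(samples) + squared_distance(z, sample_mean(samples))
print(lhs, rhs)  # the two values agree up to floating-point rounding
```

The agreement follows because the cross term $\frac{2}{N}\sum_i (\xi_i - \bar{\xi})^\top(\bar{\xi} - {\mathbf z})$ vanishes, where $\xi_i$ denotes the samples and $\bar{\xi}$ their mean (notation introduced here for illustration).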

Figures (2)

  • Figure 1: Maximum clique solutions under varying noise levels $\beta$ and robustness radii $\theta$. Columns correspond to $\theta \in \{0.01, 0.6, 1.5\}$ and rows to $\beta \in \{0.01, 0.1, 0.8\}$. $W$ denotes the weighted size of the solution clique. (Example "solutions to decision-independent case under different beta and theta".)
  • Figure 2: Objective values, maximum clique weights, graph density, and running time for maximum clique solutions under different $\theta$. There are $10$ trials with $20$ samples each, on graphs with $15$ nodes, and $\beta = 0.001$. The range of $\theta$ is $[10^{-3}, 10]$. (a) Optimal objective value; (b) Maximum clique weight $W(S)$; (c) Graph density $\rho(G)$; (d) Solver runtime in seconds. (Example "Out-of-sample performance to decision-independent case under differential gamma".)
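Both figures concern maximum clique instances. As background, the classical link between cliques and quadratic optimization over the simplex is the Motzkin-Straus theorem: for an unweighted graph with adjacency matrix $A$ and clique number $\omega$, the maximum of ${\mathbf x}^\top A {\mathbf x}$ over the standard simplex equals $1 - 1/\omega$, attained at the uniform vector on a maximum clique. The sketch below checks that this value is attained on a small hypothetical graph. It covers the unweighted, noise-free case only; the weighted and robust formulations used in the paper's experiments are not reproduced here, and the graph and function names are assumptions for illustration.

```python
from itertools import combinations


def is_clique(adj, nodes):
    """True if every pair of the given nodes is adjacent."""
    return all(adj[u][v] for u, v in combinations(nodes, 2))


def max_clique_bruteforce(adj):
    """Return one maximum clique by exhaustive search (tiny graphs only)."""
    n = len(adj)
    for size in range(n, 0, -1):
        for nodes in combinations(range(n), size):
            if is_clique(adj, nodes):
                return nodes
    return ()


def quad_form(A, x):
    """Return x^T A x for a square matrix A given as nested lists."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


# Hypothetical 5-node graph: a triangle {0, 1, 2} with a path 2-3-4 attached.
n = 5
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
adj = [[0] * n for _ in range(n)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

clique = max_clique_bruteforce(adj)
omega = len(clique)

# Uniform weights on the clique give a point of the standard simplex.
x = [1.0 / omega if i in clique else 0.0 for i in range(n)]
value = quad_form(adj, x)
print(clique, value, 1 - 1 / omega)
```

The check confirms only that the uniform vector on the clique reaches $1 - 1/\omega$; that no simplex point does better is the content of the theorem and is not verified by this code.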

Theorems & Definitions (53)

  • Remark 2.1
  • Lemma 2.2
  • proof
  • Proposition 2.3
  • Theorem 2.4
  • proof
  • Remark 2.5
  • Theorem 2.6
  • proof
  • Remark 2.7
  • ...and 43 more