Learning and Decision-Making with Data: Optimal Formulations and Phase Transitions

Amine Bennouna, Bart P. G. Van Parys

TL;DR

The paper develops a cohesive framework to design data-driven learning and decision-making formulations using historical data, balancing out-of-sample guarantees with accuracy of the estimated cost. It formalizes prediction and prescription problems through a meta-optimization that selects optimal predictors and prescriptors under a prescribed exponential-speed guarantee, revealing a phase transition across three regimes. In the exponential regime, KL-divergence based DRO is strongly optimal; in the superexponential regime, a fully robust but data-independent predictor prevails; and in the subexponential regime, a variance-penalized SVP predictor (with KL being asymptotically equivalent) is optimal and consistent. This unifies robust, KL-DRO, and variance-regularized approaches, shows their connections, and provides guidance for practitioners on which data-driven formulation to deploy given the desired out-of-sample reliability and data regime.
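To make the link between the entropic and variance-penalized formulations concrete, here is a minimal numerical sketch on a finite set of outcomes. It is not taken from the paper: the function names, the restriction of the worst-case distribution to the support of the empirical distribution, and the specific forms used (worst-case expected loss over the set where KL(P_hat || Q) is at most r, and empirical mean plus sqrt(2 * rate * variance)) are assumptions chosen for illustration.

```python
import math

def empirical_mean(losses, probs):
    return sum(p * l for p, l in zip(probs, losses))

def svp_predictor(losses, probs, rate):
    """Variance-penalized estimate (assumed form): mean + sqrt(2 * rate * variance)."""
    mean = empirical_mean(losses, probs)
    var = sum(p * (l - mean) ** 2 for p, l in zip(probs, losses))
    return mean + math.sqrt(2.0 * rate * var)

def kl_dro_predictor(losses, probs, r, iters=200):
    """sup of E_Q[loss] over Q with KL(P_hat || Q) <= r.

    Assumptions: every entry of probs is positive, and Q is restricted
    to the same finite support as P_hat.
    """
    top, bottom = max(losses), min(losses)
    if r <= 0.0 or top == bottom:
        return empirical_mean(losses, probs)

    def tilt(alpha):
        # Stationarity of the Lagrangian gives Q(i) proportional to
        # P_hat(i) / (alpha - loss_i) for some alpha > max loss.
        w = [p / (alpha - l) for p, l in zip(probs, losses)]
        z = sum(w)
        return [wi / z for wi in w]

    def kl_gap(alpha):
        q = tilt(alpha)
        return sum(p * math.log(p / qi) for p, qi in zip(probs, q)) - r

    scale = top - bottom
    lo, hi = top + 1e-12 * scale, top + scale
    while kl_gap(hi) > 0.0:          # KL decreases in alpha: widen until feasible
        hi = top + 2.0 * (hi - top)
    for _ in range(iters):           # bisection on the KL constraint
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:
            break
        if kl_gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    q = tilt(hi)                     # feasible tilt, within tolerance of the optimum
    return sum(qi * l for qi, l in zip(q, losses))

if __name__ == "__main__":
    losses, probs = [1.0, 2.0, 4.0], [0.5, 0.3, 0.2]
    for r in (1e-4, 1e-2, 0.5):
        print(r, kl_dro_predictor(losses, probs, r), svp_predictor(losses, probs, r))
```

For small radii the two estimates nearly coincide, which is the numerical counterpart of the asymptotic equivalence mentioned above; for larger radii they separate, and the KL value never exceeds the largest loss.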

Abstract

We study the problem of designing optimal learning and decision-making formulations when only historical data is available. Prior work typically commits to a particular class of data-driven formulation and subsequently tries to establish out-of-sample performance guarantees. Here we take the opposite approach. We first define a sensible yardstick with which to measure the quality of any data-driven formulation and subsequently seek to find an optimal such formulation. Informally, any data-driven formulation can be seen to balance a measure of proximity of the estimated cost to the actual cost while guaranteeing a level of out-of-sample performance. Given an acceptable level of out-of-sample performance, we explicitly construct a data-driven formulation that is uniformly closer to the true cost than any other formulation enjoying the same out-of-sample performance. We show the existence of three distinct out-of-sample performance regimes (a superexponential regime, an exponential regime and a subexponential regime) between which the nature of the optimal data-driven formulation experiences a phase transition. The optimal data-driven formulations can be interpreted as a classically robust formulation in the superexponential regime, an entropic distributionally robust formulation in the exponential regime and finally a variance-penalized formulation in the subexponential regime. This final observation unveils a surprising connection between these three, at first glance seemingly unrelated, data-driven formulations which until now remained hidden.

Paper Structure

This paper contains 58 sections, 49 theorems, 299 equations, 7 figures, 1 table.

Key Result

proposition 1

The predictor $\cKL\in \cC$ satisfies the out-of-sample guarantee when $a_T \sim rT$.

Figures (7)

  • Figure 1: Illustration of the distribution of the relative optimality gap of SAA ($G(\hat{x}_{\mathrm{SAA}}(\hat{\Pb}_T))/\min_{x \in \mathcal{X}}c(x,\Pb)$) on a newsvendor problem, under the randomness of the data. Here, $h = 12, b = 1$ and the demand $\tilde{d}$ follows a mixture of two Gaussians $\mathcal{N}(50,5)$ and $\mathcal{N}(100,5)$ with respective weights $0.1$ and $0.9$. The SAA solution is computed with $100$ demand data points, and the presented statistics are computed across $10^7$ randomly generated datasets. The orange box represents realizations of the optimality gap between the quantiles at levels $0.01$ and $0.99$. The expected value is at $5\%$. The red box represents values above the quantile at level $0.8$, which is associated with a $17\%$ relative optimality gap.
  • Figure 2: Each colored curve represents the nominal value of a predictor $\hat{c}(x,\Pb,T)$ for a fixed $T$, the black lower curve being the true cost $c(x,\Pb)$. The shaded region represents the random values of $\hat{c}(x,\hat{\Pb}_T,T)$ which occur with high probability $\sim 1-e^{-a_T}$. The predictor $\hat{c}_1$ on the left does not satisfy the out-of-sample guarantee, as there is a set of probability larger than $e^{-a_T}$ where $\hat{c}_1(x,\hat{\Pb}_T,T)<c(x,\Pb)$ (the shaded region below the $c$ curve). The predictor $\hat{c}_2$, on the other hand, satisfies the out-of-sample guarantee, as $\hat{c}_2(x,\hat{\Pb}_T,T)\geq c(x,\Pb)$ for all values of $\hat{\Pb}_T$ in the high-probability set. The figure on the right illustrates the order $\preceq_{\cC}$. In the figure, $\hat{c}_1$ and $\hat{c}_2$ cannot be compared, as neither is uniformly better than the other. The predictor $\hat{c}_3$ is uniformly closer to $c$ than $\hat{c}_1$ and $\hat{c}_2$, hence $\hat{c}_3 \preceq_{\cC} \hat{c}_1$ and $\hat{c}_3 \preceq_{\cC} \hat{c}_2$. Notice that, as both our feasibility and order notions are asymptotic in $T$, the figure here is merely an illustration.
  • Figure 3: Illustration of the construction of a pathological predictor dominating a given predictor. The blue line represents the regular predictor $\hat{c}$, while the green line represents its perturbation into a pathological predictor $\hat{c}'$. Here $i\in \Sigma$, $k\in \{0,\ldots,T\}$ is an integer, and the dotted vertical lines represent the possible values of the empirical distribution $\hat{\Pb}_T(i) \in \{\frac{0}{T}, \ldots, \frac{T}{T}\}$.
  • Figure 4: Illustration of the construction of $\hat{c}' \preceq_{\cC} \hat{c}$ and $\hat{c}' \not\equiv \hat{c}$ when $\hat{c}$ is not consistent.
  • Figure 5: Illustration of the DRO expression of the robust predictor in the subexponential regime. The shrinking ellipsoids around the blue points represent the ambiguity set of the DRO expression of the robust predictor around $\hat{\Pb}_T$ for increasing values of $T$. The arrow indicates the cost at the distribution it points to, which attains the maximum cost in the ellipsoid.
  • ...and 2 more figures
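
The experiment behind Figure 1 can be reproduced at small scale with the sketch below. Several choices are assumptions rather than details confirmed by the caption: $h$ is treated as the unit overage cost and $b$ as the unit underage cost, the second parameter of each Gaussian is read as a standard deviation, the gap $G$ is taken to be the excess true cost of the SAA order over the optimal cost, and only a few hundred datasets are drawn instead of $10^7$. All function names are hypothetical.

```python
import math
import random

H, B = 12.0, 1.0                              # overage / underage unit costs (assumed roles)
MIX = [(0.1, 50.0, 5.0), (0.9, 100.0, 5.0)]   # (weight, mean, std); std = 5 is an assumption

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def true_cost(x):
    """Expected cost H*E[(x-d)^+] + B*E[(d-x)^+] under the Gaussian mixture."""
    total = 0.0
    for w, mu, sd in MIX:
        z = (x - mu) / sd
        over = (x - mu) * norm_cdf(z) + sd * norm_pdf(z)   # E[(x-d)^+]
        under = over - (x - mu)                            # E[(d-x)^+]
        total += w * (H * over + B * under)
    return total

def mixture_cdf(x):
    return sum(w * norm_cdf((x - mu) / sd) for w, mu, sd in MIX)

def optimal_order():
    """Solve mixture_cdf(x) = B / (B + H) by bisection (first-order condition)."""
    lo, hi, target = 0.0, 200.0, B / (B + H)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_demand(rng):
    # Pick a mixture component, then draw from its Gaussian.
    (w0, mu0, sd0), (_, mu1, sd1) = MIX
    if rng.random() < w0:
        return rng.gauss(mu0, sd0)
    return rng.gauss(mu1, sd1)

def saa_order(data):
    """Empirical B/(B+H) quantile, a minimizer of the sample-average cost."""
    s = sorted(data)
    k = max(1, math.ceil(len(s) * B / (B + H)))
    return s[k - 1]

def relative_gaps(n_datasets=200, n_points=100, seed=0):
    rng = random.Random(seed)
    best = true_cost(optimal_order())
    gaps = []
    for _ in range(n_datasets):
        data = [sample_demand(rng) for _ in range(n_points)]
        gaps.append((true_cost(saa_order(data)) - best) / best)
    return gaps

if __name__ == "__main__":
    gaps = sorted(relative_gaps())
    print("mean relative gap:", sum(gaps) / len(gaps))
    print("80% quantile:", gaps[int(0.8 * len(gaps))])
```

Under these assumptions the critical ratio $b/(b+h) = 1/13$ falls inside the low-demand component, whose weight is only $0.1$, so the SAA order depends on how many of the $100$ samples land in that component. This is one plausible reading of why the gap distribution has a heavy upper tail.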

Theorems & Definitions (126)

  • definition 1: Predictors
  • definition 2
  • remark 1: Oracles
  • definition 3: Regular Predictors
  • proposition 1: Feasibility
  • proof : Sketch of proof
  • theorem 1: Strong Optimality
  • remark 2
  • proposition 2: Feasibility
  • proof
  • ...and 116 more