Robust Losses for Decision-Focused Learning

Noah Schutte, Krzysztof Postek, Neil Yorke-Smith

TL;DR

This work analyzes decision-focused learning (DFL) for optimization under uncertainty and identifies critical weaknesses of using empirical regret as a surrogate, especially under epistemic and aleatoric uncertainty. It introduces three robust losses—Robust Optimization (RO) Loss, Top-k Loss, and k-Nearest Neighbour (k-NN) Loss—that aim to better approximate the expected regret by stabilizing the estimator of the conditional mean costs and/or considering near-optimal decisions. The authors characterize applicability, gradient computations, and computational implications, showing that these losses can be integrated with SPO+ and PFYL without increasing per-epoch cost and with modest precomputation overhead. Empirical evaluation on shortest path, traveling salesperson, and energy-cost aware scheduling demonstrates that robust losses reduce test-sample regret more reliably than empirical regret, with k-NN often providing the strongest gains, particularly in noisy or data-scarce settings, indicating practical benefits for real-world decision-making under uncertainty.

Abstract

Optimization models used to make discrete decisions often contain uncertain parameters that are context-dependent and estimated through prediction. To account for the quality of the decision made based on the prediction, decision-focused learning (end-to-end predict-then-optimize) aims at training the predictive model to minimize regret, i.e., the loss incurred by making a suboptimal decision. Despite the challenge of the gradient of this loss w.r.t. the predictive model parameters being zero almost everywhere for optimization problems with a linear objective, effective gradient-based learning approaches have been proposed to minimize the expected loss, using the empirical loss as a surrogate. However, empirical regret can be an ineffective surrogate because empirical optimal decisions can vary substantially from expected optimal decisions. To understand the impact of this deficiency, we evaluate the effect of aleatoric and epistemic uncertainty on the accuracy of empirical regret as a surrogate. Next, we propose three novel loss functions that approximate expected regret more robustly. Experimental results show that training two state-of-the-art decision-focused learning approaches using robust regret losses improves test-sample empirical regret in general while keeping computational time equivalent relative to the number of training epochs.
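The gap between empirical and expected regret described above can be illustrated with a minimal sketch. Everything in it is a hypothetical assumption for illustration: the toy problem (pick the single cheapest of three items), the conditional mean `mean_c`, the Gaussian noise level, and the helper names `solve` and `regret`. It is not the paper's experimental setup.

```python
import numpy as np

def solve(c):
    """Toy optimizer: argmin_x c^T x with x one-hot (pick the cheapest item)."""
    x = np.zeros_like(c, dtype=float)
    x[np.argmin(c)] = 1.0
    return x

def regret(c_hat, c):
    """Cost of acting on the prediction c_hat, minus the best achievable cost under c."""
    return float(c @ solve(c_hat) - c @ solve(c))

# Hypothetical conditional mean E[c | z] for one fixed context z.
mean_c = np.array([1.0, 1.2, 3.0])

# Noisy observations of c around that mean (aleatoric uncertainty).
rng = np.random.default_rng(0)
samples = mean_c + rng.normal(0.0, 0.5, size=(1000, 3))

# A predictor that outputs exactly the conditional mean.
c_hat = mean_c.copy()

# Regret measured against the expected costs: the decision is optimal.
expected = regret(c_hat, mean_c)

# Regret measured against each noisy observation, then averaged.
empirical = np.mean([regret(c_hat, c) for c in samples])

print(f"regret w.r.t. conditional mean: {expected:.3f}")
print(f"mean empirical regret:          {empirical:.3f}")
```

In this sketch the predictor already induces the decision that is optimal in expectation, yet its empirical regret is strictly positive, because each noisy sample can make a different item look cheapest. This is the sense in which empirical optimal decisions can differ from expected optimal decisions.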

Paper Structure

This paper contains 23 sections, 1 theorem, 19 equations, 3 figures, 1 table.

Key Result

Theorem 1

Assume $|X| > 2$. Given some $z$, we have that $\forall x_h \in X_h, x_l \in X_l$: when $\lim_{\sigma_l^2 \to 0}$, where $x^* = \operatorname*{arg\,min}_{x \in X} \mathbb{E}_{c \sim \mathcal{C}_{z}}[c \mid z]$.

Figures (3)

  • Figure 1: Visualization of the empirical regret loss in DFL (left) and the robust losses in comparison (the predictive pipeline is identical in all cases). The robust losses are constructed by using an optimization model (optimizer) that is robust against the mean estimation error and/or by using a different mean estimator than the empirically observed $c$. From left to right: empirical regret ($l_\text{emp}$), RO loss ($l_\text{RO}$), top-$k$ loss ($l_\text{top-$k$}$), $k$-NN loss ($l_\text{$k$-NN}$).
  • Figure 2: Observed coefficients $c$ (profit) per feature value $z$ (temperature) and optimal linear predictors $\hat{c} = f_{\theta}^*(z)$ according to different loss functions. The example problem is a coffee stand owner deciding what treat to make from their daily batch of chocolate. There are 3 possible weather-related decisions: chocolate ice cream, chocolate cookies, or hot chocolate. In short: $\max_{x} c^T x \text{ s.t. } x_i \in \{0, 1\}, \sum_i x_i = 1$. The linear predictors are the lines with corresponding colour and shade. The shaded area is considered to be sub-optimal. (It is determined assuming the expected value of $c$ is equal to a linear interpolation between the observed points closest to the middle.)
  • Figure 3: Test set mean normalized empirical regret in % (y-axes) on 3 experimental problems with different noise ($\bar{\epsilon}$) and training size ($t$) values. Approaches are denoted by patterns; used losses by colour (mean squared error is used for PFL). Error bars denote one in-sample standard deviation on both sides of the mean. Mean values of robust losses that are significantly different (paired $t$-test, $\alpha = 0.05$) from their empirical regret counterpart are denoted with $*$ (better) or $\times$ (worse). The results are shown as a table in the Appendix.
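The selection problem from Figure 2, $\max_{x} c^T x$ with exactly one $x_i = 1$, is small enough to sketch how a neighbour-averaged mean estimator can change the regret target. The following is only an illustration under stated assumptions: the temperatures and profits are invented, the helper names `best_treat`, `knn_mean`, and `regret` are hypothetical, and the averaging rule (plain mean over the $k$ nearest observations in $z$, including the point itself) is one simple reading of a $k$-NN mean estimator, not necessarily the paper's exact $l_\text{$k$-NN}$ definition.

```python
import numpy as np

def best_treat(c):
    """Solve max_x c^T x with x one-hot: index of the most profitable treat."""
    return int(np.argmax(c))

def knn_mean(z, c, k):
    """Average the observed profits of the k nearest observations in z (self included)."""
    dist = np.abs(z[:, None] - z[None, :])   # pairwise distances between temperatures
    idx = np.argsort(dist, axis=1)[:, :k]    # indices of the k closest observations
    return c[idx].mean(axis=1)               # shape (n_samples, n_treats)

def regret(c_hat, c_ref):
    """Profit lost by deciding on c_hat instead of c_ref (maximization)."""
    return float(c_ref[best_treat(c_ref)] - c_ref[best_treat(c_hat)])

# Invented data: temperature z and profits for [ice cream, cookies, hot chocolate].
z = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
c = np.array([
    [1.0, 3.0, 6.0],
    [2.0, 3.0, 5.0],
    [7.0, 3.0, 4.0],   # unusually high ice cream profit at z = 14
    [3.0, 3.0, 4.0],
    [4.0, 3.0, 2.0],
])

smoothed = knn_mean(z, c, k=3)

# A hypothetical prediction at z = 14 that favours hot chocolate.
c_hat = np.array([3.0, 3.0, 4.5])

print("empirical regret at z=14:", regret(c_hat, c[2]))
print("k-NN regret at z=14:     ", regret(c_hat, smoothed[2]))
```

With these invented numbers, the single observation at $z = 14$ penalizes a prediction that chooses hot chocolate, while the neighbour-averaged profits do not, because the two adjacent observations both favour hot chocolate. Whether such smoothing helps in practice depends on the noise level and the choice of $k$.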

Theorems & Definitions (3)

  • Example 1
  • Theorem 1
  • proof