Locally Convex Global Loss Network for Decision-Focused Learning

Haeun Jeon; Hyunglip Bae; Minsu Park; Chanyeong Kim; Woo Chang Kim

TL;DR

The paper tackles decision-focused learning under uncertainty by addressing the difficulty of differentiating through optimization. It introduces Locally Convex Global Loss Network (LCGLN), a global surrogate built with a Partial Input Convex Neural Network (PICNN) to ensure local convexity around chosen inputs while preserving a non-convex global structure, enabling end-to-end gradient-based training with a single surrogate loss. LCGLN is trained via model-based sampling to approximate the true decision loss, and its gradient signals are used to update predictive models across three stochastic decision problems, where it outperforms state-of-the-art baselines, particularly with larger surrogate-sample budgets. The work simplifies surrogate design for DFL, reduces data requirements, and broadens applicability to general decision-focused tasks, with future work focusing on smarter sample-generation strategies to further improve decision quality.

Abstract

In decision-making problems under uncertainty, predicting unknown parameters is often treated as independent of the optimization step. Decision-focused learning (DFL) is a task-oriented framework that integrates prediction and optimization by adapting the predictive model to yield better decisions for the corresponding task. An inevitable challenge arises when computing the gradients of the optimal decision with respect to the parameters. Existing research addresses this issue by reformulating the optimization as a smooth surrogate or by constructing surrogate loss functions that mimic the task loss. However, these approaches apply only to restricted optimization domains. In this paper, we propose Locally Convex Global Loss Network (LCGLN), a global surrogate loss model that can be implemented in a general DFL paradigm. LCGLN learns the task loss via a partial input convex neural network, which is guaranteed to be convex in chosen inputs while retaining a non-convex global structure in the remaining inputs. This enables LCGLN to handle general DFL with a single surrogate loss, without requiring a hand-chosen parametric form. We confirm the effectiveness and flexibility of LCGLN by evaluating the proposed model on three stochastic decision-making problems.
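To make the "convex in chosen inputs" property concrete, here is a minimal sketch of a two-layer partial input convex network in plain Python. It follows the standard PICNN construction (non-negative weights on the convex path, gates that depend only on the non-convex input) and is not necessarily the authors' exact architecture. The class name `TinyPICNN`, the helper `midpoint_convex`, the layer sizes, the omission of bias terms, and the random (untrained) weights are all assumptions made for illustration.

```python
import random

def relu(v):
    return [max(0.0, a) for a in v]

def matvec(W, v):
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def add(*vectors):
    return [sum(parts) for parts in zip(*vectors)]

def hadamard(a, b):
    return [p * q for p, q in zip(a, b)]

class TinyPICNN:
    """Hypothetical two-layer PICNN: convex in y, unconstrained in x (biases omitted)."""

    def __init__(self, x_dim, y_dim, hidden=8, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols, nonneg=False):
            # Random weights; nonneg=True enforces the convexity constraint.
            vals = [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]
            return [[abs(w) for w in row] for row in vals] if nonneg else vals

        self.W_u = mat(hidden, x_dim)             # non-convex path: u1 = relu(W_u x)
        self.W_yu0 = mat(y_dim, x_dim)            # x-dependent scaling of y, layer 0
        self.W_y0 = mat(hidden, y_dim)
        self.W_x0 = mat(hidden, x_dim)
        self.W_zu1 = mat(hidden, hidden)          # x-dependent gate on z1
        self.W_z1 = mat(1, hidden, nonneg=True)   # must stay non-negative
        self.W_yu1 = mat(y_dim, hidden)
        self.W_y1 = mat(1, y_dim)
        self.W_x1 = mat(1, hidden)

    def forward(self, x, y):
        u1 = relu(matvec(self.W_u, x))
        # z1 is a ReLU of a function that is affine in y, hence convex in y.
        z1 = relu(add(matvec(self.W_y0, hadamard(y, matvec(self.W_yu0, x))),
                      matvec(self.W_x0, x)))
        # The gate is clipped at zero and depends on x only.
        gate = relu(matvec(self.W_zu1, u1))
        # Non-negative combination of convex terms plus a term affine in y.
        out = add(matvec(self.W_z1, hadamard(z1, gate)),
                  matvec(self.W_y1, hadamard(y, matvec(self.W_yu1, u1))),
                  matvec(self.W_x1, u1))
        return out[0]

def midpoint_convex(net, x, a, b, tol=1e-9):
    # Checks f(x, (a+b)/2) <= (f(x, a) + f(x, b)) / 2 for a fixed x.
    mid = [(p + q) / 2.0 for p, q in zip(a, b)]
    return net.forward(x, mid) <= 0.5 * (net.forward(x, a) + net.forward(x, b)) + tol
```

For any fixed `x`, calling `midpoint_convex(net, x, a, b)` on arbitrary points `a` and `b` should return `True`, because the output is a non-negative combination of functions convex in `y` plus a term affine in `y`. No such guarantee holds along `x`, which is where the non-convex global structure comes from.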

Paper Structure

This paper contains 27 sections, 8 equations, 5 figures, 5 tables, 1 algorithm.

Introduction
Related Works
Preliminaries
Comparison on PFL and DFL
Surrogate DFL
Partial Input Convex Neural Network
Locally Convex Global Loss Network
Generating Samples
Learning Global Surrogate Loss LCGLN
Training Predictive Model
Experiments and Results
Experimental Settings
Problem Description
Baselines
Evaluation Metric
...and 12 more sections

Figures (5)

Figure 1: A model training pipeline for PFL, DFL, and surrogate DFL. PFL trains the predictive model by minimizing the prediction loss. DFL directly delivers gradients that minimize the task loss. Surrogate DFL first learns a surrogate loss model that follows the true task loss by sampling predictions and their task losses. It then trains the predictive model end-to-end using gradients derived from the trained surrogate loss model.
Figure 2: A simple example of a knapsack problem. There are two items, valued at $40 and $30, marked with a yellow star. We predict the value of the items and choose the one with the higher predicted value. Blue dots and red crosses are predicted values representing good and bad decisions, respectively. PFL gives the same prediction loss for every prediction, while DFL gives a loss of $10 for the red crosses and 0 for the blue dots.
Figure 3: A line plot showing the normalized test regret $\mathcal{R}_{test} / \mathcal{R}_{worst}$ for each methodology and problem setting. Lower is better, with 0 representing the optimal value. We tested sample sizes of $\{2,4,8,16,32\}$ for each problem. The standard error of the mean for each experiment is detailed in Appendix \ref{subsec:appen-diff-smpsize}. Our global surrogate loss LCGLN, represented by the solid red line, outperforms the other methods when 32 samples are used.
Figure 4: A histogram presenting normalized test regret $\mathcal{R}_{test} / \mathcal{R}_{worst}$ with the standard error of the mean (SEM) for global surrogate loss models in budget allocation with a varying number of fake targets. We use 16 samples for learning the loss. Lower is better, with 0 representing the optimal value. We test with $\{0,5,50,500\}$ fake targets, noting that the problem becomes more challenging as the number of fake targets increases. Our LCGLN, shown in red bars, outperforms the other models in most settings.
Figure 5: Architecture of Locally Convex Global Loss Network. The left figure represents the input to the first hidden layer. The right figure shows the hidden to hidden layers for non-convex vectors (reddish, upstream) and convex vectors (bluish, downstream). The output layer can simply be considered as $z_{i+1}$ without $u_{i+1}$.
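
The two-item knapsack example in Figure 2 can be worked through numerically. The sketch below assumes squared error as the prediction loss and uses two hypothetical predictions, `good` and `bad`, chosen to lie at the same distance from the true values of 40 and 30; the function names are illustrative and do not come from the paper.

```python
TRUE_VALUES = (40.0, 30.0)  # item values from the Figure 2 example

def decide(pred):
    # Choose the single item with the highest predicted value.
    return max(range(len(pred)), key=lambda i: pred[i])

def prediction_loss(pred, true=TRUE_VALUES):
    # Squared error (an assumed choice of prediction loss).
    return sum((p - t) ** 2 for p, t in zip(pred, true))

def task_loss(pred, true=TRUE_VALUES):
    # Regret: best achievable value minus the true value of the chosen item.
    return max(true) - true[decide(pred)]

good = (50.0, 30.0)  # hypothetical prediction that still ranks item 0 first
bad = (34.0, 38.0)   # hypothetical prediction that ranks item 1 first

print(prediction_loss(good), prediction_loss(bad))  # 100.0 100.0
print(task_loss(good), task_loss(bad))              # 0.0 10.0
```

Both predictions incur the same prediction loss, yet only the second leads to the wrong item and a task loss of 10, which is the gap between the two training signals that the caption describes.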
