Task-based End-to-end Model Learning in Stochastic Optimization

Priya L. Donti, Brandon Amos, J. Zico Kolter

TL;DR

The paper addresses the gap between predictive loss minimization and the actual performance of decision pipelines under uncertainty. It introduces an end-to-end framework that learns probabilistic models by differentiating through the stochastic programming solution to directly minimize a task-based loss. Across inventory management, electrical grid scheduling, and battery storage arbitrage, the approach outperforms traditional maximum-likelihood learning and purely policy-based methods, demonstrating improved task performance and robustness to model misspecification. The work provides a practical pathway to align learning with end-use objectives and lays groundwork for extensions to multi-round optimization and model predictive control.

Abstract

With the increasing popularity of machine learning techniques, it has become common to see prediction algorithms operating within some larger process. However, the criteria by which we train these algorithms often differ from the ultimate criteria on which we evaluate them. This paper proposes an end-to-end approach for learning probabilistic machine learning models in a manner that directly captures the ultimate task-based objective for which they will be used, within the context of stochastic programming. We present three experimental evaluations of the proposed approach: a classical inventory stock problem, a real-world electrical grid scheduling task, and a real-world energy storage arbitrage task. We show that the proposed approach can outperform both traditional modeling and purely black-box policy optimization approaches in these applications.

Paper Structure

This paper contains 21 sections, 28 equations, 6 figures, 1 table, 1 algorithm.

Figures (6)

  • Figure 1: Features $x$, model predictions $y$, and policy $z$ for the three experiments.
  • Figure 2: Inventory problem results for 10 runs over a representative instantiation of true parameters ($c_0 = 10, q_0 = 2, c_b = 30, q_b = 14, c_h = 10, q_h=2$). Cost is evaluated over 1000 testing samples (lower is better). The linear MLE performs best for a true linear model. In all other cases, the task-based models outperform their MLE and policy counterparts.
  • Figure 3: 2-hidden-layer neural network to predict hourly electric load for the next day.
  • Figure 4: Results for 10 runs of the generation-scheduling problem for representative decision parameters $\gamma_e = 0.5, \gamma_s = 50,$ and $c_r = 0.4$. (Lower loss is better.) As expected, the RMSE net achieves the lowest RMSE for its predictions. However, the task net outperforms the RMSE net on task loss by 38.6%, and the cost-weighted RMSE on task loss by 8.6%.
  • Figure : Task Loss Optimization
  • ...and 1 more figure
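
To make the cost parameters quoted in the Figure 2 caption concrete, the following minimal sketch evaluates an inventory cost for an order quantity $z$ and a realized demand $y$, then compares the decision made under a correct demand distribution with one made under a misspecified distribution. Several things here are assumptions rather than the authors' implementation: the piecewise-quadratic form of `inventory_cost` is inferred from the parameter names (ordering, back-order, and holding terms), the demand values and both probability vectors are invented for illustration, and a plain grid search stands in for the differentiable stochastic programming solver the paper describes.

```python
import numpy as np

# Cost parameters quoted in the Figure 2 caption.
PARAMS = dict(c0=10.0, q0=2.0, cb=30.0, qb=14.0, ch=10.0, qh=2.0)


def inventory_cost(y, z, p=PARAMS):
    """Assumed cost of ordering z units when realized demand is y.

    The linear-plus-quadratic form is an assumption inferred from the
    parameter names, not a formula stated in this section.
    """
    short = np.maximum(y - z, 0.0)   # unmet demand (back-order term)
    excess = np.maximum(z - y, 0.0)  # leftover stock (holding term)
    return (p["c0"] * z + 0.5 * p["q0"] * z ** 2
            + p["cb"] * short + 0.5 * p["qb"] * short ** 2
            + p["ch"] * excess + 0.5 * p["qh"] * excess ** 2)


def expected_cost(z, demands, probs):
    """Expected cost of order quantity z under a discrete demand distribution."""
    return float(np.dot(probs, inventory_cost(demands, z)))


def best_order(demands, probs, num_points=2001):
    """Grid-search stand-in for the stochastic programming solver."""
    grid = np.linspace(0.0, float(np.max(demands)), num_points)
    costs = np.array([expected_cost(z, demands, probs) for z in grid])
    return float(grid[np.argmin(costs)])


# Hypothetical discrete demand values and distributions (illustration only).
demands = np.array([1.0, 2.0, 5.0, 10.0, 20.0])
p_true = np.array([0.1, 0.2, 0.4, 0.2, 0.1])    # assumed "true" distribution
p_model = np.array([0.4, 0.3, 0.2, 0.05, 0.05])  # assumed misspecified model

z_true = best_order(demands, p_true)    # decision under the true distribution
z_model = best_order(demands, p_model)  # decision under the model

# Task loss: the cost each decision actually incurs under the true distribution.
loss_true = expected_cost(z_true, demands, p_true)
loss_model = expected_cost(z_model, demands, p_true)

print(f"order from true distribution:  z = {z_true:.2f}, task loss = {loss_true:.2f}")
print(f"order from misspecified model: z = {z_model:.2f}, task loss = {loss_model:.2f}")
```

The quantity `loss_model` is the kind of task-based objective the paper trains against: it scores a predictive distribution by the cost of the decision it induces, not by its likelihood. In the approach described in the TL;DR, the gradient of this loss would be propagated through the optimizer to the model parameters, which the grid search above does not attempt.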