Perturbed Decision-Focused Learning for Modeling Strategic Energy Storage

Ming Yi; Saud Alghumayjan; Bolun Xu

TL;DR

The paper addresses predicting and controlling energy storage actions by embedding a physical storage model inside a differentiable, end-to-end learning pipeline. It introduces a perturbed decision-focused loss that is convex and smooth, which allows backpropagation through an optimization layer modeled on model predictive control, and pairs it with a hybrid loss that regularizes the reward predictor toward price signals. The approach is evaluated on two tasks, self-scheduling energy storage arbitrage and energy storage behavior prediction, where it reports higher profit and more accurate action prediction than the benchmark methods on synthetic and real-world data. The framework gives regulators and operators a way to reason about storage decisions under uncertainty while respecting physical and degradation constraints.

Abstract

This paper presents a novel decision-focused framework that integrates a physical energy storage model into machine learning pipelines. Motivated by model predictive control for energy storage, our end-to-end method incorporates prior knowledge of the storage model and infers the hidden reward that incentivizes energy storage decisions. This is achieved through a dual-layer framework that combines a prediction layer with an optimization layer. We introduce a perturbation into the decision-focused loss function to ensure differentiability over linear storage models, and support it with a theoretical analysis of the perturbed loss function. We also develop a hybrid loss function for effective model training. We apply the framework to two challenging applications: energy storage arbitrage and energy storage behavior prediction. Numerical experiments on real price data show that our arbitrage approach achieves the highest profit among the compared methods. Numerical experiments on synthetic and real-world energy storage data show that our approach achieves the best behavior prediction performance among the benchmark methods.
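The two-layer idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a toy storage model (unit power steps, integer state of charge, no efficiency losses or degradation cost) solved by dynamic programming in place of the paper's optimization layer, a Monte Carlo average over Gaussian perturbations of the predicted reward, and a Fenchel-Young-style gradient taken from the general perturbed-optimizer literature. The names `storage_argmax`, `perturbed_decision`, `hybrid_grad` and the weight `mu` are hypothetical.

```python
import numpy as np

def storage_argmax(reward, levels=1, soc0=0):
    """Toy optimization layer: maximize sum_t reward[t] * y[t].

    y[t] is +1 (discharge), 0 (idle) or -1 (charge); the state of charge
    must stay in {0, ..., levels}. Assumption: no efficiency or degradation terms.
    """
    reward = np.asarray(reward, dtype=float)
    T = len(reward)
    value = np.zeros((T + 1, levels + 1))        # value[t, s]: best reward-to-go
    best = np.zeros((T, levels + 1), dtype=int)  # best[t, s]: optimal action
    for t in range(T - 1, -1, -1):
        for s in range(levels + 1):
            options = []
            for a in (1, 0, -1):
                s_next = s - a                   # discharging lowers the state of charge
                if 0 <= s_next <= levels:
                    options.append((reward[t] * a + value[t + 1, s_next], a))
            value[t, s], best[t, s] = max(options)
    y = np.zeros(T)
    s = soc0
    for t in range(T):                           # roll the policy forward from soc0
        y[t] = best[t, s]
        s -= best[t, s]
    return y

def perturbed_decision(reward_hat, eps=1.0, n_samples=200, seed=0, levels=1, soc0=0):
    """Monte Carlo estimate of E[argmax over (reward_hat + eps * Z)], Z Gaussian."""
    reward_hat = np.asarray(reward_hat, dtype=float)
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_samples, len(reward_hat)))
    draws = [storage_argmax(reward_hat + eps * z, levels, soc0) for z in noise]
    return np.mean(draws, axis=0)

def hybrid_grad(reward_hat, y_observed, price, mu=0.1, **kwargs):
    """Gradient w.r.t. reward_hat of an assumed hybrid loss.

    Decision term: perturbed decision minus observed decision (Fenchel-Young style).
    Price term: gradient of 0.5 * mu * ||reward_hat - price||^2 (hypothetical weight mu).
    """
    reward_hat = np.asarray(reward_hat, dtype=float)
    decision_term = perturbed_decision(reward_hat, **kwargs) - np.asarray(y_observed, dtype=float)
    price_term = mu * (reward_hat - np.asarray(price, dtype=float))
    return decision_term + price_term

price = np.array([10.0, 50.0, 20.0, 60.0])
y_star = storage_argmax(price)                  # charge at low prices, discharge at high ones
y_smooth = perturbed_decision(price, eps=20.0)  # smoothed decision, entries in [-1, 1]
grad = hybrid_grad(price, y_star, price, eps=1e-6)
```

In a training loop, `grad` would be passed back into the predictor as the gradient of the loss with respect to its output. The smoothed decision is an average of feasible schedules, so it remains feasible for this toy model.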

Paper Structure (20 sections, 8 theorems, 55 equations, 10 figures, 6 tables, 2 algorithms)

Introduction
Motivation and Related Works
Learning-aided Storage Operation
Learning-aided Storage Monitoring
Decision-Focused Learning
Problem Formulation
Energy Storage Arbitrage Model
Problem Statement
Methodology
The Prediction Layer and the Optimization Layer
The Learning Approach
Convex and Smooth Perturbed Loss Function
Algorithmic Implementation
Experiments
Self-Scheduling Energy Storage Arbitrage
...and 5 more sections

Key Result

Proposition 1

Differentiability of the perturbed function. Since the noise $\bm{Z}$ is Gaussian, it has density $\rho(\bm{Z}) \propto \exp(-\psi(\bm{Z}))$. For $\mathcal{R}_{\mathcal{X}} = \max_{y\in\mathcal{X}} \|\bm{y}_{\epsilon}^*(\hat{\bm{\lambda}})\|$, we have
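As background, and not as a reconstruction of the statement that follows "we have", the sketch below numerically checks two standard identities from the perturbed-optimizer literature on a hypothetical feasible set. It assumes the box $[-1,1]^n$ as a stand-in for $\mathcal{X}$, for which the linear maximizer is `sign(theta)` and the optimal value is the 1-norm, and it uses $\psi(\bm{Z}) = \|\bm{Z}\|^2/2$, so that $\nabla\psi(\bm{Z}) = \bm{Z}$. The names `box_value`, `box_argmax`, `theta`, and `eps` are illustrative only.

```python
import math
import numpy as np

def box_value(theta):
    """max over y in [-1, 1]^n of <theta, y>, i.e. the 1-norm (row-wise)."""
    return np.abs(theta).sum(axis=-1)

def box_argmax(theta):
    """A maximizer of <theta, y> over the box [-1, 1]^n."""
    return np.sign(theta)

theta = np.array([1.0, -0.5, 0.0])
eps = 1.0
rng = np.random.default_rng(0)
Z = rng.standard_normal((200_000, theta.size))
perturbed = theta + eps * Z

# Gradient of the perturbed value function as the expected maximizer.
grad_argmax = box_argmax(perturbed).mean(axis=0)

# Same gradient from values only: E[F(theta + eps Z) * grad psi(Z)] / eps, with grad psi(Z) = Z.
grad_score = (box_value(perturbed)[:, None] * Z).mean(axis=0) / eps

# Closed form for this toy set: E[sign(theta + eps Z)] = erf(theta / (eps * sqrt(2))).
closed_form = np.array([math.erf(t / (eps * math.sqrt(2.0))) for t in theta])
```

Both estimators approximate the same smooth function of `theta`, even though the unperturbed maximizer `sign(theta)` is piecewise constant. The value-only estimator has higher variance, which is why it needs more samples for the same accuracy.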

Figures (10)

Figure 1: The pipeline of the proposed decision-focused prediction approach. Given the input features, the neural network-based predictor first predicts the hidden reward. The optimization layer then uses this hidden reward to compute the decision by solving an optimization problem. The algorithm finally backpropagates through a perturbed decision-focused loss function to update the weights of the predictor.
Figure 2: Comparison of annual cumulative profits between the proposed approach and three benchmark methods. The energy storage model has (a) a linear cost term, (b) linear and quadratic cost terms.
Figure 3: Comparison of reward predictions and corresponding decisions between the proposed approach, two benchmark methods, and ground truth. The first subfigure shows the reward predictions and the ground truth real-time price (RTP). The subsequent subfigures show the corresponding decisions compared to the optimal decisions.
Figure 4: Arbitrage performance based on training on the original energy storage system and on a storage system with different parameters.
Figure 5: Comparison of ground-truth and predictions from the proposed approach and two benchmark methods: storage model with linear cost.
...and 5 more figures

Theorems & Definitions (13)

Proposition 1
Proposition 2
Proposition 3
Proposition 4
Proposition 1
proof
Proposition 2
proof
Proposition 3
proof
...and 3 more
