Perturbed Decision-Focused Learning for Modeling Strategic Energy Storage
Ming Yi, Saud Alghumayjan, Bolun Xu
TL;DR
The paper tackles predicting and controlling energy storage actions by embedding a physical storage model inside a differentiable, end-to-end learning pipeline. It introduces a perturbed decision-focused loss that is convex and smooth, which enables backpropagation through an optimization layer modeled on model predictive control, and couples this with a hybrid loss that regularizes the reward predictor toward price signals. The approach is validated on two tasks, self-scheduling energy storage arbitrage and energy storage behavior prediction, where it reports higher arbitrage profit and more accurate action prediction than existing baselines on synthetic and real-world data. This framework offers a principled, data-efficient path for regulators and operators to reason about storage decisions under uncertainty while respecting physical and degradation constraints.
Abstract
This paper presents a novel decision-focused framework that integrates a physical energy storage model into machine learning pipelines. Motivated by model predictive control for energy storage, our end-to-end method incorporates prior knowledge of the storage model and infers the hidden reward that incentivizes energy storage decisions. This is achieved through a dual-layer framework that combines a prediction layer with an optimization layer. We introduce perturbation into the decision-focused loss function to ensure differentiability over linear storage models, and we support this with a theoretical analysis of the perturbed loss function. We also develop a hybrid loss function for effective model training. We apply the proposed framework to two challenging problems: energy storage arbitrage and energy storage behavior prediction. Numerical experiments on real price data demonstrate that our arbitrage approach achieves higher profit than existing methods. Numerical experiments on synthetic and real-world energy storage data show that our approach achieves better behavior prediction performance than existing benchmark methods.
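To make the perturbation idea concrete, here is a minimal sketch in the spirit of perturbed optimizers and Fenchel-Young-style losses. It is not the authors' implementation. The storage model is a deliberately simplified assumption: unit charge or discharge per step, integer state of charge, lossless efficiency, and no degradation cost. The names `solve_storage`, `perturbed_actions`, and `loss_gradient` are hypothetical. The exact solver returns a piecewise-constant action that has zero gradient almost everywhere in the predicted reward; averaging the solver's output over Gaussian perturbations of the reward gives a smooth surrogate, and the difference between that average and the observed action serves as a gradient signal for the prediction layer.

```python
import random

def solve_storage(reward, capacity=2, soc0=0):
    """Toy optimization layer (assumed model, not the paper's):
    maximize sum_t reward[t] * d[t], with d[t] in {-1, 0, +1} (+1 = discharge),
    keeping the state of charge within [0, capacity]. Solved by dynamic programming."""
    T = len(reward)
    # value[t][s] = best reward-to-go from step t at state of charge s
    value = [[float("-inf")] * (capacity + 1) for _ in range(T + 1)]
    value[T] = [0.0] * (capacity + 1)
    best = [[0] * (capacity + 1) for _ in range(T)]
    for t in range(T - 1, -1, -1):
        for s in range(capacity + 1):
            for d in (-1, 0, 1):
                s_next = s - d  # discharging lowers the state of charge
                if 0 <= s_next <= capacity:
                    v = reward[t] * d + value[t + 1][s_next]
                    if v > value[t][s]:
                        value[t][s] = v
                        best[t][s] = d
    # Roll the optimal policy forward from the initial state of charge.
    actions, s = [], soc0
    for t in range(T):
        d = best[t][s]
        actions.append(d)
        s -= d
    return actions

def perturbed_actions(reward, eps=0.5, n_samples=200, seed=0, **solver_kwargs):
    """Monte Carlo estimate of E[d*(reward + eps * Z)] with Z standard Gaussian."""
    rng = random.Random(seed)
    total = [0.0] * len(reward)
    for _ in range(n_samples):
        noisy = [r + eps * rng.gauss(0.0, 1.0) for r in reward]
        for t, d in enumerate(solve_storage(noisy, **solver_kwargs)):
            total[t] += d
    return [x / n_samples for x in total]

def loss_gradient(reward, observed, **kwargs):
    """Gradient of a perturbed Fenchel-Young-style loss with respect to the
    predicted reward: smoothed optimal action minus observed action."""
    smooth = perturbed_actions(reward, **kwargs)
    return [a - b for a, b in zip(smooth, observed)]

predicted_reward = [1.0, 5.0, 2.0, 8.0]   # illustrative values only
observed_actions = [-1, 1, -1, 1]          # charge low, discharge high
grad = loss_gradient(predicted_reward, observed_actions, eps=0.5)
```

In a full pipeline, `grad` would be passed back to the prediction layer as the gradient of the loss with respect to its output. Larger `eps` gives a smoother but more biased surrogate, and more samples reduce the variance of the estimate.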
