Hierarchical RL-MPC for Demand Response Scheduling
Maximilian Bloor, Ehecatl Antonio Del Rio Chanona, Calvin Tsay
TL;DR
The paper tackles demand response scheduling for air separation units (ASUs) under volatile electricity prices by proposing a hierarchical RL-LMPC framework that pairs reinforcement learning (RL) with a lower-level linear model predictive controller (LMPC). It compares a direct RL approach with a control-informed architecture in which the RL agent provides setpoints to the LMPC, and finds that the RL-LMPC scheme improves sample efficiency and constraint satisfaction while maintaining competitive economic performance. In the ASU case study, the scheme exhibits load-shifting behavior and handles operational constraints better because the LMPC manages them explicitly. Overall, the work advances a practical hybrid control strategy that blends data-driven decision-making with traditional control to enable flexible operation in process industries.
Abstract
This paper presents a hierarchical framework for demand response optimization in air separation units (ASUs) that combines reinforcement learning (RL) with linear model predictive control (LMPC). We investigate two control architectures: a direct RL approach and a control-informed methodology where an RL agent provides setpoints to a lower-level LMPC. The proposed RL-LMPC framework demonstrates improved sample efficiency during training and better constraint satisfaction compared to direct RL control. Using an industrial ASU case study, we show that our approach successfully manages operational constraints while optimizing electricity costs under time-varying pricing. Results indicate that the RL-LMPC architecture achieves comparable economic performance to direct RL while providing better robustness and requiring fewer training samples to converge. The framework offers a practical solution for implementing flexible operation strategies in process industries, bridging the gap between data-driven methods and traditional control approaches.
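The two-layer interaction described above, in which an upper-level agent chooses a setpoint and a lower-level LMPC tracks it while respecting constraints, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the scalar integrator plant (`A`, `B`), the ramp-rate bound `U_MAX`, the cost weights, the price threshold, and the hand-written `setpoint_policy`, which merely stands in for a trained RL policy. The LMPC here handles only input bounds and solves its small box-constrained quadratic program by projected gradient descent; a real implementation would likely use a QP solver and include state constraints.

```python
# Illustrative sketch only: plant, bounds, weights and the price rule are
# hypothetical and are not the models or policies used in the paper.

A, B = 1.0, 1.0      # assumed scalar plant: x[k+1] = A*x[k] + B*u[k]
U_MAX = 0.1          # assumed bound on the input (a ramp-rate limit)


def setpoint_policy(price, threshold=50.0, low=0.6, high=1.0):
    """Stand-in for the RL agent: lower the production setpoint when power is expensive."""
    return low if price > threshold else high


def predict(x0, u_seq):
    """Roll the linear model forward and return the predicted states x[1..N]."""
    xs, x = [], x0
    for u in u_seq:
        x = A * x + B * u
        xs.append(x)
    return xs


def lmpc_step(x0, setpoint, horizon=5, rho=0.1, iters=500):
    """Return the first input of a bounded tracking problem.

    Minimises sum (x[k] - setpoint)^2 + rho * sum u[k]^2 subject to
    |u[k]| <= U_MAX, using projected gradient descent.
    """
    # G[k][j] is the sensitivity of predicted state x[k+1] to input u[j].
    G = [[B * A ** (k - j) if j <= k else 0.0 for j in range(horizon)]
         for k in range(horizon)]
    # Upper bound on the gradient's Lipschitz constant (Frobenius norm bound).
    lipschitz = 2.0 * (sum(g * g for row in G for g in row) + rho)
    step = 1.0 / lipschitz
    u = [0.0] * horizon
    for _ in range(iters):
        err = [x - setpoint for x in predict(x0, u)]
        grad = [2.0 * sum(G[k][j] * err[k] for k in range(j, horizon))
                + 2.0 * rho * u[j] for j in range(horizon)]
        # Gradient step followed by projection onto the box [-U_MAX, U_MAX].
        u = [min(U_MAX, max(-U_MAX, uj - step * gj)) for uj, gj in zip(u, grad)]
    return u[0]


def simulate(prices, x0=0.8, steps_per_price=4):
    """Closed loop: one setpoint per price period, several LMPC steps per period."""
    x, states, inputs = x0, [], []
    for price in prices:
        sp = setpoint_policy(price)          # upper level (RL stand-in)
        for _ in range(steps_per_price):
            u = lmpc_step(x, sp)             # lower level (LMPC)
            x = A * x + B * u                # plant update (same model, no mismatch)
            states.append(x)
            inputs.append(u)
    return states, inputs


states, inputs = simulate([30.0, 30.0, 80.0, 80.0])
print("final state:", round(states[-1], 3))
print("largest |input|:", round(max(abs(u) for u in inputs), 3))
```

In this arrangement the upper level never touches the plant input directly, so the input bound is enforced by the lower-level optimisation regardless of which setpoint is requested. That separation is the property the abstract attributes to the control-informed architecture.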
