Quantum Reinforcement Learning-Based Two-Stage Unit Commitment Framework for Enhanced Power Systems Robustness
Xiang Wei, Ziqing Zhu, Linghua Zhu, Ze Hu, Xian Zhang, Guibin Wang, Siqi Bu, Ka Wing Chan
TL;DR
This work addresses robust unit commitment (UC) under high renewable uncertainty by proposing a two-stage, foresight-seeing UC framework that uses virtual power plants (VPPs) as flexible resources and applies quantum reinforcement learning (QRL) to improve computational efficiency. The problem is formulated as a quantum Markov decision process (q-MDP), with states encoded as density operators and transitions modeled as quantum channels, and is solved with parameterized quantum circuits (PQCs): a discrete quantum DQN (Q-DQN) makes the day-ahead decisions and a quantum SAC (Q-SAC) handles real-time control. In a case study on a modified IEEE RTS 24-bus system, QRL achieves faster convergence, lower constraint violations, and shorter runtime than DRL and traditional UC methods. These results suggest that quantum-enabled RL, combined with VPPs that bolster system ramping and balancing capability, can improve the reliability and responsiveness of power grids as renewable penetration and uncertainty grow.
Abstract
Unit commitment (UC) optimizes the start-up and shutdown schedules of generating units to meet load demand while minimizing costs. However, the increasing integration of renewable energy introduces uncertainty into real-time scheduling. Existing solutions are limited in both modeling and algorithmic design. At the modeling level, they fail to incorporate widely adopted virtual power plants (VPPs) as flexibility resources, missing the opportunity to proactively mitigate potential real-time imbalances or ramping constraints through foresight-seeing decision-making. At the algorithmic level, existing probabilistic optimization, multi-stage, and machine learning approaches face challenges in computational complexity and adaptability. To address these challenges, this study proposes a novel two-stage UC framework that incorporates foresight-seeing sequential decision-making in both day-ahead and real-time scheduling, leveraging VPPs as flexibility resources to proactively reserve capacity and ramping flexibility for renewable energy uncertainties over the upcoming several hours. In particular, we develop quantum reinforcement learning (QRL) algorithms that combine the foresight-seeing sequential decision-making and scalable computation of deep reinforcement learning (DRL) with the parallel, high-efficiency search capabilities of quantum computing. Experimental results demonstrate that the proposed QRL-based approach outperforms DRL and traditional UC methods in computational efficiency, real-time responsiveness, and solution quality.
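
To make the basic UC problem described above concrete, the following is a minimal toy sketch, not the paper's formulation. It assumes two hypothetical generating units, a three-hour load profile, linear fuel costs, and a fixed start-up cost; all numbers and the names `UNITS`, `LOAD`, `dispatch_cost`, and `solve_uc` are illustrative assumptions. It omits VPPs, ramping limits, minimum up/down times, and renewable uncertainty, and it uses brute-force enumeration in place of any learning-based or quantum solver.

```python
from itertools import product

# Hypothetical toy data (assumed for illustration, not taken from the paper).
UNITS = [
    {"pmin": 50, "pmax": 200, "cost": 20.0, "startup": 500.0},  # cheap base-load unit
    {"pmin": 20, "pmax": 100, "cost": 40.0, "startup": 100.0},  # expensive peaking unit
]
LOAD = [120, 250, 90]  # MW demand in each hour


def dispatch_cost(on, load):
    """Return the cheapest fuel cost to serve `load` with the committed units,
    or None if the committed capacity cannot match the load."""
    committed = [u for u, state in zip(UNITS, on) if state]
    pmin = sum(u["pmin"] for u in committed)
    pmax = sum(u["pmax"] for u in committed)
    if not (pmin <= load <= pmax):
        return None
    # Every committed unit runs at least at pmin; the rest is filled in merit order.
    cost = sum(u["pmin"] * u["cost"] for u in committed)
    remaining = load - pmin
    for u in sorted(committed, key=lambda u: u["cost"]):
        extra = min(remaining, u["pmax"] - u["pmin"])
        cost += extra * u["cost"]
        remaining -= extra
    return cost


def solve_uc(initial=(0, 0)):
    """Enumerate all on/off schedules and return (total_cost, schedule)."""
    n, hours = len(UNITS), len(LOAD)
    best_cost, best_schedule = float("inf"), None
    for flat in product((0, 1), repeat=n * hours):
        schedule = [flat[t * n:(t + 1) * n] for t in range(hours)]
        total, prev, feasible = 0.0, initial, True
        for on, load in zip(schedule, LOAD):
            fuel = dispatch_cost(on, load)
            if fuel is None:
                feasible = False
                break
            # Start-up cost applies when a unit switches from off to on.
            startup = sum(u["startup"] for u, p, s in zip(UNITS, prev, on) if s and not p)
            total += fuel + startup
            prev = on
        if feasible and total < best_cost:
            best_cost, best_schedule = total, schedule
    return best_cost, best_schedule


if __name__ == "__main__":
    cost, schedule = solve_uc()
    print(cost, schedule)
```

With this toy data, the cheapest schedule keeps the base-load unit on for all three hours and starts the peaking unit only for the 250 MW hour. The search space grows as 2^(units × hours), which illustrates why the scalability of the solver matters at realistic system sizes.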
