DRL-Based Medium-Term Planning of Renewable-Integrated Self-Scheduling Cascaded Hydropower to Guide Wholesale Market Participation
Xianbang Chen, Yikui Liu, Neng Fan, Lei Wu
TL;DR
This work tackles medium-term planning for self-scheduling cascaded hydropower facilities integrated with variable renewable energy sources (VS-CHPs), aiming to improve short-term wholesale market profits while meeting seasonal water-usage constraints. It introduces a deep reinforcement learning (DRL)-based framework that casts medium-term planning as a Markov decision process (MDP) and trains a policy via Soft Actor-Critic, integrating short-term context such as variable renewable energy source (VRES) forecasts and locational marginal prices (LMPs). Two key innovations, an expertise-based mechanism for seasonal adaptivity and a multi-parametric programming-based acceleration, enable seasonally aware planning and faster training. Real-world testing on PGE’s Pelton-Round Butte system shows up to 0.8% higher annual net revenue and substantial training-speed gains, highlighting the practical impact of context-informed, efficient DRL-based planning for VS-CHPs.
Abstract
For self-scheduling cascaded hydropower (S-CHP) facilities, medium-term planning is a critical step that coordinates water availability over the medium-term horizon and provides water-usage guidance for short-term operations in wholesale market participation. Typically, medium-term planning strategies (e.g., reservoir storage targets at the end of each short-term period) are determined by either optimization methods or rules of thumb. However, with the integration of variable renewable energy sources (VRESs), optimization-based methods suffer from deviations between the anticipated and actual reservoir storage, while rules of thumb can be financially conservative, thereby compromising short-term operating profitability in wholesale market participation. This paper presents a deep reinforcement learning (DRL)-based framework to derive medium-term planning policies for VRES-integrated S-CHPs (VS-CHPs). The framework leverages contextual information underlying individual short-term periods and trains planning policies on the short-term operating profits they induce in wholesale market participation. It offers two practical merits. First, its planning strategies consider both seasonal requirements on reservoir storage and the need for short-term operating profits. Second, it adopts a multi-parametric programming-based strategy to accelerate the expensive training process associated with multi-step short-term operations. Finally, the DRL-based framework is evaluated on a real-world VS-CHP, demonstrating its advantages over current practice.
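To make the planning loop described above more concrete, the following is a minimal toy sketch, not the authors' model. It assumes a single reservoir, one end-of-period storage target as the action, and a reward equal to price times released water as a stand-in for short-term operating profit. The class `ToyPlanningEnv`, the function `rollout`, and all numbers are hypothetical, and the Soft Actor-Critic training itself is not shown; two hand-written policies take its place.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[int, float]  # (period index, reservoir storage)


@dataclass
class ToyPlanningEnv:
    """Toy medium-term planning MDP; every number here is illustrative."""
    inflows: List[float]          # water arriving in each short-term period
    prices: List[float]           # stand-in for an average LMP per period
    s_min: float = 20.0           # lower storage bound (hypothetical)
    s_max: float = 100.0          # upper storage bound (hypothetical)
    s_init: float = 60.0          # storage at the start of the horizon
    energy_per_unit: float = 1.0  # energy produced per unit of water released

    def reset(self) -> State:
        self.t = 0
        self.storage = self.s_init
        return self.t, self.storage

    def step(self, target: float) -> Tuple[State, float, bool]:
        """Apply an end-of-period storage target and return the period reward."""
        available = self.storage + self.inflows[self.t]
        # Clip the target to the storage bounds, then to the water on hand.
        end_storage = min(available, max(self.s_min, min(target, self.s_max)))
        # Simplification: all water not stored is treated as generation.
        release = available - end_storage
        reward = self.prices[self.t] * self.energy_per_unit * release
        self.storage = end_storage
        self.t += 1
        done = self.t == len(self.inflows)
        return (self.t, self.storage), reward, done


def rollout(env: ToyPlanningEnv, policy: Callable[[int, float], float]) -> float:
    """Run one episode and return the summed reward."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        state, reward, done = env.step(policy(*state))
        total += reward
    return total


prices = [20.0, 50.0, 20.0, 50.0]
env = ToyPlanningEnv(inflows=[10.0, 10.0, 10.0, 10.0], prices=prices)

# Rule-of-thumb policy: always return to the initial storage level.
flat_revenue = rollout(env, lambda t, s: 60.0)
# Context-aware policy: store water in cheap periods, release in expensive ones.
aware_revenue = rollout(env, lambda t, s: 70.0 if prices[t] < 35.0 else 60.0)
print(flat_revenue, aware_revenue)
```

In this toy setting both policies end the horizon at the same storage level, yet the context-aware one earns 2000.0 against 1400.0 for the flat rule. This only illustrates why price and forecast context can matter when setting storage targets; it says nothing about the magnitude of gains on a real system.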
