Physical Informed-Inspired Deep Reinforcement Learning Based Bi-Level Programming for Microgrid Scheduling

Yang Li, Jiankai Gao, Yuanzheng Li, Chen Chen, Sen Li, Mohammad Shahidehpour, Zhe Chen

TL;DR

The paper addresses the coordination of operator and user interests in a microgrid (MG) under uncertainty by formulating a bi-level scheduling problem that exploits the flexibility of thermostatically controlled loads (TCLs) and demand response. It introduces a physics-informed DRL-based solution in which the upper level uses AutoML-PER-A3C to optimize pricing and the lower level uses DOCPLEX to minimize user costs, with lower-level decisions fed back to the upper level as states. The main contributions are the integration of AutoML with PER-enhanced A3C, a novel state-based handling of lower-level decisions, and a physics-informed reward that guides policy learning. The method is demonstrated on an IEEE 30-bus test system, where it shows better economic viability and computational efficiency than the other RL methods compared. This approach advances MG operation by coordinating multiple stakeholders through data-driven upper-level optimization combined with rigorous lower-level optimization.

Abstract

To coordinate the interests of the operator and users in a microgrid (MG) under complex and changeable operating conditions, this paper proposes a microgrid scheduling model that considers the thermal flexibility of thermostatically controlled loads (TCLs) and demand response, based on physical informed-inspired deep reinforcement learning (DRL) bi-level programming. To overcome the limitations of Karush-Kuhn-Tucker (KKT)-based methods on non-convex problems, a novel DRL-based solution method is proposed that handles the bi-level program through alternating iterations between the two levels. Specifically, an improved asynchronous advantage actor-critic (A3C) algorithm, called AutoML-PER-A3C, is designed to solve the upper-level problem; it combines A3C with automated machine learning (AutoML) and prioritized experience replay (PER) to improve the generalization performance of A3C. The DOCPLEX optimizer is adopted to solve the lower-level problem. In this solution process, AutoML automatically optimizes hyperparameters, and PER improves learning efficiency and quality by extracting the most valuable samples. The test results demonstrate that the presented approach reconciles the interests of multiple stakeholders in the MG by fully exploiting various flexibility resources. Furthermore, in terms of economic viability and computational efficiency, the proposed method outperforms the other advanced reinforcement learning methods it is compared against.
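
The abstract says that PER extracts the most valuable samples but does not say how they are selected. As a rough illustration only, the sketch below uses the common proportional form of prioritized replay, in which a transition's sampling probability grows with the magnitude of its temporal-difference (TD) error. The class name, the parameters `alpha`, `beta`, and `eps`, and the choice of proportional prioritization are all assumptions made for this example; they are not taken from the paper.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay; illustrative, not the paper's code."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha  # 0 gives uniform sampling, 1 gives fully proportional
        self.eps = eps      # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0        # next slot to overwrite once the buffer is full

    def add(self, transition, td_error):
        # Priority grows with the absolute TD error of the transition.
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority
        self.pos = (self.pos + 1) % self.capacity

    def probabilities(self):
        # Normalize priorities into a sampling distribution.
        total = sum(self.priorities)
        return [p / total for p in self.priorities]

    def sample(self, batch_size, beta=0.4, rng=random):
        probs = self.probabilities()
        n = len(self.data)
        idx = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias of non-uniform sampling;
        # they are scaled so that the largest weight in the batch equals 1.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.data[i] for i in idx], weights
```

In an actor-critic setting, the returned weights would typically scale each sample's loss term, and priorities would be refreshed with new TD errors after each update. Whether AutoML-PER-A3C does exactly this is not stated in the abstract.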

Paper Structure

This paper contains 38 sections, 39 equations, 15 figures, 4 tables, 1 algorithm.

Figures (15)

  • Figure 1: Schematic diagram of the studied microgrid.
  • Figure 2: DRL hyperparameter optimization based on the AutoML.
  • Figure 3: Workflow of the proposed AutoML-PER-A3C.
  • Figure 4: MG one-line graph using modified IEEE 30-bus system.
  • Figure 5: Change of the outdoor temperatures.
  • ...and 10 more figures