Optimal Control of Renewable Energy Communities subject to Network Peak Fees with Model Predictive Control and Reinforcement Learning Algorithms

Samy Aittahar; Adrien Bolland; Guillaume Derval; Damien Ernst

TL;DR

This paper addresses the problem of minimizing the global electricity bills of Renewable Energy Communities (RECs), including peak-grid costs, by controlling the communities' controllable assets within a centralized framework. It models REC operation as a Partially Observable Markov Decision Process (POMDP) and proposes two practical control approaches: Model Predictive Control (MPC) and Reinforcement Learning (RL), each with variants that account for peak costs and variants that ignore them. The authors formalize the optimal energy reallocation scheme that minimizes the ex-post REC bills, and implement MPC (MILP-based) and RL (PPO-based with recurrent networks) policies, comparing them to baseline self-consumption strategies on synthetic (REC-2) and real-data-based (REC-7) RECs. Results show that peak-aware policies substantially reduce total bills, with MPC achieving the best performance at the expense of computation time, while RL offers promising real-time control after training. The work highlights the potential of centralized REC control and identifies scalability challenges, suggesting GPU-accelerated solvers, surrogate approximations for reallocation, and multi-agent RL as future directions.

Abstract

In this paper, we propose an optimal control framework for renewable energy communities (RECs) equipped with controllable assets. Such RECs allow their members to exchange production surplus through an internal market. The objective is to control their assets in order to minimise the sum of individual electricity bills. These bills account for the electricity exchanged through the REC and with the retailers. Typically, for large companies, another important part of the bills is the cost related to power peaks; in our framework, these peaks are determined from the energy exchanges with the retailers. We compare rule-based control strategies with the following two control algorithms. The first is derived from model predictive control techniques, and the second is built with reinforcement learning techniques. We also compare variants of these algorithms that neglect the peak power costs. Results confirm that using policies that account for the power peaks leads to a significantly lower sum of electricity bills, and thus to better control strategies, at the cost of higher computation time. Furthermore, policies trained with reinforcement learning approaches appear promising for real-time control of the communities, where model predictive control policies may be computationally expensive in practice. These findings encourage pursuing efforts toward the development of scalable control algorithms, operating from a centralised standpoint, for renewable energy communities equipped with controllable assets.
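To make the role of peak costs concrete, the following is a minimal Python sketch of a bill that combines energy costs with a peak fee. It is not the paper's formulation: the function names (`member_bill`, `community_bill`), the flat prices, and the choice to charge the peak only on the highest average import power over the billing period are all illustrative assumptions, and exchanges inside the REC are ignored.

```python
from typing import Sequence


def member_bill(
    imports_kwh: Sequence[float],   # energy bought from the retailer per market period
    exports_kwh: Sequence[float],   # energy sold to the retailer per market period
    import_price: float,            # assumed flat price per kWh bought
    export_price: float,            # assumed flat price per kWh sold
    peak_fee_per_kw: float,         # assumed fee per kW of peak import power
    period_hours: float,            # duration of one market period, in hours
) -> float:
    """Hypothetical bill: energy cost minus export revenue plus a peak fee."""
    energy_cost = sum(imports_kwh) * import_price
    export_revenue = sum(exports_kwh) * export_price
    # Peak power is taken as the largest average import power over any period.
    peak_kw = max(imports_kwh, default=0.0) / period_hours
    return energy_cost - export_revenue + peak_fee_per_kw * peak_kw


def community_bill(members: Sequence[dict]) -> float:
    """Sum of the individual bills, the quantity a central controller would minimise."""
    return sum(member_bill(**m) for m in members)


# Two import profiles with the same total energy (8 kWh) but different peaks.
common = dict(
    exports_kwh=[0.0, 0.0, 0.0, 0.0],
    import_price=0.2,
    export_price=0.05,
    peak_fee_per_kw=1.0,
    period_hours=0.25,
)
flat_bill = member_bill(imports_kwh=[2.0, 2.0, 2.0, 2.0], **common)
peaky_bill = member_bill(imports_kwh=[0.0, 0.0, 0.0, 8.0], **common)
total_bill = community_bill([
    dict(imports_kwh=[2.0, 2.0, 2.0, 2.0], **common),
    dict(imports_kwh=[0.0, 0.0, 0.0, 8.0], **common),
])
```

Under these assumptions both profiles pay the same energy cost, but the concentrated profile pays a larger peak fee. A policy that ignores the peak term cannot tell the two apart, which is the intuition behind comparing peak-aware and peak-agnostic variants.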

Paper Structure (47 sections, 36 equations, 10 figures, 10 tables)

Introduction
Related work
Decision-making issues for RECs.
MPC and RL for microgrids.
MPC and RL for REC.
Optimal REC Control Problem
REC Structure and Management
Optimal reallocation scheme
Decision process associated with RECs
State space
Exogenous space
Action space
Transition dynamics
Cost function
Optimal control of RECs
...and 32 more sections

Figures (10)

Figure 1: Renewable energy community timeline. During a billing period, each member consumes and produces electricity in real time, both on their own and through their controllable assets; in the timeline, black ticks mark the time steps at which control actions are taken. At the end of a billing period, the ECM computes the (optimal) reallocation of the REC production for each market period. This reallocation creates new meter readings, which the ECM passes on to the DSO and the retailers to compute the global REC bill.
Figure 2: Illustration of how MPC policies compute the next action to apply in the REC dynamical system from a given state and a sequence of exogenous variables. The latter is composed of the current exogenous variable and values for the future ones (up to the policy horizon $t+K$). Colors differentiate the billing periods that the MPC policies consider during their optimisation process.
Figure 3: Expected returns of MPC policies (averaged over $16$ runs) in REC-2, given policy horizon and foresight efficiency, along with expected returns of RL and baseline policies. Recall that $\alpha$ is the foresight efficiency defined in Section \ref{mpcpolicy}, with $\alpha=1$ corresponding to perfect foresight.
Figure 4: Expected returns of MPC policies (averaged over $16$ runs) in REC-7, given policy horizon and foresight efficiency, along with expected returns of RL and baseline policies.
Figure 5: Consumption/production profiles of members in REC-2. Member M1 does not produce electricity and member M2 does not consume electricity (through their respective non-controllable assets).
...and 5 more figures
