Real-world validation of safe reinforcement learning, model predictive control and decision tree-based home energy management systems

Julian Ruddick; Glenn Ceusters; Gilles Van Kriekinge; Evgenii Genov; Cedric De Cauwer; Thierry Coosemans; Maarten Messagie

TL;DR

The TreeC method, which does require building a realistic simulation for training, exhibits the safest operational performance, exceeding the grid limit by only 27.1 Wh compared to 593.9 Wh for reinforcement learning.

Abstract

Recent advancements in machine learning-based energy management approaches, specifically reinforcement learning with a safety layer (OptLayerPolicy) and a metaheuristic algorithm generating a decision tree control policy (TreeC), have shown promise. However, their effectiveness has only been demonstrated in computer simulations. This paper presents the real-world validation of these methods, comparing them against model predictive control and a simple rule-based control benchmark. The experiments were conducted on the electrical installation of four reproductions of residential houses, each with its own battery, photovoltaic installation and dynamic load system emulating a non-controllable electrical load and a controllable electric vehicle charger. The results show that the simple rules, TreeC, and model predictive control-based methods achieved similar costs, with a difference of only 0.6%. The reinforcement learning-based method, still in its training phase, obtained a cost 25.5% higher than the other methods. Additional simulations show that the costs can be further reduced by using a more representative training dataset for TreeC and by addressing errors in the model predictive control implementation caused by its reliance on accurate data from various sources. The OptLayerPolicy safety layer allows safe online training of a reinforcement learning agent in the real world, given an accurate constraint function formulation. The proposed safety layer method remains error-prone; nonetheless, it was found to be beneficial for all investigated methods. The TreeC method, which does require building a realistic simulation for training, exhibits the safest operational performance, exceeding the grid limit by only 27.1 Wh, compared to 593.9 Wh for reinforcement learning.

Paper Structure (32 sections, 14 equations, 14 figures, 8 tables)

Introduction
Contribution and outline
Related work
Method
Hardware setup
Experimental setup
Consumption profile
Electric vehicle and driver charging behaviour
Tariff
House switching
Enforced charging behaviour
Safety layer
Simulation
Energy management systems
Training data
...and 17 more sections

Figures (14)

Figure 1: Schematic of the experimental setup per house showing, from top to bottom: the PV installation with its peak power, the BESS with its capacity, the inverter with its maximum power, the grid connection with its phase specification and maximum power, the dynamic loads with their maximum power, the emulated house consumption with its average daily consumption and the EV charger with its maximum charge power.
Figure 2: Photograph of the experimental setup, with G being the grid connection equipped with the official digital meter of the local distribution system operator, B the BESS, S the solar PV installation inverters (solar panels mounted on the roof) and L the connectors to the dynamic loads (dimmers in a separate cabinet and electrical loads stationed outdoors). The numbers indicate the house numbers.
Figure 3: Household consumption profile: 10-second SFH 19 data [schlemminger_dataset_2022] without PV, BESS, EV charging or heat pump; annual consumption: 3425.75 kWh/year, Std: 445.85 W, max: 9229 W.
Figure 4: The TreeC EMS obtained after training and pruning for house 1.
Figure 5: Representation of the experiment day starting on the 30th of May at 15:00. The upper plot shows the power profiles of the different assets of house 3 controlled by the RBC EMS. The grid power is represented in grey when the safety layer is not activated and in black when it is. The bottom plot shows the SOC of the BESS and EV. The enforced charging behaviour of the BESS is executed at the end of the experiment day to reach the 100% SOC goal at 15:00.
...and 9 more figures
