TreeC: a method to generate interpretable energy management systems using a metaheuristic algorithm

Julian Ruddick, Luis Ramirez Camargo, Muhammad Andy Putratama, Maarten Messagie, Thierry Coosemans

TL;DR

TreeC is a CMA-ES-based method that directly optimizes interpretable decision-tree energy management systems (EMS) learned from historical data. By encoding a complete binary tree as a vector of $3N+1$ variables and pruning unused leaves, TreeC yields transparent control policies whose performance is competitive with MPC and RL across two benchmark cases. The approach emphasizes interpretability and reproducibility, demonstrating that simple, explainable trees can closely match or exceed the performance of more complex black-box or non-learning baselines in energy grid and building heating scenarios. Limitations include the lack of online learning and tree-size constraints; proposed improvements focus on richer leaf outputs, mixed discrete-continuous optimization, and broader real-world validation.
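The flat encoding and the pruning step mentioned above can be illustrated with a minimal sketch. It assumes one plausible reading of the $3N+1$ count: $N$ split nodes with two values each (feature index and threshold) plus $N+1$ leaves with one action value each. The vector layout, the heap-style node ordering, the clamping of the feature index and all names (`decode`, `act`, `prune`) are hypothetical choices made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    action: float
    visits: int = 0  # how often this leaf was reached during evaluation


@dataclass
class Split:
    feature: int
    threshold: float
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def decode(params: Sequence[float], n_splits: int, n_features: int) -> Node:
    """Build a complete binary tree from a flat vector of 3N+1 numbers.

    Assumed layout (hypothetical):
    [feature_0, thr_0, ..., feature_{N-1}, thr_{N-1}, action_0, ..., action_N]
    """
    if len(params) != 3 * n_splits + 1:
        raise ValueError("expected 3N+1 parameters")
    leaf_actions = params[2 * n_splits:]

    def build(i: int) -> Node:
        # Heap-style indexing: nodes 0..N-1 are splits, nodes N..2N are leaves.
        if i >= n_splits:
            return Leaf(action=leaf_actions[i - n_splits])
        # Clamp the continuous value to a valid feature index (assumption).
        feature = min(n_features - 1, max(0, int(params[2 * i])))
        return Split(feature, params[2 * i + 1], build(2 * i + 1), build(2 * i + 2))

    return build(0)


def act(node: Node, obs: Sequence[float]) -> float:
    """Walk the tree for one observation and count the leaf visit."""
    while isinstance(node, Split):
        node = node.left if obs[node.feature] <= node.threshold else node.right
    node.visits += 1
    return node.action


def prune(node: Node) -> Node:
    """Remove leaves that were never visited, lifting their sibling up."""
    if isinstance(node, Leaf):
        return node
    left, right = prune(node.left), prune(node.right)
    if isinstance(left, Leaf) and left.visits == 0:
        return right
    if isinstance(right, Leaf) and right.visits == 0:
        return left
    return Split(node.feature, node.threshold, left, right)
```

In this sketch a tree is decoded once, `act` is called for every time step of the evaluation data, and `prune` is applied afterwards, so that splits leading to a never-visited leaf collapse into their remaining child.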

Abstract

Energy management systems (EMS) have traditionally been implemented using rule-based control (RBC) and model predictive control (MPC) methods. However, recent research has explored the use of reinforcement learning (RL) as a promising alternative. This paper introduces TreeC, a machine learning method that uses the covariance matrix adaptation evolution strategy metaheuristic algorithm to generate an interpretable EMS modeled as a decision tree. Unlike RBC and MPC approaches, TreeC learns the decision strategy of the EMS based on historical data, adapting the control model to the controlled energy grid. The decision strategy is represented as a decision tree, providing interpretability compared to RL methods that often rely on black-box models like neural networks. TreeC is evaluated against MPC with perfect forecast and RL EMSs in two case studies taken from literature: an electric grid case and a household heating case. In the electric grid case, TreeC achieves an average energy loss and constraint violation score of 19.2, which is close to the MPC and RL EMSs, which achieve scores of 14.4 and 16.2 respectively. All three methods control the electric grid well, especially when compared to the random EMS, which obtains an average score of 12 875. In the household heating case, TreeC performs similarly to MPC on the adjusted and averaged electricity cost and total discomfort (0.033 EUR/m$^2$ and 0.42 Kh for TreeC compared to 0.037 EUR/m$^2$ and 2.91 Kh for MPC), while outperforming RL (0.266 EUR/m$^2$ and 24.41 Kh).
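The idea of learning a control rule from historical data with an evolution strategy can be sketched as follows. This is not CMA-ES: it is a plain Gaussian evolution strategy with a fixed step-size decay and no covariance adaptation, used here only as a simplified stand-in. The data in `HISTORY`, the cost function `toy_cost` and the function `simple_es` are all hypothetical and chosen for illustration; in the paper the score comes from evaluating the EMS on the case study, not from a squared-error fit.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical history: (outdoor temperature, ideal heating power) pairs.
HISTORY = [(-5.0, 1.0), (0.0, 1.0), (5.0, 1.0),
           (15.0, 0.0), (20.0, 0.0), (25.0, 0.0)]


def toy_cost(x: Sequence[float]) -> float:
    """Score a one-split rule: x = [threshold, action_if_cold, action_if_warm]."""
    threshold, cold_action, warm_action = x
    total = 0.0
    for temp, ideal in HISTORY:
        action = cold_action if temp <= threshold else warm_action
        total += (action - ideal) ** 2
    return total


def simple_es(cost: Callable[[Sequence[float]], float],
              x0: Sequence[float],
              sigma: float = 0.5,
              popsize: int = 12,
              elite: int = 4,
              generations: int = 60,
              seed: int = 0) -> Tuple[List[float], float]:
    """Minimise `cost` with a basic (mu, lambda) evolution strategy."""
    rng = random.Random(seed)
    mean = list(x0)
    best_x, best_cost = list(x0), cost(x0)
    for _ in range(generations):
        # Sample candidates around the current mean.
        candidates = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                      for _ in range(popsize)]
        ranked = sorted(candidates, key=cost)
        # Move the mean to the average of the best candidates.
        top = ranked[:elite]
        mean = [sum(c[i] for c in top) / elite for i in range(len(mean))]
        # Keep the best candidate seen so far.
        top_cost = cost(ranked[0])
        if top_cost < best_cost:
            best_x, best_cost = ranked[0], top_cost
        sigma *= 0.95  # fixed decay instead of CMA-ES step-size adaptation
    return best_x, best_cost
```

Calling `simple_es(toy_cost, [10.0, 0.0, 0.0])` returns the best parameter vector found together with its cost. A full CMA-ES additionally adapts the covariance matrix of the sampling distribution, which this sketch omits.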

Paper Structure (17 sections, 6 equations, 8 figures, 3 tables)

Figures (8)

  • Figure 1: Illustration of TreeC for a household heating case. Fig. \ref{fig:encoding} shows a decision tree energy management system and how it is encoded as a list of numbers, Fig. \ref{fig:optimisation_proc} shows the optimisation process of the covariance matrix adaptation evolution strategy algorithm and Fig. \ref{fig:pruning} shows an example of how a tree is pruned when a leaf node is not visited.
  • Figure 2: Summary of the optimisation/training, pruning and validation method. The decision trees represent different evaluated candidates, with the score they obtained beneath them, and use the same color code for variables as in Fig. \ref{fig:method_viz}. The decision trees with yellow scores are the best performing ones and are taken to the next step. The optimisation is executed 5 times in parallel as described in Section \ref{sec:par_train}.
  • Figure 3: ANM6easy case visualisation for a typical working hour period using the EMS presented in Fig. \ref{fig:anm_trees}. All the assets are shown with their respective real and reactive power as well as the constraints on the grid lines and voltage magnitude of the busbars [henry_gym-anm_2021-1].
  • Figure 4: Swarm plot representation of the results obtained for the ANM6easy case, with each point being the validation of an independent EMS and the horizontal lines being the validation of the MPC EMSs. The TreeC and SAC EMSs both obtain results close to the MPC with perfect forecast. Disregarding its outliers, SAC performs slightly better than TreeC. PPO performs worse than the two other methods. Three outliers for PPO with a score higher than 130 are not shown in the plot.
  • Figure 5: TreeC EMS which obtained the best score in Fig. \ref{fig:ANM_all_box}. Each tree controls the real power or reactive power of each controllable asset. To keep them simple for visualisation, nodes that were visited 10 times or less during evaluation were removed.
  • ...and 3 more figures