Evaluating the Impact of Data Availability on Machine Learning-augmented MPC for a Building Energy Management System

Jens Engel, Thomas Schmitt, Tobias Rodemann, Jürgen Adamy

TL;DR

The paper addresses the challenge of deploying MPC-based energy management in buildings when an accurate model is not available. It augments a gray-box, discrete-time state-space model of the EMS with data-driven residual estimators and evaluates them in a software-in-the-loop setup using a physics-based digital twin. The MPC operates over a horizon of $N_p=48$ steps with $T_s=0.5\,\mathrm{h}$ and a multi-objective cost $J_{\mathrm{opt}}$. Two regressors predict the exogenous residuals $\epsilon(k)$ for building and server zones using features such as $P_{\mathrm{dem}}(k)$, $\vartheta_{\mathrm{air}}(k)$, time of day (ToD), day of week (DoW), and server loads; XGBoost performs best for the building zones and a linear regressor for the server zones. The findings show that acceptable estimator and controller performance can be achieved with limited data, and that leveraging historical data through incremental retraining further improves efficacy. These results inform practical data strategies for real-world deployment of residual-augmented MPC in building energy management.

Abstract

A major challenge in the development of Model Predictive Control (MPC)-based energy management systems (EMSs) for buildings is the availability of an accurate model. One approach to address this is to augment an existing gray-box model with data-driven residual estimators. The efficacy of such estimators, and hence the performance of the EMS, relies on the availability of sufficient and suitable training data. In this work, we evaluate how different data availability scenarios affect estimator and controller performance. To do this, we perform software-in-the-loop (SiL) simulation with a physics-based digital twin using real measurement data. Simulation results show that acceptable estimation and control performance can already be achieved with limited available data, and we confirm that leveraging historical data for pretraining boosts efficacy.
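
The idea of augmenting a gray-box model with a residual estimator can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hypothetical scalar model $x(k+1) = a\,x(k) + b\,u(k)$, a single made-up feature, and a one-feature least-squares fit standing in for the regressors named in the summary. All names and numeric values are illustrative assumptions.

```python
# Minimal sketch of a residual-augmented one-step prediction.
# Everything here (model, coefficients, data) is a hypothetical illustration.

def nominal_step(x, u, a=0.9, b=0.1):
    """Gray-box prediction x(k+1) = a*x(k) + b*u(k) with assumed a, b."""
    return a * x + b * u


def plant_step(x, u, feature):
    """Stand-in for the digital twin: nominal model plus an unmodeled term."""
    return nominal_step(x, u) + 0.5 * feature - 0.2


def fit_linear_residual(features, residuals):
    """Ordinary least squares for residual ~ w * feature + c (one feature)."""
    n = len(features)
    mean_f = sum(features) / n
    mean_r = sum(residuals) / n
    var_f = sum((f - mean_f) ** 2 for f in features)
    cov = sum((f - mean_f) * (r - mean_r) for f, r in zip(features, residuals))
    w = cov / var_f
    c = mean_r - w * mean_f
    return w, c


def augmented_step(x, u, feature, w, c):
    """Nominal prediction corrected by the estimated residual."""
    return nominal_step(x, u) + w * feature + c


# Synthetic training data: states, inputs, and one feature per time step.
states = [20.0, 20.5, 21.0, 21.5, 22.0]
inputs = [1.0, 0.0, 2.0, 1.0, 3.0]
features = [0.0, 1.0, 2.0, 3.0, 4.0]

# The residual is the gap between the "measured" next state and the nominal model.
residuals = [
    plant_step(x, u, f) - nominal_step(x, u)
    for x, u, f in zip(states, inputs, features)
]

w, c = fit_linear_residual(features, residuals)
prediction = augmented_step(21.0, 1.5, 2.5, w, c)
```

With less training data, the fitted `w` and `c` would generally be less reliable, which is the kind of sensitivity the data availability scenarios in the paper are designed to probe.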

Paper Structure

This paper contains 7 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: WMARE over the course of the year 2023 in the different scenarios.
  • Figure 2: Letter-value plot showing the distribution of the WARE compared between the different scenarios. The central quantile represents $50\,\%$ of the data, which are then halved at every next quantile, i.e. $25\,\%$, $12.5\,\%$, and so forth.
  • Figure 3: RMSE of the temperature tracking over the course of the year 2023 in the different scenarios.
  • Figure 4: WMRE over the course of the year 2023 in the different scenarios.