A Receding Horizon Reinforcement Learning Framework for Campus Chiller Energy Management - A case study from an Australian University

Laura Musgrave, Arnab Bhattacharjee, Tapan Kumar Saha

TL;DR

This study tackles campus chiller energy management by formulating a receding-horizon reinforcement learning problem to optimally schedule multiple, heterogeneous chillers. It integrates a PPO agent with 24-hour ahead planning and a transformer-based TimeXer forecaster to predict building cooling demand, while using a prioritized reward to enforce hard physical constraints. A physics-informed chiller power model and PLR relationships drive energy minimization over the horizon. Experimental results on a nine-building Australian campus show up to 28% electricity savings over a rule-based baseline, with improved COP and constraint satisfaction, demonstrating the practical potential of data-driven, horizon-aware HVAC control while highlighting limitations related to pipe losses, reward automation, and online retraining.

Abstract

This work presents a case study of optimal energy management of a large Heating, Ventilation and Air Conditioning (HVAC) system within a university campus in Australia using Reinforcement Learning (RL). The HVAC system supplies nine university buildings and has an annual average electricity consumption of $\sim2$ GWh. Updated chiller Coefficient of Performance (COP) curves are identified, and a predictive building cooling demand model is developed using historical data from the HVAC system. Based on these inputs, an RL model based on Proximal Policy Optimization (PPO) is trained to optimally schedule the chillers in a receding horizon control framework, with a priority reward function for constraint satisfaction. Compared to the traditional reactive, rule-based method of controlling the HVAC system, the proposed controller saves up to 28% of the electricity consumed, solely by controlling the mass flow rates of the chiller banks and with minimal constraint violations.
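The abstract names two ingredients that are easier to picture in code: a chiller power model driven by COP curves, and a priority reward that puts constraint satisfaction ahead of energy savings. The following is a minimal sketch under stated assumptions, not the authors' implementation. The quadratic COP-versus-part-load-ratio (PLR) form, the coefficient values, the `Chiller` class, the `priority_reward` function, and the flat penalty are all hypothetical. The sketch also allocates cooling load directly to each chiller, whereas the paper's controller acts on mass flow rates.

```python
from dataclasses import dataclass


@dataclass
class Chiller:
    rated_kw: float     # hypothetical rated cooling capacity (kW thermal)
    cop_coeffs: tuple   # hypothetical quadratic COP(PLR) coefficients (a, b, c)

    def cop(self, plr: float) -> float:
        """Assumed COP curve: COP = a + b*PLR + c*PLR^2."""
        a, b, c = self.cop_coeffs
        return a + b * plr + c * plr ** 2

    def power_kw(self, load_kw: float) -> float:
        """Electrical power needed to deliver load_kw of cooling."""
        if load_kw <= 0:
            return 0.0  # chiller is off
        plr = load_kw / self.rated_kw  # part-load ratio
        return load_kw / self.cop(plr)


def priority_reward(chillers, loads_kw, demand_kw, penalty=1000.0):
    """Two-tier reward: constraints first, energy second (an assumed form).

    Any capacity violation or unmet demand returns a flat penalty, so the
    energy term only matters once the allocation is feasible.
    """
    over_capacity = any(q > c.rated_kw for c, q in zip(chillers, loads_kw))
    unmet_demand = sum(loads_kw) < demand_kw
    if over_capacity or unmet_demand:
        return -penalty
    return -sum(c.power_kw(q) for c, q in zip(chillers, loads_kw))


# Example: one chiller with made-up coefficients, so COP(1.0) = 5.0.
chiller = Chiller(rated_kw=1000.0, cop_coeffs=(2.0, 6.0, -3.0))
feasible = priority_reward([chiller], [1000.0], demand_kw=1000.0)
infeasible = priority_reward([chiller], [500.0], demand_kw=600.0)
```

In a receding horizon setting, a reward of this kind would be evaluated at each step of the 24-hour plan against the forecast demand, with only the first action applied before replanning.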

Paper Structure

This paper contains 28 sections, 14 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: A pictorial representation of a campus chiller network supplying multiple buildings.
  • Figure 2: Building load forecast with TimeXer.
  • Figure 3: Training results for the RL agent.
  • Figure 4: Summary of results from the controller evaluation.