Decision-Focused Learning for Neural Network-Constrained HVAC Scheduling
Pietro Favaro, Jean-François Toubeau, François Vallée, Yury Dvorkin
TL;DR
This work tackles day-ahead HVAC scheduling by embedding a neural network (NN) model of building thermal dynamics into a mixed-integer optimization framework. It introduces stochastic-smoothing decision-focused learning (SS-DFL), which trains the NN parameters directly for decision quality and circumvents the non-differentiability of the MILP/MIQP through score-function gradients. The approach uses an MIQP formulation with adaptive tightness for the ReLU activations and interval analysis, enabling stable and efficient training. A five-zone Denver office case study shows that SS-DFL outperforms traditional identify-then-optimize and relaxed DFL methods in ex-post costs and thermal comfort, with notable gains in solving time and grid-services performance.
Abstract
Heating, Ventilation, and Air Conditioning (HVAC) is a major electricity end-use with substantial potential for providing grid services, such as demand response. Harnessing this flexibility requires accurate modeling of the thermal dynamics of buildings. This is a difficult task: nonlinear heat transfer and recurring daily cycles make historical data highly correlated and insufficient to generalize to new weather, occupancy, and control scenarios. This paper presents an HVAC management system formulated as a Mixed-Integer Quadratic Program (MIQP), where Neural Network (NN) models of thermal dynamics are embedded as exact mixed-integer linear constraints. Unlike traditional training approaches that minimize prediction errors, we employ Decision-Focused Learning (DFL) to learn the NN parameters with the objective of directly improving the HVAC cost performance. However, the discrete nature of the MIQP hinders DFL, as it leads to undefined and discontinuous gradients, thus impeding standard gradient-based training. We leverage Stochastic Smoothing (SS) to enable efficient gradient computation without the need to differentiate the MIQP. Experiments on a realistic five-zone building using a high-fidelity simulator demonstrate that the proposed SS-DFL approach outperforms conventional identify-then-optimize (i.e., the thermal dynamics model is identified on historical data and then used in optimization) and relaxed DFL methods in both cost savings and grid service performance, highlighting its potential for scalable, grid-aware building control.
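To make the stochastic-smoothing idea concrete, the following minimal Python sketch estimates a score-function gradient through a discrete decision step without differentiating the solver. It is not the paper's implementation. The scalar parameter `theta`, the candidate set `CANDIDATES`, the value `THETA_TRUE`, the Gaussian perturbation, the mean-cost baseline, and the functions `solve_schedule`, `ex_post_cost`, `smoothed_grad`, and `train` are all assumptions chosen for illustration, with `solve_schedule` standing in for the MIQP.

```python
import numpy as np

# Hypothetical toy problem: discrete setpoints and a "true" system parameter.
CANDIDATES = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
THETA_TRUE = 3.0


def solve_schedule(theta: float) -> float:
    """Stand-in for the MIQP: pick the discrete setpoint that looks best under the model theta."""
    return float(CANDIDATES[np.argmin((CANDIDATES - theta) ** 2)])


def ex_post_cost(theta: float) -> float:
    """Cost of the chosen decision on the true system; piecewise constant in theta."""
    return (solve_schedule(theta) - THETA_TRUE) ** 2


def smoothed_grad(theta: float, sigma: float = 0.5, n_samples: int = 256, rng=None) -> float:
    """Score-function estimate of d/dtheta E[cost(theta + sigma * eps)], eps ~ N(0, 1)."""
    rng = np.random.default_rng(0) if rng is None else rng
    eps = rng.standard_normal(n_samples)
    costs = np.array([ex_post_cost(theta + sigma * e) for e in eps])
    baseline = costs.mean()  # assumed variance-reduction baseline
    return float(np.mean((costs - baseline) * eps) / sigma)


def train(theta: float, steps: int = 50, lr: float = 0.1, seed: int = 0) -> float:
    """Plain gradient descent on the smoothed objective."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        theta -= lr * smoothed_grad(theta, rng=rng)
    return theta


if __name__ == "__main__":
    theta0 = 0.2
    theta_final = train(theta0)
    print("initial cost:", ex_post_cost(theta0))
    print("final theta:", theta_final, "final cost:", ex_post_cost(theta_final))
```

The ex-post cost here is piecewise constant in `theta`, so its ordinary gradient is zero almost everywhere and undefined at the jumps. Averaging the cost over random perturbations yields a smooth surrogate whose gradient needs only repeated solver calls, which is the property the abstract relies on. The noise scale `sigma` and the sample count trade bias against variance and would need tuning in any realistic setting.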
