BuilDa: A Thermal Building Data Generation Framework for Transfer Learning
Thomas Krug, Fabian Raisch, Dominik Aimer, Markus Wirnsberger, Ferdinand Sigg, Benjamin Schäfer, Benjamin Tischler
TL;DR
The paper addresses the data bottleneck in transfer learning (TL) for building thermal dynamics by introducing BuilDa, a framework that generates large-scale, high-fidelity synthetic time-series data without requiring advanced building-simulation expertise. It uses a validated single-zone Modelica model exported as an FMU and simulated in Python, with a converter layer and configurable parameters to produce diverse data, including weather, occupancy, and control variations. The framework supports parallel data generation and flexible metadata, making it suitable for TL research and building-similarity studies. A TL demonstration shows that pretraining on multiple source configurations and then fine-tuning on a target yields better predictive accuracy than training from scratch, highlighting the practical value of accessible synthetic data for transfer learning in building physics. The work sets the stage for broader TL applications, including generalized models, reinforcement learning, and multi-zone extensions.
Abstract
Transfer learning (TL) can improve data-driven modeling of building thermal dynamics. As a result, new TL research directions are emerging in the field, such as selecting the right source model for TL. However, these research directions require massive amounts of thermal building data, which are currently lacking. Neither public datasets nor existing data generators meet the needs of TL research in terms of data quality and quantity. Moreover, existing data generation approaches typically require expert knowledge in building simulation. We present BuilDa, a thermal building data generation framework for producing synthetic data of adequate quality and quantity for TL research. The framework does not require in-depth building simulation knowledge to generate large volumes of data. BuilDa uses a single-zone Modelica model that is exported as a Functional Mock-up Unit (FMU) and simulated in Python. We demonstrate BuilDa by generating data and using it to pretrain and fine-tune TL models.
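To make the idea of configurable data generation concrete, the following is a minimal sketch of sweeping building parameters and collecting one labeled time series per configuration. It is not BuilDa's model or API: a toy first-order resistance-capacitance (1R1C) zone integrated with explicit Euler stands in for the Modelica FMU, and every name, parameter value, and the sinusoidal outdoor temperature profile is a hypothetical choice made for illustration.

```python
import itertools
import math


def simulate_zone(resistance, capacitance, heater_power,
                  hours=48, dt=3600.0, t_init=20.0):
    """Toy 1R1C zone (assumed stand-in for the FMU), explicit Euler.

    Model: C * dT/dt = (T_out - T) / R + Q_heat
    resistance in K/W, capacitance in J/K, heater_power in W, dt in s.
    """
    temps = []
    t_zone = t_init
    for step in range(hours):
        # Hypothetical outdoor temperature: 5 degC mean, 5 K daily swing.
        t_out = 5.0 + 5.0 * math.sin(2.0 * math.pi * step / 24.0)
        d_temp = ((t_out - t_zone) / resistance + heater_power) / capacitance
        t_zone += dt * d_temp
        temps.append(t_zone)
    return temps


def generate_dataset(resistances, capacitances, heater_powers):
    """Run one simulation per parameter combination and attach metadata."""
    dataset = []
    for r, c, q in itertools.product(resistances, capacitances, heater_powers):
        dataset.append({
            "metadata": {"R_K_per_W": r, "C_J_per_K": c, "Q_W": q},
            "zone_temp_degC": simulate_zone(r, c, q),
        })
    return dataset


# Illustrative parameter grid: 2 x 2 x 2 = 8 building configurations.
dataset = generate_dataset(
    resistances=[0.005, 0.01],
    capacitances=[1.0e7, 2.0e7],
    heater_powers=[0.0, 1000.0],
)
```

Because each configuration is simulated independently, such a sweep can be distributed across worker processes. Storing the parameters as metadata next to each series keeps every sample traceable to the configuration that produced it, which is the kind of information a source-selection or building-similarity study would need.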
