Towards the Detection of Building Occupancy with Synthetic Environmental Data
Manuel Weber, Christoph Doblander, Peter Mandl
TL;DR
The paper tackles the challenge of obtaining accurate room-level occupancy labels for deep learning by leveraging synthetic data. It introduces a two-simulation framework to generate occupancy and CO$_2$ dynamics data, pre-trains a base model on this synthetic data, and transfers the learned knowledge to real rooms via weight initialization and fine-tuning. The key contributions are the occupancy and CO$_2$ simulation methods and a proof of concept showing that synthetic pre-training reduces the real-data requirement and improves robustness for occupancy detection. Empirical results demonstrate that the transfer model outperforms non-transfer baselines, enabling data-efficient, scalable deployment of CO$_2$-based occupancy detection in buildings, with potential generalization to multiple room types.
Abstract
Information about room-level occupancy is crucial to many building-related tasks, such as building automation or energy performance simulation. The current occupancy detection literature focuses on data-driven methods, but is mostly based on small case studies with few rooms. The need to collect room-specific data for each room of interest impedes the practical applicability of machine learning, especially of data-intensive deep learning approaches. To derive accurate predictions from less data, we suggest knowledge transfer from synthetic data. In this paper, we conduct an experiment with data from a CO$_2$ sensor in an office room and additional synthetic data obtained from a simulation. Our contribution includes (a) a simulation method for CO$_2$ dynamics under randomized occupant behavior, (b) a proof of concept for knowledge transfer from simulated CO$_2$ data, and (c) an outline of future research implications. Our results indicate that the transfer approach can effectively reduce the amount of data required for model training.
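To make the idea of simulating CO$_2$ dynamics under randomized occupant behavior more concrete, the following is a minimal sketch. It is not the simulation method of the paper. It assumes a generic well-mixed single-zone mass balance, $V \, dC/dt = n G + Q (C_{out} - C)$, integrated with forward Euler steps. The function names (`simulate_co2`, `random_occupancy`) and all parameter values (room volume, air change rate, per-person CO$_2$ generation, outdoor concentration, switching probability) are hypothetical choices made for illustration.

```python
import random


def random_occupancy(steps, max_people=3, p_change=0.02, seed=0):
    """Hypothetical occupant model: at each step the head count is
    redrawn uniformly from 0..max_people with probability p_change."""
    rng = random.Random(seed)
    n = 0
    schedule = []
    for _ in range(steps):
        if rng.random() < p_change:
            n = rng.randint(0, max_people)
        schedule.append(n)
    return schedule


def simulate_co2(occupancy, volume_m3=50.0, air_changes_per_hour=1.0,
                 gen_lps_per_person=0.005, outdoor_ppm=420.0, dt_s=60.0):
    """Forward-Euler integration of a well-mixed single-zone CO2 balance.

    All default values are illustrative assumptions, not taken from the paper.
    Returns one CO2 concentration (ppm) per entry of `occupancy`.
    """
    q = air_changes_per_hour * volume_m3 / 3600.0  # ventilation flow, m^3/s
    g = gen_lps_per_person / 1000.0                # CO2 generation, m^3/s per person
    c = outdoor_ppm                                # start at outdoor level
    trace = []
    for n in occupancy:
        # Rate of change in ppm/s: occupant source term plus air exchange.
        dc_dt = (n * g * 1e6 + q * (outdoor_ppm - c)) / volume_m3
        c += dc_dt * dt_s
        trace.append(c)
    return trace


if __name__ == "__main__":
    schedule = random_occupancy(steps=480)   # 8 hours at 1-minute resolution
    co2 = simulate_co2(schedule)
    # Binary occupancy labels paired with the synthetic sensor trace.
    labels = [1 if n > 0 else 0 for n in schedule]
    print(len(co2), len(labels))
```

Pairs of synthetic traces and labels generated this way could serve as pre-training data before fine-tuning on measurements from a real room. A realistic simulation would additionally need sensor noise, window opening, and varying ventilation rates.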
