Learning and Generating Diverse Residential Load Patterns Using GAN with Weakly-Supervised Training and Weight Selection
Xinyu Liang, Hao Wang
TL;DR
This work tackles the scarcity of high-quality residential load data by introducing RLP-GAN, a weakly-supervised GAN that integrates an over-complete autoencoder and Bi-LSTM components to learn diverse, temporally rich household load patterns. It employs a three-stage training regime (autoencoder training, supervisor training, and joint adversarial training) and a Fréchet-distance-based model weight selection step to mitigate mode collapse. Evaluations on real-world data from 417 households show that RLP-GAN outperforms four baselines (ACGAN, WGAN, C-RNN-GAN, DDPM) in diversity and distribution fidelity, and a public synthetic dataset of one million load-pattern profiles is released. The approach enables scalable generation of realistic residential load data, with practical implications for energy management systems, grid planning, and decarbonization efforts, and it points to future work on regional transfer, anomaly generation, and robustness.
Abstract
The scarcity of high-quality residential load data can hinder both the decarbonization of the residential sector and effective grid planning and operation. This challenge has motivated research into generating synthetic load data, but existing methods face limitations in scalability, diversity, and similarity to real data. This paper proposes a Generative Adversarial Network-based Synthetic Residential Load Pattern (RLP-GAN) generation model, a novel weakly-supervised GAN framework that leverages an over-complete autoencoder to capture dependencies within complex and diverse load patterns and to learn the household-level data distribution at scale. We incorporate a model weight selection method to address the mode collapse problem and generate load patterns with high diversity. We develop a holistic evaluation method to validate the effectiveness of RLP-GAN using real-world data from 417 households. The results demonstrate that RLP-GAN outperforms state-of-the-art models in capturing temporal dependencies and generating load patterns with higher similarity to real data. Furthermore, we have publicly released the RLP-GAN-generated synthetic dataset, which comprises one million synthetic residential load pattern profiles.
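The weight selection idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that each saved set of generator weights is scored by the Fréchet distance between Gaussian fits of real and generated profiles, and that the lowest-scoring weights are kept. The function names, the 24-point profile length, the checkpoint labels, and the random stand-in data are all hypothetical and are not taken from the paper, whose exact procedure may differ.

```python
import numpy as np

def frechet_distance(real, synth):
    """Squared Frechet distance between Gaussian fits of two sample sets.

    Rows are load profiles, columns are time steps (an assumed layout).
    """
    mu_r, mu_s = real.mean(axis=0), synth.mean(axis=0)
    cov_r = np.cov(real, rowvar=False)
    cov_s = np.cov(synth, rowvar=False)
    # Tr((cov_r cov_s)^(1/2)) equals the sum of square roots of the
    # eigenvalues of cov_r @ cov_s, which are real and non-negative
    # for positive semi-definite inputs (clipped for numerical safety).
    eig = np.linalg.eigvals(cov_r @ cov_s)
    tr_sqrt = np.sqrt(np.clip(eig.real, 0.0, None)).sum()
    mean_term = np.sum((mu_r - mu_s) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_s) - 2.0 * tr_sqrt)

def select_checkpoint(real, synth_by_checkpoint):
    """Return the checkpoint label with the smallest distance, plus all scores."""
    scores = {name: frechet_distance(real, s)
              for name, s in synth_by_checkpoint.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Stand-in data: random vectors play the role of daily load profiles.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 24))
candidates = {
    "epoch_10": rng.normal(loc=1.0, size=(500, 24)),  # shifted distribution
    "epoch_20": rng.normal(size=(500, 24)),           # matches the real one
}
best, scores = select_checkpoint(real, candidates)
```

In this toy setup the samples labeled `epoch_20` are drawn from the same distribution as the stand-in real data, so they receive the lower score. With an actual model, the candidate samples would come from the generator under each saved set of weights.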
