PowerPM: Foundation Model for Power Systems
Shihao Tu, Yupeng Zhang, Jing Zhang, Zhendong Fu, Yin Zhang, Yang Yang
TL;DR
PowerPM addresses the challenge of learning a generic representation for diverse Electricity Time Series (ETS) data by integrating a temporal encoder and a hierarchical encoder within a hierarchical graph framework. It employs a self-supervised pretraining paradigm that blends masked ETS modeling with dual-view contrastive learning to capture intra-window temporal dynamics and inter-window discrepancies. The model, containing approximately 250M parameters, is pre-trained on large-scale hierarchical ETS data (~987.42 GB). It achieves state-of-the-art or near-SOTA performance across 44 downstream tasks on private data and generalizes well to public ETS datasets. This foundation model approach improves sample- and label-efficiency for power-system applications such as demand-side management, grid stability, and consumer behavior analysis, offering an off-the-shelf tool for practitioners and researchers.
Abstract
The emergence of abundant electricity time series (ETS) data provides ample opportunities for various applications in power systems, including demand-side management, grid stability, and consumer behavior analysis. Deep learning models have advanced ETS modeling by effectively capturing sequence dependence. Nevertheless, learning a generic representation of ETS data for various applications remains challenging due to the inherently complex hierarchical structure of ETS data. Moreover, ETS data exhibits intricate temporal dependencies and is susceptible to the influence of exogenous variables. Furthermore, different instances exhibit diverse electricity consumption behavior. In this paper, we propose PowerPM, a foundation model for ETS data that provides a large-scale, off-the-shelf model for power systems. PowerPM consists of a temporal encoder and a hierarchical encoder. The temporal encoder captures temporal dependencies in ETS data while accounting for exogenous variables. The hierarchical encoder models correlations across levels of the hierarchy. Furthermore, PowerPM leverages a novel self-supervised pretraining framework consisting of masked ETS modeling and dual-view contrastive learning, which enables PowerPM to capture temporal dependency within ETS windows and to be aware of the discrepancy across ETS windows, providing two different perspectives for learning a generic representation. Our experiments involve five real-world scenario datasets, comprising private and public data. Through pre-training on massive ETS data, PowerPM achieves SOTA performance on diverse downstream tasks within the private dataset. When transferred to the public datasets, PowerPM maintains its superiority, showing that it generalizes across various tasks and domains. Moreover, ablation studies and few-shot experiments provide additional evidence of the effectiveness of our model.
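To make the two pretraining objectives concrete, the following is a minimal sketch of one possible form of each, not the authors' implementation. The abstract does not specify the loss functions, so everything here is an assumption for illustration: the helper names (`mask_window`, `masked_mse`, `info_nce`), the 30% mask ratio, zero-filling of hidden steps, the mean-squared reconstruction error, and the use of a cosine-similarity InfoNCE loss with temperature 0.1 as the contrastive term.

```python
import math
import random


def mask_window(window, mask_ratio, rng):
    """Hide a random subset of time steps; hidden values are replaced by 0.0 (assumed)."""
    n_hidden = max(1, round(mask_ratio * len(window)))
    hidden = set(rng.sample(range(len(window)), n_hidden))
    masked = [0.0 if t in hidden else v for t, v in enumerate(window)]
    return masked, hidden


def masked_mse(target, prediction, hidden):
    """Mean squared reconstruction error over the hidden time steps only."""
    return sum((target[t] - prediction[t]) ** 2 for t in hidden) / len(hidden)


def _cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def info_nce(view_a, view_b, temperature=0.1):
    """InfoNCE loss: view_a[i] and view_b[i] are positives, all other pairs negatives."""
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [_cosine(anchor, other) / temperature for other in view_b]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical hourly load window with a daily cycle (96 steps = 4 days).
    window = [math.sin(2 * math.pi * t / 24) for t in range(96)]
    masked, hidden = mask_window(window, mask_ratio=0.3, rng=rng)
    # A real model would predict the hidden steps; here the zero-filled input
    # stands in as a deliberately poor "prediction".
    print("masked reconstruction loss:", masked_mse(window, masked, hidden))

    # Hypothetical window embeddings from two views of the same 8 windows.
    embeddings = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]
    print("contrastive loss, aligned views:", info_nce(embeddings, embeddings))
```

In this sketch the masked loss rewards recovering structure inside a window, while the contrastive loss rewards embeddings that tell windows apart, which mirrors the within-window and across-window perspectives described above. A combined objective would typically be a weighted sum of the two terms; the weighting is likewise not stated in the abstract.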
