A Novel Vision Transformer based Load Profile Analysis using Load Images as Inputs

Hyeonjin Kim, Yi Hu, Kai Ye, Ning Lu

TL;DR

ViT4LPA addresses data scarcity in Load Profile Analysis (LPA) with a Vision Transformer-based encoder pre-trained on load images derived from multi-modal smart-meter data. Time-series profiles are converted into 24×24 load images with 3 channels, and the encoder is trained by masked image modeling on about 4,000 per-year profile sets from the Pecan Street project. Downstream tasks include behind-the-meter load identification and HVAC disaggregation; ViT4LPA outperforms CNN/LSTM baselines, with larger gains when labeled data are limited. Attention analyses reveal interpretable information flow, and the work suggests tuning masking strategies and patch sizes as future work to further improve performance.
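
The profile-to-image step can be pictured with a minimal sketch. Only the 24×24 size and the 3 channels come from the summary above; the row-major folding, the per-channel min-max scaling, and the names `min_max_scale` and `profiles_to_load_image` are assumptions made for illustration and may differ from the authors' conversion.

```python
from typing import List, Sequence

IMAGE_SIDE = 24   # image height and width, as stated in the TL;DR
NUM_CHANNELS = 3  # channel count, as stated in the TL;DR


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale one series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def profiles_to_load_image(
    channels: Sequence[Sequence[float]],
) -> List[List[List[float]]]:
    """Fold NUM_CHANNELS series of 576 samples into a [channel][row][col] image.

    Assumption: row-major folding, so each image row holds 24 consecutive
    samples of the series. The paper may arrange the samples differently.
    """
    if len(channels) != NUM_CHANNELS:
        raise ValueError(f"expected {NUM_CHANNELS} channels, got {len(channels)}")
    image = []
    for series in channels:
        if len(series) != IMAGE_SIDE * IMAGE_SIDE:
            raise ValueError("each channel needs 24 * 24 = 576 samples")
        scaled = min_max_scale(series)
        rows = [
            scaled[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE]
            for r in range(IMAGE_SIDE)
        ]
        image.append(rows)
    return image
```

Under these assumptions, each channel contributes 576 samples, and `profiles_to_load_image` returns a nested list indexed as `image[channel][row][col]` with values in [0, 1].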

Abstract

This paper introduces ViT4LPA, a Vision Transformer (ViT) based approach for Load Profile Analysis (LPA). We transform time-series load profiles into load images, which allows us to use the ViT architecture, originally designed for image processing, as a pre-trained image encoder to uncover latent patterns in load data. The ViT is pre-trained on a load image dataset comprising 1M load images derived from smart meter data collected over a two-year period from 2,000 residential users. Training is self-supervised and uses masked image modeling, in which masked load images are restored so that the model learns hidden relationships among image patches. The pre-trained ViT encoder is then applied to downstream tasks, including the identification of electric vehicle (EV) charging loads and behind-the-meter solar photovoltaic (PV) systems, and load disaggregation. Simulation results show that ViT4LPA outperforms existing neural network models on these downstream tasks. Additionally, we analyze the attention weights within the ViT4LPA model to gain insight into its information flow.
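
The masking step of masked image modeling can be sketched as follows. The 4×4 patch size is an inference (a 24×24 image with 36 patches, the count that appears in the Figure 3 caption, would form a 6×6 grid of 4×4 patches). The mask ratio, the zero fill value, and the name `mask_random_patches` are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Set, Tuple

Image = List[List[List[float]]]  # indexed as [channel][row][col]

PATCH_SIDE = 4  # assumption: 24 / 4 = 6 patches per side, 36 in total


def mask_random_patches(
    image: Image, mask_ratio: float, rng: random.Random
) -> Tuple[Image, Set[int]]:
    """Zero out a random subset of square patches in every channel.

    Returns a masked copy of the image and the set of masked patch ids,
    numbered row by row over the patch grid. The input is left unchanged.
    """
    side = len(image[0])
    if side % PATCH_SIDE != 0:
        raise ValueError("image side must be a multiple of PATCH_SIDE")
    grid = side // PATCH_SIDE
    num_patches = grid * grid
    num_masked = round(mask_ratio * num_patches)
    masked_ids = set(rng.sample(range(num_patches), num_masked))

    # Deep-copy so the original image stays available as the target.
    masked = [[row[:] for row in channel] for channel in image]
    for patch_id in masked_ids:
        top = (patch_id // grid) * PATCH_SIDE
        left = (patch_id % grid) * PATCH_SIDE
        for channel in masked:
            for r in range(top, top + PATCH_SIDE):
                for c in range(left, left + PATCH_SIDE):
                    channel[r][c] = 0.0
    return masked, masked_ids
```

In a pre-training loop of this kind, the encoder would receive the masked image and be trained to restore the original one. Reconstruction losses are commonly restricted to the masked patch ids, but the loss used for ViT4LPA is not stated in this summary.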

Paper Structure (11 sections, 7 figures, 3 tables)

Figures (7)

  • Figure 1: An illustration of the ViT4LPA architecture. (a) Profile-to-image conversion, (b) ViT4LPA workflow, and (c) Pre-training process.
  • Figure 2: ViT4LPA encoder performance evaluation. (a) An illustration of the original, masked, and reconstructed load image, and (b) Reconstruction error distribution.
  • Figure 3: Analysis of the position embeddings of the ViT4LPA encoder. (a) Heatmaps of the 36 similarity matrices and (b) Similarity matrix of image patch 1.
  • Figure 4: Heatmaps of the mean of the self-attention layers. (a) Layer 1, (b) Layer 2, and (c) Layer 3.
  • Figure 5: Comparison between identification models with and without a pre-trained encoder network. (a) PV generation identification, and (b) EV load identification.
  • ...and 2 more figures