
Lower Dimensional Spherical Representation of Medium Voltage Load Profiles for Visualization, Outlier Detection, and Generative Modelling

Edgar Mauricio Salazar Duque, Bart van der Holst, Pedro P. Vergara, Juan S. Giraldo, Phuong H. Nguyen, Anne Van der Molen, Han Slootweg

TL;DR

The theoretical and practical foundation of a spherical lower dimensional representation for daily medium voltage load profiles is presented, based on principal component analysis, to unify and simplify the tasks for clustering visualisation, outlier detection and generative profile modelling under one concept.

Abstract

This paper presents the spherical lower dimensional representation for daily medium voltage load profiles, based on principal component analysis. The objective is to unify and simplify the tasks for (i) clustering visualisation, (ii) outlier detection and (iii) generative profile modelling under one concept. The lower dimensional projection of standardised load profiles unveils a latent distribution in a three-dimensional sphere. This spherical structure allows us to detect outliers by fitting probability distribution models in the spherical coordinate system, identifying measurements that deviate from the spherical shape. The same latent distribution exhibits an arc shape, suggesting an underlying order among load profiles. We develop a principal curve technique to uncover this order based on similarity, offering new advantages over conventional clustering techniques. This finding reveals that energy consumption in a wide region can be seen as a continuously changing process. Furthermore, we combine the principal curve with a von Mises-Fisher distribution to create a model capable of generating profiles with continuous mixtures between clusters. The presence of the spherical distribution is validated with data from four municipalities in the Netherlands. The uncovered spherical structure implies the possibility of employing new mathematical tools from directional statistics and differential geometry for load profile modelling.
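As a rough illustration of the first step described in the abstract, the sketch below standardises a set of daily profiles and projects them onto three principal components. It is not the authors' implementation: the synthetic data, the function names `standardise_rows` and `pca_project`, and the choice to standardise each profile row-wise (zero mean, unit variance per day) are all assumptions made for this example. Under that assumed row-wise scaling, every standardised profile has the same Euclidean norm, which is one plausible reason a spherical shape can appear after projection.

```python
import numpy as np


def standardise_rows(P):
    """Scale each daily profile (row) to zero mean and unit variance (assumed scheme)."""
    mu = P.mean(axis=1, keepdims=True)
    sigma = P.std(axis=1, keepdims=True)
    return (P - mu) / sigma


def pca_project(X, n_components=3):
    """Project the rows of X onto the leading principal components using an SVD."""
    Xc = X - X.mean(axis=0, keepdims=True)          # centre each time step
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_components].T                    # low-dimensional scores
    explained = s**2 / np.sum(s**2)                 # explained variance ratios
    return Z, explained[:n_components]


# Hypothetical data: 200 daily profiles at 96 time steps (15-minute resolution),
# built as a continuous mixture of two made-up base shapes plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 96, endpoint=False)
base_a = 1.0 + np.sin(2 * np.pi * t)
base_b = 1.0 + np.cos(4 * np.pi * t)
w = rng.uniform(0.0, 1.0, size=(200, 1))
P = w * base_a + (1.0 - w) * base_b + 0.05 * rng.normal(size=(200, 96))

P_hat = standardise_rows(P)
Z, evr = pca_project(P_hat, n_components=3)
print(Z.shape, evr)
```

Each row of `Z` is one daily profile in the three-dimensional latent space, and `evr` holds the share of variance captured by each retained component.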

Paper Structure

This paper contains 15 sections, 3 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Decomposition of a subset of $\boldsymbol{P}$ into its elementary matrices (\ref{eq:elementary}). (a) Original subset of $\boldsymbol{P}$. (b) Standardised profiles $\boldsymbol{\hat{P}}$ using (\ref{eq:standardization}). (c)-(e) Profiles of the three most significant elementary matrices, i.e., $\boldsymbol{X}_1, \boldsymbol{X}_2, \boldsymbol{X}_3$, shown in green, with their respective eigenvector components as a solid black line. (f)-(h) Less significant elementary matrices, i.e., $\boldsymbol{X}_{10}, \boldsymbol{X}_{11}, \boldsymbol{X}_{12}$, shown in orange, for the eigenvectors (f) $\boldsymbol{v}_{10}$, (g) $\boldsymbol{v}_{11}$, and (h) $\boldsymbol{v}_{12}$. (i) Blue bars show the variance explained by the most important eigenvalues; the solid line is the cumulative explained variance (CEV). The plot is truncated to 12 eigenvalues out of 96.
  • Figure 2: The values of the projection $\boldsymbol{Z}$ in a 3-dimensional space. (a-c) Orthographic projections of the sphere. Each blue point represents a single transformer's daily profile. The sphere overlaid on the data is found via (\ref{eq:optimization}).
  • Figure 3: Probability distributions of the spherical projection variables for the dataset $\boldsymbol{X}$. (a) Azimuthal angle distribution centred around the mean. (b) Radius. (c) Polar angle. (d) 2D projection using angle values. Points flagged in red are outliers, i.e., points that fall outside the 95% confidence interval (CI) of the fitted distributions (delineated by the vertical and horizontal red dotted lines).
  • Figure 4: Example of latent space ordering for the process in (\ref{eq:process}). (a1) Data matrix $\boldsymbol{H}$ created by discretisation of (\ref{eq:process}) using 20 time steps. (a2) Heatmap of the standardised matrix $\boldsymbol{\hat{Y}}$. (a3) Heatmap of the similarity matrix $\boldsymbol{S_{\boldsymbol{Y}}}$. (a4) and (a5) Spherical orthographic projections of the PCoA applied to $\boldsymbol{S_{\boldsymbol{Y}}}$ using three principal components, which show a clear arc-shaped structure. (b1) $\boldsymbol{H}$ with shuffled rows. (b2) The gradual change pattern is lost due to shuffling. (b3) The banded structure of $\boldsymbol{S_{\boldsymbol{Y}}}$ is absent. (b4) and (b5) The latent ordering of the samples still exists, and the original sample labels ($i$) can be recovered by following the parametrised curve that passes through the middle of the points.
  • Figure 5: Clustered profiles from Municipality 1 and outlier identification. (a-c) Visualisation of clustering results and outliers using spherical modelling. Daily profiles that are not considered outliers are grouped into four clusters: (d) Commercial/Offices, (e) Mix of residential and commercial, (f) Residential, (g) Residential with high PV penetration. (h-k) Anomalous readings labelled by the outlier models from section \ref{sec:outlier}. (j) The scatter points on the top of the sphere marked as outliers form a cluster of their own.
  • ...and 3 more figures
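
The sphere overlay of Figure 2 and the radius-based flagging of Figure 3 can be approximated with the sketch below. The paper's own optimisation problem and fitted probability distributions are not reproduced on this page, so two stand-ins are used and should be read as assumptions: an algebraic least-squares sphere fit in place of the referenced optimisation, and an empirical 95% percentile interval on the radius in place of a fitted distribution. The function names and the synthetic points are hypothetical.

```python
import numpy as np


def fit_sphere(Z):
    """Algebraic least-squares sphere fit (stand-in for the paper's optimisation).

    Uses |z|^2 = 2 c.z + (r^2 - |c|^2), which is linear in c and in the offset.
    """
    A = np.hstack([2.0 * Z, np.ones((Z.shape[0], 1))])
    b = np.sum(Z**2, axis=1)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    centre = w[:3]
    radius = np.sqrt(w[3] + centre @ centre)
    return centre, radius


def to_spherical(Z, centre):
    """Return radius, polar angle and azimuthal angle relative to the centre."""
    d = Z - centre
    r = np.linalg.norm(d, axis=1)
    polar = np.arccos(np.clip(d[:, 2] / r, -1.0, 1.0))
    azimuth = np.arctan2(d[:, 1], d[:, 0])
    return r, polar, azimuth


def flag_by_interval(values, level=0.95):
    """Flag values outside the central empirical interval (assumed rule)."""
    tail = 100.0 * (1.0 - level) / 2.0
    lo, hi = np.percentile(values, [tail, 100.0 - tail])
    return (values < lo) | (values > hi)


# Hypothetical latent points scattered around a known sphere.
rng = np.random.default_rng(0)
true_centre = np.array([1.0, -2.0, 0.5])
true_radius = 3.0
dirs = rng.normal(size=(500, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
Z = true_centre + true_radius * (1.0 + 0.01 * rng.normal(size=(500, 1))) * dirs

centre_hat, radius_hat = fit_sphere(Z)

# Append one point far from the shell and flag by radius.
Z_all = np.vstack([Z, true_centre + np.array([0.0, 0.0, 3.0 * true_radius])])
r, polar, azimuth = to_spherical(Z_all, centre_hat)
flags = flag_by_interval(r)
```

Because the percentile rule flags a fixed share of points by construction, it only mimics the idea of a rejection region; a fitted parametric distribution per coordinate, as the caption of Figure 3 describes, would set the thresholds differently. The same `flag_by_interval` helper could be applied to `polar` and to a mean-centred `azimuth` to cover the remaining coordinates.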