Unsupervised Clustering for Fault Analysis in High-Voltage Power Systems Using Voltage and Current Signals
Julian Oelhaf, Georg Kordowich, Andreas Maier, Johann Jäger, Siming Bayer
TL;DR
This work tackles fault analysis in high-voltage power systems from unlabeled voltage and current waveforms by extracting FFT-based frequency-domain features, reducing dimensionality with PCA and t-SNE, and clustering with K-Means on a real RTE dataset. The result is a scalable, data-driven workflow whose clusters partially align with known fault types, as judged by expert labeling in the absence of extensive ground truth. Key contributions include a reduced labeling burden, a comparative assessment of PCA- versus t-SNE-driven clustering, and insights into the suitability of unsupervised methods for fault-pattern discovery in large sensor datasets. The findings underscore the potential of unsupervised fault analysis to streamline inspection and support grid protection decisions, while acknowledging its limitations and the room for improved validation and clustering methods.
Abstract
The widespread use of sensors in modern power grids has led to the accumulation of large amounts of voltage and current waveform data, especially during fault events. However, the lack of labeled datasets poses a significant challenge for fault classification and analysis. This paper explores the application of unsupervised clustering techniques to fault diagnosis in high-voltage power systems. A dataset provided by the Réseau de Transport d'Électricité (RTE) is analyzed, with frequency-domain features extracted using the Fast Fourier Transform (FFT). The K-Means algorithm is then applied to identify underlying patterns in the data, enabling automated fault categorization without the need for labeled training samples. The resulting clusters are evaluated in collaboration with power system experts to assess their alignment with real-world fault characteristics. The results demonstrate the potential of unsupervised learning for scalable, data-driven fault analysis, providing a robust approach to detecting and classifying power system faults with minimal prior assumptions.
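
The two steps named in the abstract, FFT feature extraction followed by K-Means, can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the RTE data. The synthetic single-phase waveforms, the three toy event classes, the sampling rate, the choice of the first five harmonics as features, and the farthest-point initialization are all assumptions made for illustration. K-Means is written out as a plain Lloyd iteration in NumPy rather than taken from a library, and the PCA and t-SNE steps are omitted.

```python
import numpy as np

FS = 6400.0       # assumed sampling rate in Hz (hypothetical, not from the paper)
F0 = 50.0         # nominal grid frequency in Hz
N_SAMPLES = 1280  # 0.2 s window, i.e. 10 full cycles at 50 Hz


def synth_record(kind, rng):
    """Toy single-phase voltage/current pair; a stand-in for real recordings."""
    t = np.arange(N_SAMPLES) / FS
    v = np.sin(2 * np.pi * F0 * t)
    i = 0.5 * np.sin(2 * np.pi * F0 * t - 0.3)
    if kind == "sag":            # voltage collapses, current rises
        v = 0.4 * v
        i = 3.0 * i
    elif kind == "harmonic":     # strong third-harmonic distortion
        v = v + 0.5 * np.sin(2 * np.pi * 3 * F0 * t)
        i = i + 0.3 * np.sin(2 * np.pi * 3 * F0 * t)
    v = v + 0.01 * rng.standard_normal(N_SAMPLES)
    i = i + 0.01 * rng.standard_normal(N_SAMPLES)
    return v, i


def fft_features(v, i, n_harmonics=5):
    """Amplitudes at the first n_harmonics multiples of F0, voltage then current."""
    bins = np.round(np.arange(1, n_harmonics + 1) * F0 * N_SAMPLES / FS).astype(int)
    feats = []
    for x in (v, i):
        amplitude = np.abs(np.fft.rfft(x)) * 2.0 / len(x)  # one-sided amplitude spectrum
        feats.append(amplitude[bins])
    return np.concatenate(feats)


def kmeans(X, k, n_iter=50):
    """Plain Lloyd iteration with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


rng = np.random.default_rng(0)
kinds = ["normal", "sag", "harmonic"]  # hypothetical event classes
X = np.array([fft_features(*synth_record(kind, rng))
              for kind in kinds for _ in range(20)])
labels, centers = kmeans(X, k=3)

# Compare cluster assignments with the (normally unknown) generating class.
for j, kind in enumerate(kinds):
    print(kind, np.bincount(labels[j * 20:(j + 1) * 20], minlength=3))
```

On real recordings the generating class is unknown, so the final comparison would be replaced by expert inspection of each cluster, as the abstract describes. Features with very different scales would likely also need standardization before clustering.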
