HiSAXy: A fast methodology for solar wind structure identification in millions of time series
Hala Lamdouar, Sairam Sundaresan, Anna Jungbluth, Sudeshna Boro Saikia, Amanda Joy Camarata, Nathan Miles, Marcella Scoczynski, Mavis Stone, Andrés Muñoz-Jaramillo, Ayris Narock, Adam Szabo
TL;DR
The paper addresses the challenge of scalable, unsupervised identification of frequently occurring magnetic structures in the interplanetary magnetic field (IMF) carried by the solar wind. It introduces HiSAXy, a hybrid clustering approach that combines the indexable Symbolic Aggregate approXimation (iSAX) time-series representation with HDBSCAN to enable fast indexing and robust clustering of millions of IMF segments. Empirical results show that HiSAXy identifies larger, coherent clusters while maintaining intra-cluster self-similarity, and that it significantly reduces the human effort required to label discontinuities, with reported time savings on the order of hundreds of hours. This work enables scalable discovery and interpretation of solar wind structures in large Parker Solar Probe (PSP) datasets and could support analyses across multiple timescales and solar wind properties.
Abstract
We present HiSAXy, a hybrid unsupervised clustering algorithm, as a novel way to identify frequently occurring magnetic structures embedded in the interplanetary magnetic field (IMF) carried by the solar wind. HiSAXy combines indexable Symbolic Aggregate approXimation (iSAX) with Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) to efficiently identify clusters of patterns embedded in time series data. We apply HiSAXy to identify small-scale structures, known as discontinuities, embedded in time series measurements of the IMF. In doing so, we demonstrate that the algorithm can significantly reduce the number of human analysis hours required to identify these structures while maintaining a high degree of self-similarity within each cluster of time series.
