The recursive scheme of clustering
Alicja Miniak-Górecka, Krzysztof Podlaski, Tomasz Gwizdałła
TL;DR
The paper tackles clustering of noisy climatological and experimental data where traditional methods struggle to reflect expert classifications. It introduces a recursive scheme that combines Savitzky-Golay smoothing of histograms with standard clustering algorithms ($k$-means and SOM) to automatically determine the number of clusters and refine partitions. Across ground temperature, water level, and Banknote entropy datasets, the method shows better alignment with expert assessments and reveals structure overlooked by conventional approaches. This approach offers a robust, multi-phase framework for clustering noisy measurements with practical applicability to environmental monitoring and related domains.
Abstract
Data clustering is one of the central problems in data analysis. It becomes particularly difficult for experimental data affected by measurement uncertainties and errors. This paper proposes a recursive scheme for clustering data obtained in geographical (climatological) experiments. We discuss the results obtained by applying the $k$-means and SOM methods within the developed recursive procedure. We show that clustering with the new approach gives results that agree better with experts' assessments.
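As a rough illustration of the idea summarized in the TL;DR (smoothing a histogram with a Savitzky-Golay filter and using the result to choose the number of clusters), the following minimal sketch counts the peaks of a smoothed histogram and uses their positions to seed a one-dimensional $k$-means. It is not the authors' implementation. The function names, the 5-point quadratic filter window, the zero padding at the edges, and the 20% peak-height threshold are all assumptions made for this example. The recursive refinement of the partitions is not reproduced, because this section does not specify it.

```python
# Savitzky-Golay convolution weights: 5-point window, quadratic (or cubic) fit.
SG5_WEIGHTS = (-3, 12, 17, 12, -3)
SG5_NORM = 35


def histogram(values, bins, lo, hi):
    """Count values in `bins` equal-width bins spanning [lo, hi]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so that out-of-range values (and v == hi) fall in an edge bin.
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    return counts


def smooth(counts):
    """Apply the 5-point filter; bins outside the range are treated as empty
    (an assumption of this sketch)."""
    padded = [0, 0] + list(counts) + [0, 0]
    return [
        sum(w * padded[i + j] for j, w in enumerate(SG5_WEIGHTS)) / SG5_NORM
        for i in range(len(counts))
    ]


def peak_bins(smoothed, min_fraction=0.2):
    """Indices of local maxima that reach `min_fraction` of the global maximum.
    The threshold is a hypothetical choice, not taken from the paper."""
    threshold = min_fraction * max(smoothed)
    peaks = []
    for i, y in enumerate(smoothed):
        left = smoothed[i - 1] if i > 0 else float("-inf")
        right = smoothed[i + 1] if i < len(smoothed) - 1 else float("-inf")
        if y > left and y >= right and y >= threshold:
            peaks.append(i)
    return peaks


def kmeans_1d(values, centers, max_iter=100):
    """Plain Lloyd iterations in one dimension, starting from given centers."""
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            for v in values
        ]
        new_centers = []
        for k, c in enumerate(centers):
            members = [v for v, lab in zip(values, labels) if lab == k]
            # Keep the old center if a cluster happens to be empty.
            new_centers.append(sum(members) / len(members) if members else c)
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def cluster_by_histogram(values, bins, lo, hi):
    """Choose k as the number of smoothed-histogram peaks, then run k-means
    seeded at the centers of the peak bins."""
    smoothed = smooth(histogram(values, bins, lo, hi))
    width = (hi - lo) / bins
    seeds = [lo + (i + 0.5) * width for i in peak_bins(smoothed)]
    return kmeans_1d(values, seeds)
```

Calling `cluster_by_histogram(values, bins, lo, hi)` on one-dimensional measurements returns a label for each value together with the final cluster centers. The number of clusters comes from the data rather than being fixed in advance. In practice a library routine such as `scipy.signal.savgol_filter` would normally replace the hand-written filter; it is spelled out here only to keep the sketch dependency-free.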
