Evolving Restricted Boltzmann Machine-Kohonen Network for Online Clustering
J. Senthilnath, Adithya Bhattiprolu, Ankur Singh, Bangjian Zhou, Min Wu, Jón Atli Benediktsson, Xiaoli Li
TL;DR
This work addresses online clustering of streaming unlabeled data by proposing ERBM-KNet, a unified framework that jointly learns latent representations with an evolving RBM and performs autonomous online clustering with a Kohonen network. The ERBM employs a bias-variance driven growth/pruning strategy via Network Significance to adapt its architecture on the fly, while the KNet predicts the number of clusters and updates cluster centers in a single pass. Across five datasets, including a semiconductor wafer defect dataset, ERBM-KNet demonstrates superior clustering quality (NMI and Purity) and efficient reconstruction with far fewer latent neurons than competing methods, while automatically determining the appropriate number of clusters. The results highlight robust, scalable online clustering capabilities suitable for real-time streaming applications in vision and industrial domains.
Abstract
A novel online clustering algorithm, ERBM-KNet, is presented, in which an Evolving Restricted Boltzmann Machine (ERBM) is coupled with a Kohonen Network (KNet). ERBM-KNet handles streaming data in a single pass: the ERBM grows and prunes neurons using a bias-variance strategy, and the KNet performs online clustering through a cluster update strategy that covers both cluster prediction and cluster center updates. First, the ERBM evolves its architecture while processing unlabeled image data, disentangling the data distribution in the latent space. The KNet then uses the features extracted by the ERBM to predict the number of clusters and update the cluster centers. ERBM-KNet thereby addresses two common challenges of clustering algorithms: the need to specify the number of clusters in advance and subpar clustering accuracy. Extensive experimental evaluations on four benchmarks and one industry dataset demonstrate the superiority of ERBM-KNet over state-of-the-art approaches.
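To make the single-pass clustering stage more concrete, the following is a minimal sketch of a winner-take-all (Kohonen-style) center update that also creates clusters on the fly. It is an illustration under stated assumptions, not the KNet update rule from the paper: the class name `OnlineWinnerTakeAll`, the distance threshold `new_cluster_dist`, and the learning rate `lr` are hypothetical, and the inputs stand in for latent features that an ERBM would produce.

```python
import numpy as np


class OnlineWinnerTakeAll:
    """Single-pass clustering sketch (hypothetical; not the paper's KNet rule)."""

    def __init__(self, new_cluster_dist, lr=0.1):
        # Assumed novelty criterion: a sample farther than this distance
        # from every existing center starts a new cluster.
        self.new_cluster_dist = new_cluster_dist
        self.lr = lr        # step size for moving the winning center
        self.centers = []   # cluster centers discovered so far

    def partial_fit(self, z):
        """Process one latent vector and return its cluster index."""
        z = np.asarray(z, dtype=float)
        if not self.centers:
            self.centers.append(z.copy())
            return 0
        dists = [np.linalg.norm(z - c) for c in self.centers]
        winner = int(np.argmin(dists))
        if dists[winner] > self.new_cluster_dist:
            # No center is close enough: add a cluster seeded at this sample.
            self.centers.append(z.copy())
            return len(self.centers) - 1
        # Winner-take-all update: move only the closest center toward z.
        self.centers[winner] += self.lr * (z - self.centers[winner])
        return winner


# Toy stream of 2-D "latent features" (made-up values for illustration).
stream = [[0.0, 0.0], [0.2, 0.1], [10.0, 10.0], [9.8, 10.1], [0.1, -0.1]]
model = OnlineWinnerTakeAll(new_cluster_dist=5.0, lr=0.1)
labels = [model.partial_fit(z) for z in stream]
print(labels)              # [0, 0, 1, 1, 0]
print(len(model.centers))  # 2
```

Each sample is seen exactly once, and the number of clusters is never supplied up front; it emerges from the threshold test. The threshold itself is a simplification chosen for this sketch, and the paper's cluster update strategy may use a different criterion.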
