Gödel Number based Clustering Algorithm with Decimal First Degree Cellular Automata
Vicky Vikrant, Narodia Parth P, Kamalika Bhattacharjee
TL;DR
This work tackles clustering of high-dimensional real-valued data by encoding features into decimal strings with a Gödel number based encoding and then clustering those strings with reversible decimal first degree cellular automata (FDCA). Clusters emerge from reachability among cyclic configurations: a three-stage iterative process merges cycles until the desired number of clusters is reached, guided by three merging metrics and a carefully curated rule set (36 rules in total). Empirical results on four UCI datasets show competitive performance against K-Means, Hierarchical, and other baselines, with merging by maximum participation score often delivering superior Silhouette scores. The approach offers a scalable, parallelizable alternative that preserves feature properties through Gödel encoding and uses CA dynamics to form the clusters.
Abstract
In this paper, a clustering algorithm based on decimal first degree cellular automata (FDCA) is proposed, in which clusters are created based on reachability. Cyclic spaces are created, and configurations that lie on the same cycle are treated as belonging to the same cluster. Real-life data objects are encoded into decimal strings using a Gödel number based encoding. The benefit of this scheme is that it reduces the length of the encoded string while maintaining the properties of the features. Candidate CA rules are identified based on theoretical criteria such as self-replication and information flow. An iterative algorithm is developed to generate the desired number of clusters over three stages. The clustering results are evaluated using benchmark clustering metrics: Silhouette score, Davies-Bouldin index, Calinski-Harabasz index, and Dunn index. In comparison with existing state-of-the-art clustering algorithms, our proposed algorithm performs better.
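The idea of clustering by reachability can be illustrated with a minimal sketch. It is not the paper's algorithm: the three-cell neighbourhood, the periodic boundary, the helper names (step, cycle_of, cluster_by_cycle) and the toy rule shift_rule are all assumptions made for illustration, and neither the Gödel number based encoding nor the three-stage merging is modelled. Under a reversible rule every configuration lies on exactly one cycle, so two configurations receive the same label exactly when one is reachable from the other.

```python
def step(config, rule, d=10):
    """Apply a local rule once to every cell (periodic boundary is an assumption)."""
    n = len(config)
    return tuple(
        rule(config[(i - 1) % n], config[i], config[(i + 1) % n]) % d
        for i in range(n)
    )


def cycle_of(config, rule, d=10, max_steps=100_000):
    """Return the set of configurations on the cycle through `config`.

    Assumes the rule is reversible; otherwise the walk may never return
    to the start, so a step limit guards against looping forever.
    """
    seen = {config}
    cur = step(config, rule, d)
    while cur != config:
        if len(seen) >= max_steps:
            raise ValueError("no cycle found; the rule may not be reversible")
        seen.add(cur)
        cur = step(cur, rule, d)
    return seen


def cluster_by_cycle(configs, rule, d=10):
    """Label configurations so that those on the same cycle share a label."""
    labels, label_of_rep = [], {}
    for c in configs:
        rep = min(cycle_of(tuple(c), rule, d))  # canonical cycle representative
        labels.append(label_of_rep.setdefault(rep, len(label_of_rep)))
    return labels


# Hypothetical toy rule: each cell copies its left neighbour.
# This is a cyclic shift, which is trivially reversible.
def shift_rule(left, centre, right):
    return left


data = [(1, 2, 3), (3, 1, 2), (2, 1, 3), (4, 4, 4)]
print(cluster_by_cycle(data, shift_rule))  # [0, 0, 1, 2]
```

Here (1, 2, 3) and (3, 1, 2) are rotations of each other and so fall on the same cycle, while (2, 1, 3) and the fixed point (4, 4, 4) each sit on a different cycle. A rule of this kind typically yields many more cycles than the desired number of clusters, which is why an iterative merging step is needed afterwards.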
