Similarity and Dissimilarity Guided Co-association Matrix Construction for Ensemble Clustering

Xu Zhang; Yuheng Jia; Mofei Song; Ran Wang

TL;DR

This paper proposes the Similarity and Dissimilarity Guided Co-association matrix (SDGCA). SDGCA uses normalized ensemble entropy to estimate the quality of each cluster and constructs a similarity matrix from this estimate. The adversarial relationship between the similarity matrix and a dissimilarity matrix is then used to construct a promoted CA matrix for ensemble clustering.

Abstract

Ensemble clustering aggregates multiple weak clusterings to achieve a more accurate and robust consensus result. Co-Association matrix (CA matrix) based methods are the mainstream ensemble clustering approach: they construct similarity relationships between sample pairs according to the weak clustering partitions and use them to generate the final clustering result. However, existing methods neglect that the quality of a cluster is related to its size, i.e., a smaller cluster tends to have higher accuracy. Moreover, they do not consider the valuable dissimilarity information in the base clusterings, which can reflect the varying importance of sample pairs that are completely disconnected. To this end, we propose the Similarity and Dissimilarity Guided Co-association matrix (SDGCA) for ensemble clustering. First, we introduce normalized ensemble entropy to estimate the quality of each cluster and construct a similarity matrix based on this estimate. Then, we employ a random walk to explore the high-order proximity of the base clusterings and construct a dissimilarity matrix. Finally, the adversarial relationship between the similarity matrix and the dissimilarity matrix is used to construct a promoted CA matrix for ensemble clustering. We compared our method with 13 state-of-the-art methods across 12 datasets, and the results demonstrate the superior clustering ability and robustness of the proposed approach. The code is available at https://github.com/xuz2019/SDGCA.
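For readers unfamiliar with the CA matrix that the abstract builds on, the following is a minimal sketch of the plain (unweighted) co-association matrix only, where each entry is the fraction of base clusterings that place two samples in the same cluster. It is not the SDGCA construction: the entropy-based weighting, the random-walk dissimilarity matrix, and the adversarial combination are not specified in the abstract and are not reproduced here. The function name `co_association` and the toy labels are illustrative assumptions.

```python
def co_association(base_labels):
    """Plain co-association matrix (illustrative sketch, not SDGCA).

    base_labels: list of M label lists, each of length N, one per base clustering.
    Returns an N x N list of lists; entry [i][j] is the fraction of base
    clusterings in which samples i and j share a cluster.
    """
    m = len(base_labels)
    n = len(base_labels[0])
    counts = [[0] * n for _ in range(n)]
    for labels in base_labels:
        for i in range(n):
            for j in range(n):
                # Count one co-occurrence when both samples get the same label.
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    # Normalize counts by the number of base clusterings.
    return [[c / m for c in row] for row in counts]


# Toy ensemble (assumed data): 3 base clusterings over 4 samples.
base = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
ca = co_association(base)
```

In this toy ensemble, samples 0 and 1 are always grouped together, while samples 0 and 3 never are. Pairs like the latter, with a zero entry, are the "completely disconnected" pairs that the plain CA matrix cannot distinguish from one another, which is the gap the abstract's dissimilarity matrix is meant to address.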
