Kernel Correlation-Dissimilarity for Multiple Kernel k-Means Clustering

Rina Su, Yu Guo, Caiying Wu, Qiyu Jin, Tieyong Zeng

TL;DR

This work introduces a novel method that systematically integrates both kernel correlation and dissimilarity, and offers a more objective and transparent strategy for extracting non-linear information and significantly improving clustering precision.

Abstract

The main objective of the Multiple Kernel k-Means (MKKM) algorithm is to extract non-linear information and achieve optimal clustering by optimizing base kernel matrices. Current methods enhance information diversity and reduce redundancy by exploiting interdependencies among multiple kernels based on correlations or dissimilarities. Nevertheless, relying solely on a single metric, such as correlation or dissimilarity, to define kernel relationships introduces bias and incomplete characterization. Consequently, this limitation hinders efficient information extraction, ultimately compromising clustering performance. To tackle this challenge, we introduce a novel method that systematically integrates both kernel correlation and dissimilarity. Our approach comprehensively captures kernel relationships, facilitating more efficient classification information extraction and improving clustering performance. By emphasizing the coherence between kernel correlation and dissimilarity, our method offers a more objective and transparent strategy for extracting non-linear information and significantly improving clustering precision, supported by theoretical rationale. We assess the performance of our algorithm on 13 challenging benchmark datasets, demonstrating its superiority over contemporary state-of-the-art MKKM techniques.
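The abstract does not say which measures are used for kernel correlation and dissimilarity, so the following is only an illustrative sketch under assumed choices: the Frobenius inner product as a correlation measure and the Frobenius distance as a dissimilarity measure between base kernel matrices. The helper names (`rbf_kernel`, `kernel_correlation`, `kernel_dissimilarity`), the RBF bandwidths, and the random toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_correlation(kernels):
    """Assumed correlation measure: C[p, q] = <K_p, K_q>_F."""
    m = len(kernels)
    C = np.empty((m, m))
    for p in range(m):
        for q in range(m):
            C[p, q] = np.sum(kernels[p] * kernels[q])
    return C

def kernel_dissimilarity(kernels):
    """Assumed dissimilarity measure: D[p, q] = ||K_p - K_q||_F."""
    m = len(kernels)
    D = np.empty((m, m))
    for p in range(m):
        for q in range(m):
            D[p, q] = np.linalg.norm(kernels[p] - kernels[q], "fro")
    return D

# Toy data only: 20 random points, three RBF base kernels.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
kernels = [rbf_kernel(X, g) for g in (0.1, 1.0, 10.0)]

C = kernel_correlation(kernels)
D = kernel_dissimilarity(kernels)
```

Under these particular choices the two matrices are tied together by `D[p, q]**2 == C[p, p] + C[q, q] - 2 * C[p, q]`, so they carry overlapping information. Other dissimilarity measures would not satisfy this identity, which is one reason a method might treat the two quantities separately.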

Paper Structure

This paper contains 26 sections, 1 theorem, 30 equations, 4 figures, 9 tables, 1 algorithm.

Key Result

Lemma 4.1

Problem (eq22) is a convex quadratic programming problem.
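Problem (eq22) itself is not reproduced on this page, so the sketch below only illustrates the general fact behind such a lemma: a quadratic program with objective `0.5 * w.T @ H @ w + f @ w` and linear constraints is convex exactly when `H` is positive semidefinite. The function name `is_convex_qp` and the two example matrices are hypothetical and do not come from the paper.

```python
import numpy as np

def is_convex_qp(H, tol=1e-10):
    """Return True if the quadratic form w^T H w is convex.

    Only the symmetric part of H affects w^T H w, so we test
    whether its smallest eigenvalue is (numerically) non-negative.
    """
    H_sym = 0.5 * (H + H.T)
    return bool(np.linalg.eigvalsh(H_sym).min() >= -tol)

# Hypothetical examples, not the matrix from eq22.
H_psd = np.array([[2.0, 1.0],
                  [1.0, 2.0]])    # eigenvalues 1 and 3
H_indef = np.array([[1.0, 2.0],
                    [2.0, 1.0]])  # eigenvalues -1 and 3

convex_case = is_convex_qp(H_psd)
nonconvex_case = is_convex_qp(H_indef)
```

A convex QP can be handed to any standard QP solver, and every local minimum it returns is a global one.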

Figures (4)

  • Figure 1: The results of the Friedman test conducted on all comparison algorithms for the 4 clustering evaluation metrics: (a) ACC, (b) NMI, (c) PUR, and (d) ARI. The values depicted in the graph represent the average rank of the algorithms. Our algorithm is shown by the blue lines; black lines mark the algorithms that overlap with it, and red lines mark those that do not.
  • Figure 2: The trends in the 4 clustering metrics for all algorithms with an increasing number of sample elements are examined on the flower17 and CCV datasets.
  • Figure 3: A comparison of the learned kernel weights among different algorithms is conducted on 2 datasets.
  • Figure 4: The iterative trend of the objective function for the proposed model is examined on 6 datasets.

Theorems & Definitions (2)

  • Lemma 4.1
  • proof