Fuzzy K-Means Clustering without Cluster Centroids

Yichen Bao, Han Lu, Quanxue Gao

TL;DR

A novel Fuzzy K-Means clustering algorithm is proposed that eliminates the reliance on cluster centroids entirely and obtains the membership matrix solely from a distance matrix. Because the distance between sample points can then be measured more flexibly, the algorithm's performance and robustness improve.

Abstract

Fuzzy K-Means clustering is a critical technique in unsupervised data analysis. Unlike traditional hard clustering algorithms such as K-Means, it allows data points to belong to multiple clusters with varying degrees of membership, determined through iterative optimization to establish optimal cluster centers and memberships, thereby achieving fuzzy partitioning of data. However, the performance of popular Fuzzy K-Means algorithms is sensitive to the selection of initial cluster centroids and is also affected by noise when updating mean cluster centroids. To address these challenges, this paper proposes a novel Fuzzy K-Means clustering algorithm that entirely eliminates the reliance on cluster centroids, obtaining membership matrices solely through distance matrix computation. This innovation enhances flexibility in distance measurement between sample points, thus improving the algorithm's performance and robustness. The paper also establishes theoretical connections between the proposed model and popular Fuzzy K-Means clustering techniques. Experimental results on several real datasets demonstrate the effectiveness of the algorithm.
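
For orientation, the conventional centroid-based procedure that the abstract contrasts against can be sketched as follows. This is a minimal illustration of textbook fuzzy K-means (fuzzy c-means), not the paper's proposed method. The function name `fuzzy_kmeans`, the fuzzifier `m`, and the optional `init` argument are assumptions introduced for this sketch.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def fuzzy_kmeans(X, k, m=2.0, iters=50, seed=0, init=None):
    """Textbook centroid-based fuzzy K-means (illustrative baseline only).

    X    : list of points (tuples or lists of floats)
    k    : number of clusters
    m    : fuzzifier (> 1); hypothetical default of 2.0
    init : optional initial centroids; otherwise sampled from X
    """
    rng = random.Random(seed)
    n, dim = len(X), len(X[0])
    # The result depends on this choice, which is the sensitivity
    # to initial centroids that the abstract mentions.
    centroids = [list(c) for c in (init if init is not None else rng.sample(X, k))]
    Y = [[0.0] * k for _ in range(n)]
    for _ in range(iters):
        # Membership update: y_ij is proportional to d_ij^(-1/(m-1)),
        # normalized so that each row of Y sums to 1.
        for i, x in enumerate(X):
            d = [max(sq_dist(x, c), 1e-12) for c in centroids]
            inv = [dj ** (-1.0 / (m - 1.0)) for dj in d]
            s = sum(inv)
            Y[i] = [v / s for v in inv]
        # Centroid update: weighted mean with weights y_ij^m.
        # A noisy point with non-zero membership shifts this mean.
        for j in range(k):
            w = [Y[i][j] ** m for i in range(n)]
            total = sum(w)
            centroids[j] = [
                sum(w[i] * X[i][t] for i in range(n)) / total for t in range(dim)
            ]
    return Y, centroids


X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
Y, centroids = fuzzy_kmeans(X, 2, init=[(0.0, 0.0), (10.0, 10.0)])
for row in Y:
    print([round(v, 3) for v in row])
```

Both update steps pass through the centroids, so the initialization and any noisy points influence every membership value. That dependence is what a centroid-free formulation removes.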

Paper Structure

This paper contains 24 sections, 1 theorem, 26 equations, 2 figures, 4 tables, 1 algorithm.

Key Result

Lemma 1

Let $d_{ij} = \left\|\textbf{x}_i -\textbf{x}_j\right\|_2^2$ be the squared Euclidean distance between samples $\textbf{x}_i$ and $\textbf{x}_j$, and let $\textbf{Y}$ be the membership matrix with $\sum_j y_{ij}= 1$ and $y_{ij} \ge 0$. Then the following formula holds
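
The formula itself is not reproduced here, so the sketch below does not restate the lemma. As background only, it checks numerically a standard identity in the same notation: for non-negative weights $w_i$ with weighted mean $\textbf{c}$, the centroid-based scatter $\sum_i w_i \left\|\textbf{x}_i - \textbf{c}\right\|_2^2$ equals $\frac{1}{2\sum_i w_i}\sum_{i,j} w_i w_j d_{ij}$, which involves pairwise distances only. Whether Lemma 1 takes exactly this form is an assumption; the weights `w` and both function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def scatter_via_centroid(X, w):
    """Weighted scatter around the weighted mean (uses a centroid)."""
    total = sum(w)
    dim = len(X[0])
    c = [sum(wi * x[t] for wi, x in zip(w, X)) / total for t in range(dim)]
    return sum(wi * sq_dist(x, c) for wi, x in zip(w, X))


def scatter_via_pairwise(D, w):
    """Same quantity from the pairwise distance matrix D alone (no centroid)."""
    total = sum(w)
    n = len(w)
    pair_sum = sum(w[i] * w[j] * D[i][j] for i in range(n) for j in range(n))
    return pair_sum / (2.0 * total)


# Hypothetical data: four 2-D points and the weights of one cluster.
X = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
w = [0.9, 0.8, 0.7, 0.1]
D = [[sq_dist(a, b) for b in X] for a in X]

lhs = scatter_via_centroid(X, w)
rhs = scatter_via_pairwise(D, w)
print(lhs, rhs, abs(lhs - rhs) < 1e-9)
```

Because the right-hand side needs only `D`, any other pairwise distance can in principle be substituted for the squared Euclidean one. This is consistent with the flexibility in distance measurement described in the abstract, although the equality with the centroid form is only guaranteed for squared Euclidean distances.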

Figures (2)

  • Figure 1: Performance of the FKMWC vs. $\lambda$ on 6 benchmark datasets with squared Euclidean distance (Ours_d) and K-nearest neighbor distance (Ours_kd).
  • Figure 2: The value of the FKMWC objective function and clustering performance over iterations on 6 benchmark datasets.

Theorems & Definitions (2)

  • Lemma 1
  • Proof 1