Fuzzy K-Means Clustering without Cluster Centroids

Yichen Bao; Han Lu; Quanxue Gao

TL;DR

A novel Fuzzy K-Means clustering algorithm is proposed that entirely eliminates the reliance on cluster centroids and obtains the membership matrices solely through distance matrix computation. This makes the distance measure between sample points flexible, which improves the algorithm's performance and robustness.

Abstract

Fuzzy K-Means clustering is a critical technique in unsupervised data analysis. Unlike hard clustering algorithms such as K-Means, it allows each data point to belong to multiple clusters with varying degrees of membership. The cluster centers and memberships are determined through iterative optimization, which yields a fuzzy partition of the data. However, the performance of popular Fuzzy K-Means algorithms is sensitive to the selection of initial cluster centroids and is also affected by noise when the mean cluster centroids are updated. To address these challenges, this paper proposes a novel Fuzzy K-Means clustering algorithm that entirely eliminates the reliance on cluster centroids, obtaining membership matrices solely through distance matrix computation. This makes the distance measure between sample points flexible, which improves the algorithm's performance and robustness. The paper also establishes theoretical connections between the proposed model and popular Fuzzy K-Means clustering techniques. Experimental results on several real datasets demonstrate the effectiveness of the algorithm.

Paper Structure

This paper contains 24 sections, 1 theorem, 26 equations, 2 figures, 4 tables, and 1 algorithm.

Introduction
Related Works
K-Means Clustering
Fuzzy K-Means Clustering
Fuzzy K-Means Clustering
Robust and Sparse Fuzzy K-Means Clustering
Methodology
Problem Formulation and Objective
Distance Matrix
Squared Euclidean Distance
K-Nearest Neighbor Distance
Butterworth Distance
Kernel Distance
Optimization Algorithm
Computational Complexity Analysis
...and 9 more sections

Key Result

Lemma 1

Let $d_{ij} = \left\|\textbf{x}_i -\textbf{x}_j\right\|_2^2$ be the squared Euclidean distance between samples $\textbf{x}_i$ and $\textbf{x}_j$, and let $\textbf{Y}$ be the membership matrix with $\sum_j y_{ij}= 1$ and $y_{ij} \ge 0$. Then the following formula holds
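
To make the quantities in the lemma concrete, the following is a minimal sketch in plain Python of the pairwise squared Euclidean distance $d_{ij}$ and of the two membership constraints. It is only an illustration of the definitions, not the paper's algorithm; the function names, the row-per-sample layout of `X`, and the toy values are assumptions introduced here.

```python
def squared_euclidean_distance_matrix(X):
    """Return D with D[i][j] = ||x_i - x_j||_2^2, samples given as rows of X."""
    n = len(X)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            D[i][j] = D[j][i] = d  # the distance is symmetric
    return D


def is_valid_membership(Y, tol=1e-9):
    """Check the lemma's constraints: y_ij >= 0 and every row of Y sums to 1."""
    return all(
        all(y >= 0 for y in row) and abs(sum(row) - 1.0) <= tol
        for row in Y
    )


# Toy data (hypothetical): three 2-D samples and memberships over two clusters.
X = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
D = squared_euclidean_distance_matrix(X)
Y = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
print(D)
print(is_valid_membership(Y))
```

Only the distance matrix `D` and the membership matrix `Y` appear here, and no cluster centroid is computed, which matches the inputs the lemma is stated in terms of.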

Figures (2)

Figure 1: Performance of the FKMWC vs. $\lambda$ on 6 benchmark datasets with squared Euclidean distance (Ours_d) and K-nearest neighbor distance (Ours_kd).
Figure 2: The value of the FKMWC objective function and the clustering performance over iterations on 6 benchmark datasets.

Theorems & Definitions (2)

Lemma 1
Proof 1
