A new validity measure for fuzzy c-means clustering

Dae-Won Kim, Kwang H. Lee

TL;DR

This work tackles the problem of validating fuzzy partitions produced by fuzzy c-means and selecting the number of clusters $c$ by introducing a proximity-based cluster validity index. The proposed measure, $V_{proposed}(U,V:X)$, represents each fuzzy cluster as a fuzzy set and computes inter-cluster proximity across all cluster pairs to capture both overlap and inverse separation, with lower values indicating better partitions. By minimizing $V_{proposed}$ over $c\in\{2,...,c_{max}\}$, the method chooses the optimal partition. Experimental results on five benchmark datasets show that $V_{proposed}$ outperforms seven established indexes in identifying the appropriate number of clusters, suggesting a more reliable, geometry-aware approach to fuzzy-clustering validation and potential extensions to color image clustering.

Abstract

A new cluster validity index is proposed for fuzzy clusters obtained from the fuzzy c-means algorithm. The proposed validity index exploits inter-cluster proximity between fuzzy clusters. Inter-cluster proximity is used to measure the degree of overlap between clusters. A low proximity value indicates well-partitioned clusters. The best fuzzy c-partition is obtained by minimizing inter-cluster proximity with respect to c. Well-known data sets are tested to show the effectiveness and reliability of the proposed index.
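
The selection procedure described above (run fuzzy c-means for each candidate $c$, score the resulting partition, keep the $c$ with the lowest score) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names `fuzzy_c_means`, `pairwise_overlap`, and `select_c` are hypothetical, the fuzzifier `m=2.0` is an assumed default, and `pairwise_overlap` is a simple stand-in (the mean fuzzy intersection of memberships over all cluster pairs), not the paper's inter-cluster proximity, whose exact definition is not reproduced in this summary. The stand-in should not be expected to reproduce the paper's results.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means. Returns memberships U (c x n) and centroids V (c x d)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised so each point's memberships sum to 1.
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centroids are membership-weighted means of the data.
        V = (W @ X) / W.sum(axis=1, keepdims=True)
        # Distances from every centroid to every point, floored to avoid division by zero.
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V


def pairwise_overlap(U):
    """Stand-in index (an assumption, NOT the paper's measure):
    mean over all cluster pairs of the average of min(u_i, u_j)."""
    c = U.shape[0]
    total, pairs = 0.0, 0
    for i in range(c - 1):
        for j in range(i + 1, c):
            total += np.minimum(U[i], U[j]).mean()
            pairs += 1
    return total / pairs


def select_c(X, c_max, index=pairwise_overlap):
    """Score each c in {2, ..., c_max} and return the c with the lowest score."""
    scores = {c: index(fuzzy_c_means(X, c)[0]) for c in range(2, c_max + 1)}
    return min(scores, key=scores.get), scores
```

Calling `select_c(X, c_max)` on an `(n, d)` array returns the chosen $c$ together with the score for every candidate. To experiment with a different validity measure, pass any function of the membership matrix as `index`.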

Paper Structure

This paper contains 7 sections, 12 equations, 3 figures, and 3 tables.

Figures (3)

  • Figure 1: Two different fuzzy partitions $(U^{(a)},V^{(a)})$ and $(U^{(b)},V^{(b)})$ containing the same separation distance between cluster centroids
  • Figure 2: Proximity values $f(\mu)$ at membership degree $\mu$ between two fuzzy clusters
  • Figure 3: (a) BENSAID data set (optimal $c$ is three) (b) STARFIELD data set (optimal $c$ is eight)

Theorems & Definitions (2)

  • Definition 1: Inter-cluster proximity
  • Definition 2: Proposed validity index