
Fast and Eager k-Medoids Clustering: O(k) Runtime Improvement of the PAM, CLARA, and CLARANS Algorithms

Erich Schubert, Peter J. Rousseeuw

TL;DR

This work tackles the computational bottleneck of k-medoids clustering (PAM) for non-Euclidean data by introducing an O(k)-fold speedup in the SWAP phase through caching and loop-structure rearrangements. It further introduces FasterPAM, which swaps eagerly, performing multiple beneficial swaps per iteration and thereby reducing the number of iterations while preserving solution quality. The approach extends naturally to CLARA and CLARANS, enabling faster, scalable clustering on large datasets and high k values, with reported speedups of up to 458x (k=100) and 1191x (k=200) in the SWAP phase. Experiments on OR-Library datasets, plant-leaf textures, Optical Digits, and MNIST indicate that FasterPAM reaches solutions whose total deviation (TD) is identical or nearly identical to PAM's while running substantially faster, and the authors provide open-source implementations for broader adoption.
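The caching idea can be illustrated with a small sketch. It is not the authors' implementation: the function names (`build_cache`, `swap_deltas`, `brute_force_deltas`) and the random test data are assumptions made for illustration. The sketch stores, for every point, its nearest medoid and the dissimilarities to the nearest and second-nearest medoid, and then evaluates the change in total deviation for swapping one candidate against all k medoids in a single pass over the data, rather than in k separate passes.

```python
import numpy as np

def total_deviation(D, medoids):
    """Sum of each point's dissimilarity to its nearest medoid."""
    return D[:, medoids].min(axis=1).sum()

def build_cache(D, medoids):
    """Per point: position of the nearest medoid, and the dissimilarities
    to the nearest and second-nearest medoid (requires k >= 2)."""
    dm = D[:, medoids]                       # n x k dissimilarities
    order = np.argsort(dm, axis=1)
    rows = np.arange(D.shape[0])
    nearest = order[:, 0]
    return nearest, dm[rows, nearest], dm[rows, order[:, 1]]

def swap_deltas(D, k, cache, c):
    """Change in total deviation for replacing each of the k medoids
    by candidate c, computed in one pass using the cache."""
    nearest, dn, ds = cache
    doc = D[:, c]
    # Loss of removing medoid i without replacement: its points fall
    # back to their second-nearest medoid.
    delta = np.bincount(nearest, weights=ds - dn, minlength=k)
    # Points closer to c than to their nearest medoid move to c no
    # matter which medoid is removed: a term shared by all k swaps.
    closer = doc < dn
    shared = (doc[closer] - dn[closer]).sum()
    # For those points the removal loss of their own medoid is undone.
    delta += np.bincount(nearest[closer], weights=(dn - ds)[closer], minlength=k)
    # Points for which c lies between nearest and second nearest move
    # to c only if their own medoid is the one removed.
    between = ~closer & (doc < ds)
    delta += np.bincount(nearest[between], weights=(doc - ds)[between], minlength=k)
    return delta + shared

def brute_force_deltas(D, medoids, c):
    """Reference: recompute the total deviation for each of the k swaps."""
    base = total_deviation(D, medoids)
    out = []
    for i in range(len(medoids)):
        trial = list(medoids)
        trial[i] = c
        out.append(total_deviation(D, trial) - base)
    return np.array(out)

# Hypothetical data: 40 random 2-D points with Euclidean dissimilarities.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
medoids = [0, 1, 2, 3]
cache = build_cache(D, medoids)
fast = swap_deltas(D, len(medoids), cache, c=10)
slow = brute_force_deltas(D, medoids, c=10)
```

Here `fast` and `slow` agree up to floating-point error, but the cached version touches each point once per candidate instead of once per (candidate, medoid) pair. A real implementation would also update the cache incrementally after each accepted swap rather than rebuild it.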

Abstract

Clustering non-Euclidean data is difficult, and one of the most used algorithms besides hierarchical clustering is Partitioning Around Medoids (PAM), also simply referred to as k-medoids clustering. In Euclidean geometry the mean (as used in k-means) is a good estimator for the cluster center, but it does not exist for arbitrary dissimilarities. PAM uses the medoid instead, the object with the smallest dissimilarity to all others in the cluster. This notion of centrality can be used with any (dis-)similarity, and thus is of high relevance to many domains and applications. A key issue with PAM is its high run time cost. We propose modifications to the PAM algorithm that achieve an O(k)-fold speedup in the second ("SWAP") phase of the algorithm, but will still find the same results as the original PAM algorithm. If we relax the choice of swaps performed (while retaining comparable quality), we can further accelerate the algorithm by eagerly performing additional swaps in each iteration. With the substantially faster SWAP, we can now explore faster initialization strategies, because (i) the classic ("BUILD") initialization now becomes the bottleneck, and (ii) our swap is fast enough to compensate for worse starting conditions. We also show how the CLARA and CLARANS algorithms benefit from the proposed modifications. While we do not study the parallelization of our approach in this work, it can easily be combined with earlier approaches to use PAM and CLARA on big data (some of which use PAM as a subroutine, hence can immediately benefit from these improvements), where the performance with high k becomes increasingly important. In experiments on real data with k=100 and k=200, we observed speedups of 458x and 1191x, respectively, compared to the original PAM SWAP algorithm, making PAM applicable to larger data sets, and in particular to higher k.
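As a small illustration of the medoid definition above, the following sketch picks the medoid of a cluster from a precomputed dissimilarity matrix. The function name `medoid_index` and the toy data are assumptions made for this example. Only pairwise dissimilarities are needed, so the same code would work for dissimilarities where no mean can be computed.

```python
import numpy as np

def medoid_index(D, members):
    """Return the member whose summed dissimilarity to all other
    members of the cluster is smallest (ties: first such member)."""
    sub = D[np.ix_(members, members)]        # dissimilarities within the cluster
    return members[int(np.argmin(sub.sum(axis=1)))]

# Toy data: five points on a line, absolute difference as dissimilarity.
points = np.array([0.0, 2.0, 3.0, 10.0, 11.0])
D = np.abs(points[:, None] - points[None, :])

m = medoid_index(D, [0, 1, 2, 3, 4])         # index 2, i.e. the point 3.0
mean = points.mean()                         # 5.2, which is not a data point
```

Unlike the mean, the medoid is always one of the objects in the cluster, which is why it remains well defined for arbitrary dissimilarities.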

Paper Structure

This paper contains 29 sections, 12 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Problematic example for the alternating heuristic: the $k$-means style approach is stuck in the top solution, while the SWAP heuristic can reach better solutions by reassigning points during the swap.
  • Figure 2: Run time of PAM SWAP (SWAP only, without DAISY, without BUILD)
  • Figure 3: Run time comparison of different variations and derived algorithms.
  • Figure 4: Number of iterations and swaps
  • Figure 5: Loss ($\mathit{TD}$) and runtime with approximative methods
  • ...and 4 more figures