Breaking the Reclustering Barrier in Centroid-based Deep Clustering

Lukas Miklautz, Timo Klein, Kevin Sidak, Collin Leiber, Thomas Lang, Andrii Shkabrii, Sebastian Tschiatschek, Claudia Plant

TL;DR

Centroid-based deep clustering often hits a performance ceiling known as the reclustering barrier, where periodic reclustering fails to meaningfully alter the latent space or improve results. BRB addresses this by coupling soft weight resets with reclustering, plus optional momentum resets, to induce structured, persistent perturbations that expand the space of clustering targets while preserving knowledge. Across eight datasets and multiple DC baselines (DEC, IDEC, DCN), BRB yields consistent gains, enables training from scratch, and, when combined with contrastive learning (e.g., SimCLR, SCAN), reaches or surpasses state-of-the-art performance on challenging benchmarks. The approach is lightweight and broadly applicable, and it provides new insights into how exploration, knowledge preservation, and target adaptation can overcome early optimization barriers in deep clustering.

Abstract

This work investigates an important phenomenon in centroid-based deep clustering (DC) algorithms: Performance quickly saturates after a period of rapid early gains. Practitioners commonly address early saturation with periodic reclustering, which we demonstrate to be insufficient to address performance plateaus. We call this phenomenon the "reclustering barrier" and empirically show when the reclustering barrier occurs, what its underlying mechanisms are, and how it is possible to Break the Reclustering Barrier with our algorithm BRB. BRB avoids early over-commitment to initial clusterings and enables continuous adaptation to reinitialized clustering targets while remaining conceptually simple. Applying our algorithm to widely-used centroid-based DC algorithms, we show that (1) BRB consistently improves performance across a wide range of clustering benchmarks, (2) BRB enables training from scratch, and (3) BRB performs competitively against state-of-the-art DC algorithms when combined with a contrastive loss. We release our code and pre-trained models at https://github.com/Probabilistic-and-Interactive-ML/breaking-the-reclustering-barrier.
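The core idea described above, a soft weight reset followed by reclustering, can be illustrated with a minimal, dependency-free sketch. This is not the released implementation: the interpolation form `alpha * w + (1 - alpha) * fresh`, the Gaussian re-initialization, the toy `embed` function, and all function and parameter names (`soft_reset`, `recluster`, `brb_step`, `alpha`, `init_scale`) are assumptions made for illustration. Plain Lloyd's k-means stands in for whatever reclustering routine the underlying DC algorithm uses, and momentum resets are omitted.

```python
import random


def soft_reset(weights, alpha, rng, init_scale=0.1):
    """Interpolate each weight toward a freshly drawn value (assumed form).

    alpha = 1.0 leaves the weights unchanged; alpha = 0.0 is a full re-initialization.
    """
    return [alpha * w + (1.0 - alpha) * rng.gauss(0.0, init_scale) for w in weights]


def assign(points, centroids):
    """Return the index of the nearest centroid (squared Euclidean) for each point."""
    def sq_dist(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def recluster(points, k, rng, n_iter=10):
    """Plain Lloyd's k-means, used here as a stand-in reclustering step."""
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def brb_step(weights, embed, data, k, alpha, rng):
    """One illustrative BRB-style step: perturb the encoder, then recompute targets."""
    weights = soft_reset(weights, alpha, rng)
    embeddings = [embed(weights, x) for x in data]
    centroids = recluster(embeddings, k, rng)
    return weights, centroids


# Toy usage: the "encoder" is an element-wise scaling of 2-D inputs.
def embed(weights, x):
    return [w * xi for w, xi in zip(weights, x)]


rng = random.Random(0)
data = [[rng.gauss(m, 0.2), rng.gauss(m, 0.2)] for m in (0.0, 3.0) for _ in range(20)]
weights, centroids = brb_step([1.0, 1.0], embed, data, k=2, alpha=0.8, rng=rng)
```

In a training loop, a step like this would be applied periodically between ordinary optimization epochs, so that the new centroids serve as updated clustering targets for the perturbed encoder.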

Paper Structure

This paper contains 72 sections, 20 equations, 19 figures, 19 tables, 2 algorithms.

Figures (19)

  • Figure 1: Breaking the reclustering barrier with BRB. Uniting the deep clustering method IDEC with BRB (IDEC+BRB) breaks through the performance barrier (dashed line) encountered when using IDEC alone or combining it with reclustering (IDEC+Recluster). Performance is measured by clustering accuracy on USPS.
  • Figure 2: Why does reclustering not work? On GTSRB, DCN with reclustering (orange line) shows minimal changes compared to unmodified DCN (blue line), both in the embedded space (intra/inter-CD) and in the explored clusterings (CL Change). This effect generalizes to other datasets (Figure \ref{fig:main_analysis}).
  • Figure 3: The effect of BRB during training of DEC. (1) Pre-BRB: Before BRB is applied, DEC strongly compresses the five shown clusters (low variation within clusters) and mixes different ground truth classes of USPS (indicated by color), leading to a performance plateau (cf. epochs 46 to 50 in the NMI plot). (2) BRB applies a weight reset to increase the variation within the clusters, followed by reclustering. After a small performance drop in NMI (epoch 51), this leads to a steep increase (epochs 52 to 59). (3) Post-BRB: After applying BRB, the clusters are compressed again by DEC until BRB is applied another time.
  • Figure 4: Improved performance of BRB. Relative change in average accuracy against the unmodified DCN algorithm for BRB and important ablations with pre-training: DCN+Reset refers to performing only weight resets (Eq. \ref{eq:w_reset_shrink_perturb_orig}); DCN+Recluster refers to only reclustering. Our DCN+BRB with both weight resets and reclustering consistently improves performance for DCN.
  • Figure 5: BRB enables training from scratch. Relative improvements of clustering accuracy for DEC and IDEC when using BRB without pre-training.
  • ...and 14 more figures
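
Several captions above report clustering accuracy. The evaluation code is not shown here, so the following is only a sketch under the assumption that the usual convention applies: predicted cluster ids are matched one-to-one to ground-truth classes so that the number of agreeing samples is maximal. The function name `clustering_accuracy` is hypothetical, and the matching is done by brute force over permutations, which is only practical for a handful of clusters. Evaluations at realistic scale typically solve the same assignment problem with the Hungarian algorithm.

```python
from collections import Counter
from itertools import permutations


def clustering_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one mapping of cluster ids to class ids."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    # counts[(cluster, cls)] = number of samples in `cluster` whose true class is `cls`
    counts = Counter(zip(y_pred, y_true))
    clusters = sorted(set(y_pred))
    classes = sorted(set(y_true))
    # Pad with placeholder targets so surplus clusters can remain unmatched.
    targets = classes + [None] * max(0, len(clusters) - len(classes))
    best = max(
        sum(counts[(c, t)] for c, t in zip(clusters, perm))
        for perm in permutations(targets, len(clusters))
    )
    return best / len(y_true)


# A pure relabeling of the ground truth is a perfect clustering.
perfect = clustering_accuracy([0, 0, 1, 1, 2, 2], [2, 2, 0, 0, 1, 1])
# One of four samples ends up in the wrong cluster.
partial = clustering_accuracy([0, 0, 1, 1], [1, 1, 1, 0])
```

Because the score is invariant to how clusters are numbered, it can be compared across runs and across reclustering steps even though the cluster ids themselves change.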