Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation

Louis Mahon, Thomas Lukasiewicz

TL;DR

This work tackles collapse in online deep clustering without data augmentation by introducing a Bayesian hard-assignment framework. It derives a greedy combination-assignment objective that balances centroid proximity, prior cluster probabilities, and batch-size penalties, linking to mutual information maximization between batch indices and cluster labels. Empirically, the data-augmentation-free method outperforms existing partition-support approaches across multiple datasets, yielding stronger clustering and more informative representations. The results demonstrate that regularizing hard assignments is more effective than soft-assignment strategies for preventing collapse in fully online clustering scenarios.

Abstract

Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed. While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster. Successful existing models have employed various techniques to avoid this problem, most of which require data augmentation or which aim to make the average soft assignment across the dataset the same for each cluster. We propose a method that does not require data augmentation, and that, differently from existing methods, regularizes the hard assignments. Using a Bayesian framework, we derive an intuitive optimization objective that can be straightforwardly included in the training of the encoder network. Tested on four image datasets and one human-activity recognition dataset, it consistently avoids collapse more robustly than other methods and leads to more accurate clustering. We also conduct further experiments and analyses justifying our choice to regularize the hard cluster assignments. Code is available at https://github.com/Lou1sM/online_hard_clustering.
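The idea of regularizing hard assignments within a batch can be illustrated with a small sketch. This is not the authors' derived objective: the scoring rule below (negative squared distance to the centroid, plus a log prior, minus a linear penalty on how many points in the batch already went to that cluster) is an assumed simplification, and the names `greedy_hard_assign` and `penalty` are hypothetical. The actual method is in the linked repository.

```python
import math

def greedy_hard_assign(points, centroids, log_priors, penalty=1.0):
    """Assign each point in a batch to one cluster, one point at a time.

    Assumed score for cluster k (illustrative, not the paper's formula):
        -squared distance to centroid k
        + log prior of cluster k
        - penalty * (points already assigned to k in this batch)
    """
    counts = [0] * len(centroids)
    labels = []
    for z in points:
        scores = []
        for k, c in enumerate(centroids):
            sq_dist = sum((zi - ci) ** 2 for zi, ci in zip(z, c))
            scores.append(-sq_dist + log_priors[k] - penalty * counts[k])
        # Hard assignment: take the best-scoring cluster, then update its count.
        best = max(range(len(scores)), key=scores.__getitem__)
        labels.append(best)
        counts[best] += 1
    return labels, counts

# Toy near-collapsed batch: every point lies close to centroid 0.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
centroids = [(0.0, 0.0), (1.0, 0.0)]
log_priors = [math.log(0.5), math.log(0.5)]

unregularized, _ = greedy_hard_assign(points, centroids, log_priors, penalty=0.0)
regularized, counts = greedy_hard_assign(points, centroids, log_priors, penalty=1.0)
```

With these toy values, `penalty=0.0` sends all four points to cluster 0, while `penalty=1.0` yields the labels `[0, 1, 0, 1]`, so the batch is split evenly even though every point is nearest to the same centroid.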

Paper Structure

This paper contains 25 sections, 3 theorems, 36 equations, 1 figure, 6 tables, 1 algorithm.

Key Result

Lemma A.1

Let $X \in \mathbb{R}^{N \times K}$ be the matrix of already-made assignments in the current batch, and let $H^{(k)}$ be the marginal entropy after the new hard assignment is made to cluster $k$. Then

Figures (1)

  • Figure 1: Top: with no regularization, the model collapses. Middle: soft regularization encourages uniform soft assignments but the argmax is constant. Bottom: hard regularization causes the argmax to change, and avoids collapse.
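The contrast between the middle and bottom panels can be checked numerically. The sketch below uses made-up soft assignments (an assumption for illustration, not data from the paper) in which every point slightly prefers cluster 0. The mean soft assignment is then almost uniform, so a regularizer on soft assignments would be nearly satisfied, yet the argmax is the same for every point and the hard-assignment distribution has zero entropy.

```python
import math

def entropy(probs):
    """Shannon entropy in nats, skipping zero-probability entries."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def mean_soft_assignment(soft):
    """Average the per-point soft assignment vectors over the batch."""
    n = len(soft)
    return [sum(row[k] for row in soft) / n for k in range(len(soft[0]))]

def hard_assignment_distribution(soft):
    """Fraction of points whose argmax falls in each cluster."""
    num_clusters = len(soft[0])
    counts = [0] * num_clusters
    for row in soft:
        counts[max(range(num_clusters), key=row.__getitem__)] += 1
    return [c / len(soft) for c in counts]

# Hypothetical batch: eight points, three clusters, near-uniform soft assignments.
soft = [[0.34, 0.33, 0.33]] * 8

soft_entropy = entropy(mean_soft_assignment(soft))          # close to log(3)
hard_entropy = entropy(hard_assignment_distribution(soft))  # all argmaxes equal
max_entropy = math.log(3)
```

Here `soft_entropy` is within about 0.0001 nats of the maximum `log(3)`, while `hard_entropy` is zero: the soft statistics look balanced although every point receives the same hard label.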

Theorems & Definitions (7)

  • Lemma A.1
  • proof
  • Lemma A.2
  • proof
  • Remark A.3
  • Theorem A.4
  • proof