Doubly Stochastic Mean-Shift Clustering

Tom Trigano, Yann Sepulcre, Itshak Lapidot

TL;DR

The paper proposes Doubly Stochastic Mean-Shift (DSMS), an extension of Mean-Shift that introduces randomness not only in the trajectory updates but also in the kernel bandwidth itself. It shows that this randomized bandwidth policy acts as an implicit regularization mechanism and provides theoretical convergence results.

Abstract

Standard Mean-Shift algorithms are notoriously sensitive to the bandwidth hyperparameter, particularly in data-scarce regimes where fixed-scale density estimation leads to fragmentation and spurious modes. In this paper, we propose Doubly Stochastic Mean-Shift (DSMS), a novel extension that introduces randomness not only in the trajectory updates but also in the kernel bandwidth itself. By drawing both the data samples and the radius from a continuous uniform distribution at each iteration, DSMS explores the density landscape more effectively. We show that this randomized bandwidth policy acts as an implicit regularization mechanism, and we provide theoretical convergence results. Comparative experiments on synthetic Gaussian mixtures show that DSMS significantly outperforms standard and stochastic Mean-Shift baselines, exhibiting remarkable stability and preventing over-segmentation in sparse clustering scenarios without degrading performance in other respects.
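
The update loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a flat (uniform) kernel, an in-place update of the current point set, a point index drawn uniformly at random, and a bandwidth drawn from a continuous uniform distribution on `[h_min, h_max]`. The names `dsms_sketch`, `count_clusters`, `h_min`, `h_max`, `n_iters`, and `tol` are hypothetical and chosen for illustration only.

```python
import numpy as np


def dsms_sketch(points, h_min, h_max, n_iters=2000, seed=0):
    """Illustrative doubly stochastic mean-shift loop (assumptions in comments)."""
    rng = np.random.default_rng(seed)
    x = np.array(points, dtype=float)  # work on a copy of the data
    n = x.shape[0]
    for _ in range(n_iters):
        # First source of randomness: pick one point to move.
        i = rng.integers(n)
        # Second source of randomness: draw the bandwidth for this step.
        h = rng.uniform(h_min, h_max)
        # Assumption: flat kernel, neighbors taken from the current point set.
        dist = np.linalg.norm(x - x[i], axis=1)
        neighbors = dist <= h  # always contains point i itself
        x[i] = x[neighbors].mean(axis=0)
    return x


def count_clusters(x, tol):
    """Greedy count of distinct convergence points (hypothetical helper)."""
    centers = []
    for p in x:
        if not any(np.linalg.norm(p - c) <= tol for c in centers):
            centers.append(p)
    return len(centers)
```

Calling `dsms_sketch(data, h_min, h_max)` and then `count_clusters(result, tol)` gives an estimate of the number of clusters. Under these assumptions, the width of the bandwidth range `h_max - h_min` is the quantity varied in Figures 4 and 5.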

Paper Structure

This paper contains 19 sections, 4 theorems, 29 equations, 5 figures, and 4 tables.

Key Result

Proposition 1

Suppose the SMS with (fixed) BW $h$ updates $\mathcal{X}^{(k)}$ to $\mathcal{X}^{(k+1)}$, with $i_{k} \in [n]$ denoting the randomly chosen index at step $k$. Then it holds that [...] We also have the following upper bound on the gradient component along the direction $\mathbf{x}_{i_{k}}$:

Figures (5)

  • Figure 1: Average number of clusters with 90% confidence intervals for the different algorithms, as a function of the number of samples per cluster. The true number of clusters (3) is represented by the thin dotted line.
  • Figure 2: $K$ as a function of the class imbalance ratio as defined in lapidot_stochastic_2025
  • Figure 3: $K$ as a function of the number of clusters
  • Figure 4: $K$ as a function of the bandwidth range $h_{max}-h_{min}$
  • Figure 5: ACP and ALP as a function of $h_{max}-h_{min}$

Theorems & Definitions (4)

  • Proposition 1
  • Proposition 2
  • Corollary 1
  • Proposition 3