Doubly Stochastic Mean-Shift Clustering

Tom Trigano; Yann Sepulcre; Itshak Lapidot

TL;DR

Doubly Stochastic Mean-Shift (DSMS) is a novel extension of Mean-Shift that introduces randomness not only in the trajectory updates but also in the kernel bandwidth itself. The paper shows that this randomized bandwidth policy acts as an implicit regularization mechanism and provides theoretical convergence results.

Abstract

Standard Mean-Shift algorithms are notoriously sensitive to the bandwidth hyperparameter, particularly in data-scarce regimes where fixed-scale density estimation leads to fragmentation and spurious modes. In this paper, we propose Doubly Stochastic Mean-Shift (DSMS), a novel extension that introduces randomness not only in the trajectory updates but also in the kernel bandwidth itself. By drawing both the data samples and the radius from a continuous uniform distribution at each iteration, DSMS explores the density landscape more effectively. We show that this randomized bandwidth policy acts as an implicit regularization mechanism, and we provide theoretical convergence results. Comparative experiments on synthetic Gaussian mixtures show that DSMS significantly outperforms the standard and stochastic Mean-Shift baselines: it is markedly more stable and prevents over-segmentation in sparse clustering scenarios, with no other degradation in performance.
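The two sources of randomness described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the flat (radius) kernel, the in-place blurring-style update, the fixed iteration count, and all names (`dsms_sketch`, `h_min`, `h_max`, `n_iter`) are assumptions made for illustration only.

```python
import numpy as np

def dsms_sketch(X, h_min, h_max, n_iter, seed=0):
    """Hypothetical doubly stochastic mean-shift sketch (assumed update rule)."""
    rng = np.random.default_rng(seed)
    X = np.array(X, dtype=float)  # work on a copy of the data
    n = X.shape[0]
    for _ in range(n_iter):
        # First source of randomness: pick one point to update.
        i = rng.integers(n)
        # Second source of randomness: draw the radius uniformly.
        h = rng.uniform(h_min, h_max)
        # Assumed flat kernel: average the current points within radius h.
        dist = np.linalg.norm(X - X[i], axis=1)
        neighbors = dist <= h  # always contains point i itself
        X[i] = X[neighbors].mean(axis=0)
    return X

# Toy data: two tight, well-separated 2-D blobs (illustrative only).
data_rng = np.random.default_rng(1)
X0 = np.vstack([
    data_rng.normal(0.0, 0.1, size=(20, 2)),
    data_rng.normal(10.0, 0.1, size=(20, 2)),
])
X1 = dsms_sketch(X0, h_min=0.5, h_max=1.5, n_iter=2000)
```

In this sketch every update replaces a point by an average of current points, so points never leave the convex hull of the data; on the toy data each blob contracts toward a single location while the two blobs stay apart.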

Paper Structure (19 sections, 4 theorems, 29 equations, 5 figures, 4 tables)

Introduction
Doubly Stochastic mean-shift
Blurring mean-shift and Stochastic mean-shift
Doubly stochastic mean-shift
Theoretical results
Submartingale property of Doubly Stochastic Mean Shift
Clustering and convergence results for DSMS
Practical Convergence Diagnosis
Numerical Experiments
Performance on Underrepresented Clusters
Exhaustive Comparison with SMS
Influence of the bandwidth range
Conclusion
Proof of Proposition \ref{prop:non_decreasing_L}
Proof of Proposition \ref{prop:submartingale_for_Lh}
...and 4 more sections

Key Result

Proposition 1

Suppose the SMS with fixed bandwidth (BW) $h$ updates $\mathcal{X}^{(k)}$ to $\mathcal{X}^{(k+1)}$. Then it holds that with $i_{k} \in [n]$ denoting the randomly chosen index at step $k$. We also have the following upper bound on the gradient component along the direction $\mathbf{x}_{i_{k}}$:

Figures (5)

Figure 1: Average number of clusters with 90% confidence intervals for the different algorithms, as a function of the number of samples per cluster. The true number of clusters (3) is represented by the thin dotted line.
Figure 2: $K$ as a function of the class imbalance ratio as defined in lapidot_stochastic_2025
Figure 3: $K$ as a function of the number of clusters
Figure 4: $K$ as a function of the bandwidth range $h_{max}-h_{min}$
Figure 5: ACP and ALP as a function of $h_{max}-h_{min}$

Theorems & Definitions (4)

Proposition 1
Proposition 2
Corollary 1
Proposition 3
