Stochastic Mean-Shift Clustering

Itshak Lapidot, Yann Sepulcre, Tom Trigano

TL;DR

This paper introduces Stochastic Mean Shift (SMS), a stochastic, asynchronous variant of mean-shift clustering in which a randomly chosen data point is updated at each iteration via a mean-shift gradient step. The authors establish theoretical properties, including a non-decreasing KDE objective $L$ and eventual clustering with vanishing diameters, and provide practical convergence criteria. Empirically, SMS demonstrates competitive or superior performance to deterministic MS and often matches or beats Blurring Mean Shift (BMS) on synthetic multi-modal data, while offering linear per-update complexity and better scalability to large datasets. The approach is validated on speaker clustering tasks using PLDA-based distances, illustrating practical impact for diarization and other high-dimensional clustering problems. Overall, SMS offers a robust, scalable alternative to classical mean-shift methods with strong convergence behavior and broad applicability.

Abstract

We present a stochastic version of the mean-shift clustering algorithm. In this stochastic version, a randomly chosen sequence of data points moves according to partial gradient ascent steps of the objective function. Theoretical results illustrating the convergence of the proposed approach are presented, and its relative performance is evaluated on synthesized 2-dimensional samples generated by a Gaussian mixture distribution and compared with state-of-the-art methods. It can be observed that in most cases the stochastic mean-shift clustering outperforms the standard mean-shift. As a practical application, we also illustrate the use of the presented method for speaker clustering.
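The update described above (one randomly chosen point moved per iteration) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel, the choice to replace the selected point by the kernel-weighted mean of the current positions of all points, the fixed number of updates in place of a convergence criterion, and the names `stochastic_mean_shift`, `bandwidth`, and `n_updates` are all assumptions made for this example.

```python
import numpy as np


def stochastic_mean_shift(points, bandwidth, n_updates, seed=0):
    """Hypothetical SMS sketch: move one randomly chosen point per update."""
    rng = np.random.default_rng(seed)
    x = np.array(points, dtype=float)  # work on a copy of the data
    n = x.shape[0]
    for _ in range(n_updates):
        i = rng.integers(n)  # index of the point updated at this step
        # Gaussian kernel weights between point i and every current point
        # (assumed kernel; the cost of one update is linear in n).
        sq_dist = np.sum((x - x[i]) ** 2, axis=1)
        weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
        # Mean-shift step: replace point i by the weighted mean.
        x[i] = weights @ x / weights.sum()
    return x


# Two well-separated 2-dimensional Gaussian blobs, 30 points each.
data_rng = np.random.default_rng(1)
data = np.vstack([
    data_rng.normal(loc=(-3.0, 0.0), scale=0.3, size=(30, 2)),
    data_rng.normal(loc=(3.0, 0.0), scale=0.3, size=(30, 2)),
])
moved = stochastic_mean_shift(data, bandwidth=1.0, n_updates=6000)
```

With these settings, the points of each blob should contract toward a common location, so clusters can be read off by grouping points that end up close together. The bandwidth and the number of updates are illustrative values, not ones taken from the paper.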

Paper Structure

This paper contains 18 sections, 2 theorems, 23 equations, 6 figures, 6 tables.

Key Result

Proposition 1

Given any initial state $\mathcal{X}^{(0)} \in \mathbb{R}^{dn}$, for any $k \in \mathbb{N}$ it holds that [...] for some constant $C>0$, with $i_{k} \in [n]$ denoting the randomly chosen index at step $k$. Two consequences follow from this inequality, which expresses the non-decreasing property of $L$:

Figures (6)

  • Figure 1: Results of experiment 2: (a) data with true labels; (b) unlabeled data; (c) converging paths of MS; (d) converging paths of SMS; (e) the modes found by MS; (f) the modes found by SMS; (g) the clustering found by MS; (h) the clustering found by SMS.
  • Figure 2: Results of Set $4$: (a) data with true labels; (b) unlabeled data; (c) converging paths of the deterministic mean-shift; (d) converging paths of the stochastic mean-shift; (e) the modes found by the deterministic mean-shift; (f) the modes found by the stochastic mean-shift; (g) the clustering found by the deterministic mean-shift; (h) the clustering found by the stochastic mean-shift.
  • Figure 3: Execution times for MS (blue), BMS (red), and SMS (cyan), as a function of the number of points per cluster. Dotted lines illustrate linear complexity (cyan) and quadratic complexity (red) for visual comparison.
  • Figure 4: Influence of the class imbalance on clustering results for MS (blue), BMS (red) and SMS (cyan).
  • Figure 5: Influence of the data dimension on clustering results for MS (blue), BMS (red) and SMS (cyan).
  • ...and 1 more figure

Theorems & Definitions (2)

  • Proposition 1
  • Proposition 2