Distributed Gradient Clustering: Convergence and the Effect of Initialization

Aleksandar Armacki, Himkant Sharma, Dragana Bajović, Dušan Jakovetić, Mrityunjoy Chakraborty, Soummya Kar

Abstract

We study the effects of center initialization on the performance of a family of distributed gradient-based clustering algorithms introduced in [1], which operate over connected networks of users. In the considered scenario, each user holds a local dataset and communicates only with its immediate neighbours, with the aim of finding a global clustering of the joint data. We perform extensive numerical experiments evaluating the effects of center initialization on the performance of our family of methods, demonstrating that they are more resilient to initialization than centralized gradient clustering [2]. Next, inspired by the $K$-means++ initialization [3], we propose a novel distributed center initialization scheme, which is shown to improve the performance of our methods compared to the baseline random initialization.

Paper Structure (6 sections, 1 theorem, 4 equations, 3 figures, 2 algorithms)

Key Result

Theorem 1

Let Assumptions asmpt:data-asmpt:coerc hold. For the step-size $\alpha < (\beta/\rho + \overline{\lambda}(L))^{-1}$, any initialization $\mathbf{x}^0 \in \mathbb{R}^{Kmd}$ and $\rho \geq 1$, the sequence of centers $\{\mathbf{x}^t\}_{t \in \mathbb{N}}$ generated by DGC-$\mathcal{F}_\rho$ converges t
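
To make the step-size condition concrete, the following minimal sketch evaluates the bound $(\beta/\rho + \overline{\lambda}(L))^{-1}$ for a small example network. It rests on several assumptions that are not confirmed by this excerpt: $\overline{\lambda}(L)$ is read as the largest eigenvalue of the graph Laplacian $L$, the ring topology is a hypothetical choice, and $\beta = 1$ is a placeholder for a constant that the paper defines elsewhere. The function names `ring_laplacian` and `step_size_bound` are illustrative only.

```python
import numpy as np


def ring_laplacian(m):
    """Graph Laplacian L = D - A of an undirected ring with m users (m >= 3).

    The ring topology is a hypothetical example network, not one taken
    from the paper.
    """
    A = np.zeros((m, m))
    for i in range(m):
        j = (i + 1) % m
        A[i, j] = 1.0
        A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A


def step_size_bound(L, beta, rho):
    """Evaluate (beta / rho + lambda_max(L))^{-1}.

    Assumption: the theorem's lambda-bar(L) is the largest eigenvalue of L.
    beta is a placeholder; its definition is not part of this excerpt.
    """
    if rho < 1:
        raise ValueError("the theorem requires rho >= 1")
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
    lam_max = np.linalg.eigvalsh(L)[-1]
    return 1.0 / (beta / rho + lam_max)


L = ring_laplacian(4)
bound = step_size_bound(L, beta=1.0, rho=1.0)
print(f"any step-size alpha below {bound:.4f} satisfies the condition")
```

For the four-user ring the largest Laplacian eigenvalue is 4, so with the placeholder values the bound is $1/(1 + 4) = 0.2$. Under this reading, a larger $\rho$ shrinks the $\beta/\rho$ term and loosens the bound, while a network with a larger $\overline{\lambda}(L)$ tightens it.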

Figures (3)

  • Figure 1: Homogeneous and heterogeneous data distributions across users.
  • Figure 2: Performance of methods on homogeneous and heterogeneous data.
  • Figure 3: Performance of methods on homogeneous and heterogeneous data.

Theorems & Definitions (4)

  • Definition 1
  • Definition 2
  • Definition 3
  • Theorem 1