
A Unified Framework for Center-based Clustering of Distributed Data

Aleksandar Armacki, Dragana Bajović, Dušan Jakovetić, Soummya Kar

TL;DR

It is proved that consensus fixed points of DGC-$\mathcal{F}_{\rho}$ are equivalent to fixed points of gradient clustering over the full data, guaranteeing a clustering of the full data is produced.

Abstract

We develop a family of distributed center-based clustering algorithms that work over networks of users. In the proposed scenario, each user holds a local dataset and communicates only with its immediate neighbours, with the aim of finding a clustering of the full, joint data. The proposed family, termed Distributed Gradient Clustering (DGC-$\mathcal{F}_\rho$), is parametrized by $\rho \geq 1$, which controls the proximity of users' center estimates, with $\mathcal{F}$ determining the clustering loss. Our framework allows for a broad class of smooth convex loss functions. Specialized to popular clustering losses like $K$-means and Huber loss, DGC-$\mathcal{F}_\rho$ gives rise to novel distributed clustering algorithms DGC-KM$_\rho$ and DGC-HL$_\rho$, while novel clustering losses based on Logistic and Fair functions lead to DGC-LL$_\rho$ and DGC-FL$_\rho$. We provide a unified analysis and establish several strong results under mild assumptions. First, we show that the sequence of centers generated by the methods converges to a well-defined notion of fixed point, under any center initialization and value of $\rho$. Second, we prove that, as $\rho$ increases, the family of fixed points produced by DGC-$\mathcal{F}_\rho$ converges to a notion of consensus fixed points. We show that consensus fixed points of DGC-$\mathcal{F}_\rho$ are equivalent to fixed points of gradient clustering over the full data, guaranteeing that a clustering of the full data is produced. For the special case of Bregman losses, we show that our fixed points converge to the set of Lloyd points. Extensive numerical experiments on synthetic and real data confirm our theoretical findings, show strong performance of our methods, and demonstrate the usefulness and wide range of potential applications of our general framework, such as outlier detection.

Paper Structure

This paper contains 24 sections, 15 theorems, 71 equations, 9 figures, 7 tables, 1 algorithm.

Key Result

Theorem 1

Let Assumptions asmpt:data-asmpt:g&f hold. For the step-size $\alpha < 1/(\beta/\rho + \lambda_{\max}(L))$, any initialization $\mathbf{x}^0 \in \mathbb{R}^{Kmd}$ and $\rho \geq 1$, the sequence of centers $\{\mathbf{x}^t\}_{t \in \mathbb{N}}$ generated by DGC-$\mathcal{F}_\rho$ converges to a fixed
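The step-size condition in Theorem 1 can be evaluated numerically once the network is fixed. The sketch below is only an illustration under stated assumptions: it takes $L$ to be the graph Laplacian of the communication network and $\beta$ to be a smoothness constant of the loss (neither is defined in this excerpt), and it uses the ten-user ring graph of Figure 5. The helper names `ring_laplacian` and `step_size_bound` and the values $\beta = 1$, $\rho = 10$ are hypothetical, chosen for illustration.

```python
import numpy as np


def ring_laplacian(m):
    """Graph Laplacian L = D - A of an undirected ring on m vertices.

    Assumption: the L in Theorem 1 is (built from) this graph Laplacian.
    """
    A = np.zeros((m, m))
    for i in range(m):
        j = (i + 1) % m
        A[i, j] = 1.0  # bidirectional link between user i and its successor
        A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A


def step_size_bound(beta, rho, L):
    """Upper bound 1 / (beta / rho + lambda_max(L)) on the step-size alpha."""
    lam_max = np.linalg.eigvalsh(L)[-1]  # eigvalsh returns ascending eigenvalues
    return 1.0 / (beta / rho + lam_max)


# Hypothetical values: beta = 1 and rho = 10 are illustrative, not from the paper.
L = ring_laplacian(10)
lam_max = np.linalg.eigvalsh(L)[-1]
bound = step_size_bound(beta=1.0, rho=10.0, L=L)
print(f"lambda_max(L) = {lam_max:.4f}, alpha must be below {bound:.4f}")
```

For a ring with an even number of users the largest Laplacian eigenvalue is 4, so under these assumptions the bound is $1/(\beta/\rho + 4)$. As $\rho$ grows, the $\beta/\rho$ term vanishes and the admissible step-size is limited by the network term $\lambda_{\max}(L)$ alone. If $L$ in the paper is a Kronecker product of the Laplacian with an identity matrix, its largest eigenvalue is unchanged.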

Figures (9)

  • Figure 1: An example of client-server and peer-to-peer setups in distributed learning. Vertices represent users and edges represent bidirectional communication links.
  • Figure 2: Behaviour of $J_\rho$ for different $\rho$ and $B = 1$. Upper row, left to right: DGC-KM$_\rho$, DGC-HL$_\rho$. Lower row, left to right: DGC-LL$_\rho$, DGC-FL$_\rho$.
  • Figure 3: Performance of DGC-KM, DGC-HL and ADMM-KM on noisy Iris data. We can see that DGC-HL successfully identifies the true clusters, while both DGC-KM and ADMM-KM incorrectly identify the cluster of outliers as one of the true clusters.
  • Figure 4: Data distribution across users in the heterogeneous data setup. The $x$-axis shows the users and the $y$-axis shows the number of data points per user. The bars show the classes and the proportion of samples per class available at each user. Left to right: Iris, MNIST7 and CIFAR3 datasets.
  • Figure 5: Ten users communicating over a ring graph. Unless specified otherwise, these are the default number of users and communication topology used in our experiments.
  • ...and 4 more figures

Theorems & Definitions (54)

  • Remark 1
  • Remark 2
  • Remark 3
  • Remark 4
  • Remark 5
  • Remark 6
  • Remark 7
  • Remark 8
  • Remark 9
  • Remark 10
  • ...and 44 more