Equitable Federated Learning with Activation Clustering

Antesh Upadhyay, Abolfazl Hashemi

TL;DR

This work proposes an equitable clustering-based framework in which clients are clustered according to how similar they are to each other, together with a unique way to construct the similarity matrix from activation vectors.

Abstract

Federated learning is a prominent distributed learning paradigm that incorporates collaboration among diverse clients, promotes data locality, and thus ensures privacy. These clients have their own technological, cultural, and other biases in the process of data generation. However, the present standard often ignores this bias/heterogeneity, perpetuating bias against certain groups rather than mitigating it. In response to this concern, we propose an equitable clustering-based framework where the clients are categorized/clustered based on how similar they are to each other. We propose a unique way to construct the similarity matrix that uses activation vectors. Furthermore, we propose a client weighting mechanism to ensure that each cluster receives equal importance, and we establish an $O(1/\sqrt{K})$ rate of convergence to reach an $\epsilon$-stationary solution. We assess the effectiveness of our proposed strategy against common baselines, demonstrating its efficacy in reducing the bias existing among various client clusters and consequently ameliorating algorithmic bias against specific groups.
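To make the pipeline in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes cosine similarity between per-client activation vectors, a hypothetical threshold-based grouping (`cluster_by_threshold`), and a hypothetical weighting rule (`equal_cluster_weights`) that gives every cluster the same total weight. The abstract does not specify any of these choices; all function names and the threshold value are illustrative.

```python
import numpy as np


def cosine_similarity_matrix(activations: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity; each row is one client's activation vector."""
    norms = np.linalg.norm(activations, axis=1, keepdims=True)
    unit = activations / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


def cluster_by_threshold(similarity: np.ndarray, threshold: float) -> np.ndarray:
    """Assumed clustering rule: connected components of the graph that links
    two clients whenever their similarity is at least `threshold`."""
    n = similarity.shape[0]
    labels = -np.ones(n, dtype=int)  # -1 marks "not yet assigned"
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first search over similar clients
            i = stack.pop()
            for j in np.flatnonzero(similarity[i] >= threshold):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def equal_cluster_weights(labels: np.ndarray) -> np.ndarray:
    """Assumed weighting rule: client i gets 1 / (C * |cluster(i)|), so each of
    the C clusters carries total weight 1 / C regardless of its size."""
    _, inverse, counts = np.unique(labels, return_inverse=True, return_counts=True)
    return 1.0 / (len(counts) * counts[inverse])


# Toy data: three clients with similar activations and one outlier.
activations = np.array([[1.0, 0.1], [0.9, 0.2], [1.0, 0.0], [0.0, 1.0]])
similarity = cosine_similarity_matrix(activations)
labels = cluster_by_threshold(similarity, threshold=0.9)
weights = equal_cluster_weights(labels)
```

In this toy example the single outlier client forms its own cluster and receives the same total weight as the three similar clients combined, which is the sense in which each cluster is treated with equal importance under the assumed rule.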

Paper Structure

This paper contains 16 sections, 6 theorems, 60 equations, 6 figures, 5 tables, 1 algorithm.

Key Result

Theorem 1

Suppose Assumptions as1, as-may15, as-sfo, and as-clustering hold for Equitable-FL (see alg:equitable-fl). In Equitable-FL, set $\eta =\frac{1}{4E\sqrt{3LK}}$. Define a distribution $\mathbb{P}$ for $k \in \{0,\ldots,K-1\}$ such that $\mathbb{P}(k) = \frac{(1+\zeta-1)^{(K-1-k)}}{\sum_{k=0}^{K- and the expectation is with respect to the randomness in all stochastic gradients and the random se

Figures (6)

  • Figure 1: Activation vectors
  • Figure 2: NMI comparison across various vision datasets using ResNet-18 model architectures. We partitioned the data among clients to form clusters. The $C$ values in each figure indicate the actual number of clusters we divided the clients into for each dataset. Our observations reveal that the NMIs are close to 1, suggesting that the algorithm's performance aligns closely with the true cluster labels of the clients.
  • Figure 3: Client disagreement comparison on different vision datasets using two different model architectures. The plots in the first row are generated using a simple CNN model architecture, and the plots in the second row are generated using the ResNet-18 model architecture. Equitable-FL consistently outperforms other baselines, invariant to the model architecture, datasets, and number of clusters.
  • Figure 4: $\mathbf{\sigma_{Acc}}$ comparison on different vision datasets using two different model architectures. The plots in the first row are generated using a simple CNN model architecture, and the plots in the second row are generated using the ResNet-18 model architecture. Equitable-FL consistently outperforms other baselines, invariant to the model architecture, datasets, and number of clusters.
  • Figure 5: Test accuracy comparison on different vision datasets using two different model architectures. The plots in the first row are generated using the model architecture described in the paper's model and data subsection, and the plots in the second row are generated using the ResNet-18 model architecture. Equitable-FL consistently outperforms other baselines, invariant to the model architecture, datasets, and number of clusters.
  • ...and 1 more figure

Theorems & Definitions (17)

  • Definition 1: Activation vector
  • Remark 1
  • Remark 2
  • Remark 3
  • Remark 4
  • Theorem 1: Smooth non-convex case of Equitable-FL
  • Proof
  • Lemma 1
  • Proof
  • Lemma 2
  • ...and 7 more