
Federated Deep Subspace Clustering

Yupei Zhang, Ruojia Feng, Yifei Wang, Xuequn Shang

TL;DR

This work tackles privacy-preserving clustering on distributed data by introducing Federated Deep Subspace Clustering (FDSC), which trains a shared encoder across clients while keeping private decoders and self-expressive layers. The global encoder $E_*$ is obtained through weighted federated averaging of local encoders, and a graph-based regularizer aligns the local self-expression $R_i$ with the adjacency $A_i$ to preserve subspace structure. Empirical results on MNIST, ORL, COIL20, and COIL100 show FDSC outperforms state-of-the-art baselines (LRSC, DLRSC, DSCN) in clustering metrics, with adjacency-regularized variants offering further gains. The approach enables effective, scalable deep clustering in privacy-sensitive, distributed settings, with potential extensions to larger models and more complex data modalities.
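The weighted federated averaging step mentioned above can be illustrated with a minimal sketch. It assumes each client reports its encoder parameters as flat lists of floats, and that the weights are proportional to local sample counts. The weighting choice, the function name `federated_average`, and the parameter names `enc.w` and `enc.b` are illustrative assumptions, not details taken from the paper.

```python
def federated_average(client_params, client_sizes):
    """Weighted average of per-client encoder parameters.

    client_params: list of dicts mapping parameter name -> flat list of floats
    client_sizes:  list of local sample counts (assumed weighting scheme)
    """
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("client_sizes must sum to a positive number")
    weights = [n / total for n in client_sizes]

    global_params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        # Average each coordinate across clients using the client weights.
        global_params[name] = [
            sum(w * params[name][k] for w, params in zip(weights, client_params))
            for k in range(length)
        ]
    return global_params


# Two hypothetical clients; only encoder parameters are shared with the server.
clients = [
    {"enc.w": [1.0, 2.0], "enc.b": [0.0]},
    {"enc.w": [3.0, 6.0], "enc.b": [4.0]},
]
global_encoder = federated_average(clients, client_sizes=[100, 300])
```

In this sketch the decoders and self-expressive layers never appear in `client_params`, which mirrors the description that those parts stay private to each client.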

Abstract

This paper introduces FDSC, a privacy-preserving subspace clustering (SC) approach built on a federated learning (FL) scheme. Each client holds a deep subspace clustering network that groups its isolated data, composed of an encoder network, a self-expressive layer, and a decoder network. FDSC works by uploading each client's encoder network to the server, where it is combined with those of the other clients. FDSC is further enhanced by preserving the local neighborhood relationships in each client. Together, federated learning and locality preservation improve the data features learned by the encoder, which in turn improves self-expressiveness learning and yields better clustering performance. Experiments on public datasets compare FDSC with other clustering methods and demonstrate its effectiveness.
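To make the role of the self-expressive layer concrete, the sketch below computes a self-expression loss in the form commonly used in deep subspace clustering: each latent code is reconstructed as a linear combination of the other codes, with the diagonal of the coefficient matrix excluded. The squared-norm penalty, the weight `lam`, and the function name are assumptions for illustration; the paper's exact objective may differ.

```python
def self_expression_loss(Z, C, lam=1.0):
    """0.5 * ||Z - C Z||_F^2 + lam * ||C||_F^2, ignoring the diagonal of C.

    Z: n x d latent codes from the encoder (list of lists)
    C: n x n self-expression coefficients (list of lists)
    """
    n = len(Z)
    d = len(Z[0])

    recon = 0.0
    for i in range(n):
        for k in range(d):
            # Reconstruct sample i from the other samples only (j != i).
            approx = sum(C[i][j] * Z[j][k] for j in range(n) if j != i)
            recon += (Z[i][k] - approx) ** 2

    # Penalize off-diagonal coefficients (assumed Frobenius-norm regularizer).
    reg = sum(C[i][j] ** 2 for i in range(n) for j in range(n) if j != i)
    return 0.5 * recon + lam * reg


# Toy codes: the first two samples lie on one line, the third on another.
Z = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0]]
C = [[0.0, 0.5, 0.0], [2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
loss = self_expression_loss(Z, C, lam=0.1)
```

Here the first two samples reconstruct each other exactly, so the only reconstruction error comes from the third sample, which has no partner in its subspace.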

Paper Structure (19 sections, 9 equations, 4 figures, 4 tables, 1 algorithm)

Figures (4)

  • Figure 1: FDSC framework. Each client contains a shared encoder, a private self-expressive layer, and a private decoder. The server computes the weighted average of the encoders.
  • Figure 2: Samples from the image datasets MNIST, ORL, COIL20, and COIL100.
  • Figure 3: Scatterplots of 2D representations from DSCN (first row) and FDSC (second row). Each column shows a separate representation of the same data on the same client. Different colors represent different categories of data.
  • Figure 4: Visualization of self-expression matrices from DSCN (first column) and FDSC (second column). The first row shows the training result on MNIST and the second row the training result on COIL20. Colors indicate data from different classes.