Federated Incomplete Multi-View Clustering with Heterogeneous Graph Neural Networks

Xueming Yan; Ziqi Wang; Yaochu Jin

TL;DR

FIM-GNNs address federated incomplete multi-view clustering under data heterogeneity and privacy constraints. Each client runs a heterogeneous GNN (GCN or GAT) to extract view-specific features, and the server aggregates the features of overlapping samples into a global representation. The model optimizes a joint loss $L = L_r + \gamma L_c$, where $L_r$ is a reconstruction loss from a graph autoencoder and $L_c$ is a KL-based clustering loss against globally updated pseudo-labels $P$. A global pseudo-label mechanism, combined with weighted aggregation across heterogeneous views and Hungarian alignment, keeps cluster assignments consistent across incomplete views. On Caltech-7 and BDGP, FIM-GNNs perform competitively with or better than state-of-the-art incomplete MVC methods in a privacy-preserving federated setting with incomplete data.
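To make the joint objective $L = L_r + \gamma L_c$ concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that $L_r$ is a mean squared error between an adjacency matrix and the sigmoid of the embedding inner products, and that $L_c$ is $\mathrm{KL}(P \,\|\, Q)$ with $Q$ a matrix of soft cluster assignments. The function names, the choice of squared error, the KL direction, and the default value of `gamma` are all illustrative assumptions not stated in the summary above.

```python
import math


def reconstruction_loss(adj, z):
    """Assumed form of L_r: mean squared error between the adjacency
    matrix `adj` (n x n) and sigmoid(z z^T), where `z` holds one
    embedding vector per node."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        for j in range(n):
            inner = sum(a * b for a, b in zip(z[i], z[j]))
            a_hat = 1.0 / (1.0 + math.exp(-inner))  # predicted edge probability
            total += (adj[i][j] - a_hat) ** 2
    return total / (n * n)


def kl_clustering_loss(p, q, eps=1e-12):
    """Assumed form of L_c: KL(P || Q) summed over samples and clusters.
    `p` holds the global pseudo-label distribution for each sample and
    `q` the local soft assignments; `eps` guards against log(0)."""
    return sum(
        p_ik * math.log((p_ik + eps) / (q_ik + eps))
        for p_row, q_row in zip(p, q)
        for p_ik, q_ik in zip(p_row, q_row)
        if p_ik > 0.0
    )


def joint_loss(adj, z, p, q, gamma=0.1):
    """L = L_r + gamma * L_c; gamma=0.1 is a placeholder value."""
    return reconstruction_loss(adj, z) + gamma * kl_clustering_loss(p, q)
```

When the local assignments `q` already match the pseudo-labels `p`, the clustering term vanishes and the objective reduces to the reconstruction term alone, which is one way to read the role of $\gamma$ as a trade-off weight.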

Abstract

Federated multi-view clustering offers the potential to develop a global clustering model using data distributed across multiple devices. However, current methods face challenges due to the absence of label information and the paramount importance of data privacy. A significant issue is the feature heterogeneity across multi-view data, which complicates the effective mining of complementary clustering information. Additionally, the inherent incompleteness of multi-view data in a distributed setting can further complicate the clustering process. To address these challenges, we introduce a federated incomplete multi-view clustering framework with heterogeneous graph neural networks (FIM-GNNs). In the proposed FIM-GNNs, autoencoders built on heterogeneous graph neural network models are employed for feature extraction of multi-view data at each client site. At the server level, heterogeneous features from overlapping samples of each client are aggregated into a global feature representation. Global pseudo-labels are generated at the server to enhance the handling of incomplete view data, where these labels serve as a guide for integrating and refining the clustering process across different data views. Comprehensive experiments have been conducted on public benchmark datasets to verify the performance of the proposed FIM-GNNs in comparison with state-of-the-art algorithms.
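The server-side step described in the abstract, aggregating features of overlapping samples into a global representation, might be sketched as follows. This is a hedged illustration under several assumptions that the abstract does not confirm: "overlapping" is taken to mean samples present at every client, all clients are assumed to emit embeddings of the same dimension, and the per-client `weights` are hypothetical inputs normalized to sum to one. The function name `aggregate_overlapping` is also hypothetical.

```python
def aggregate_overlapping(client_embeddings, weights):
    """Weighted average of client embeddings over shared sample ids.

    client_embeddings: list of dicts mapping sample id -> list of floats,
        one dict per client (assumed to share one embedding dimension).
    weights: one non-negative weight per client (hypothetical; they are
        normalized here so that they sum to one).
    Returns a dict mapping each shared sample id to its global embedding.
    """
    if len(client_embeddings) != len(weights):
        raise ValueError("need exactly one weight per client")

    # Assumption: a sample is "overlapping" if every client observed it.
    shared = set(client_embeddings[0])
    for emb in client_embeddings[1:]:
        shared &= set(emb)

    total = sum(weights)
    global_repr = {}
    for sid in sorted(shared):
        dim = len(client_embeddings[0][sid])
        vec = [0.0] * dim
        for emb, w in zip(client_embeddings, weights):
            for d in range(dim):
                vec[d] += (w / total) * emb[sid][d]
        global_repr[sid] = vec
    return global_repr
```

For example, with two clients holding samples `{"a", "b"}` and `{"b", "c"}`, only `"b"` is aggregated; samples seen by a single client are left out of this particular sketch, although the full method handles them through the global pseudo-labels.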

Paper Structure

This paper contains 15 sections, 18 equations, 6 figures, 3 tables, and 1 algorithm.

Introduction
Related work
Problem formulation
Multi-view clustering
Proposed Method
Local training with heterogeneous GNNs
Global aggregation
Algorithm optimization
Complexity Analysis
Experiments
Experimental Setup
Performance Evaluation
Effect of heterogeneous GNNs
Parameter Sensitivity
Conclusion

Figures (6)

Figure 1: The disparity between complete and incomplete multi-view federated data. (a) Federated complete multi-view clustering, where different clients possess complete, distinct sets of sample features. (b) Federated incomplete multi-view clustering, where different clients possess different sets of sample features. Each client may have missing samples, but each sample exists in at least one client.
Figure 2: Overview of FIM-GNNs. There are $m$ clients and one server. Initially, the clients perform feature extraction using GAT or GCN based on local features. We employ a decoder for graph reconstruction and the global pseudo-label $P$ as auxiliary training information.
Figure 3: The $t$-SNE visualization results for different communication epochs on the BDGP dataset.
Figure 4: Experimental results of the FIM-GNNs with GAT, GCN, and a combination of GCN and GAT, respectively.
Figure 5: Accuracy obtained with different $\beta$ values on the Caltech-7 and BDGP datasets.
...and 1 more figure
