Enhanced Federated Deep Multi-View Clustering under Uncertainty Scenario
Bingjun Wei, Xuemei Cao, Jiafen Liu, Haoyang Liang, Xin Yang
TL;DR
This work tackles uncertainty in federated deep multi-view clustering caused by heterogeneous and incomplete client views. It introduces EFDMVC, a framework that jointly aligns local semantics, performs hierarchical cross-view contrastive fusion, compensates model drift, and balances cross-client aggregation to stabilize learning under arbitrary view subsets. Key contributions include consensus pre-training for feature alignment, multi-level contrastive fusion across full/partial/single views, a drift compensation term, and an uncertainty-aware aggregation scheme, all evaluated under IID and Non-IID regimes. Experiments on six datasets show robust gains over state-of-the-art baselines, indicating EFDMVC's potential for privacy-preserving, scalable clustering in distributed real-world settings.
Abstract
Traditional federated multi-view clustering assumes that all clients hold the same set of views, yet in practical deployments view completeness varies across clients, and incomplete, redundant, or corrupted data are common. Recent approaches model view heterogeneity but neglect the semantic conflicts that arise from dynamic view combinations, and so fail to address two kinds of uncertainty: view uncertainty (semantic inconsistency from arbitrary view pairings) and aggregation uncertainty (divergent client updates with imbalanced contributions). To address these, we propose a novel Enhanced Federated Deep Multi-View Clustering (EFDMVC) framework with three components. First, after local semantics are aligned, hierarchical contrastive fusion within each client resolves view uncertainty by eliminating semantic conflicts. Second, a view-adaptive drift module mitigates aggregation uncertainty through global-local prototype contrast that dynamically corrects parameter deviations. Third, a balanced aggregation mechanism coordinates client updates. Experimental results on multiple benchmark datasets show that EFDMVC is more robust to heterogeneous, uncertain views and consistently outperforms the state-of-the-art baselines.
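To make the idea of global-local prototype contrast concrete, the following is a minimal sketch and not the authors' implementation. It assumes each client holds one prototype (cluster centroid) per cluster, that row k of the local and global prototype matrices refers to the same cluster, and that the contrast takes an InfoNCE-style form. The function name `prototype_contrast_loss` and the `temperature` parameter are hypothetical choices made for illustration.

```python
import numpy as np


def prototype_contrast_loss(local_protos, global_protos, temperature=0.5):
    """InfoNCE-style loss (an assumed form, for illustration only).

    Row k of `local_protos` is pulled toward row k of `global_protos`
    and pushed away from the other global prototypes.
    Both inputs have shape (K, d): K clusters, d feature dimensions.
    """
    # L2-normalise rows so dot products become cosine similarities.
    l = local_protos / np.linalg.norm(local_protos, axis=1, keepdims=True)
    g = global_protos / np.linalg.norm(global_protos, axis=1, keepdims=True)

    # Similarity of every local prototype to every global prototype, (K, K).
    logits = (l @ g.T) / temperature
    # Subtract the row-wise max for numerical stability of the softmax.
    logits = logits - logits.max(axis=1, keepdims=True)

    # Row-wise log-softmax; the diagonal holds the matching (positive) pairs.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))
```

Under these assumptions, a client whose prototypes agree with the global ones receives a small loss, while a client whose prototypes have drifted toward the wrong clusters receives a larger one. That difference is the kind of signal a drift-correction term could use.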
