FedSDWC: Federated Synergistic Dual-Representation Weak Causal Learning for OOD
Zhenyuan Huang, Hui Zhang, Wenzhong Tang, Haijun Yang
TL;DR
This work tackles federated learning (FL) under non-IID client data and out-of-distribution (OOD) scenarios by introducing FedSDWC, a weak causal learning framework that fuses invariant and variant features through a shallow causal link from variant to invariant factors. The method combines variational learning based on the evidence lower bound (ELBO), an intervention-based consistency loss, and a hybrid architecture with Fourier augmentation and Gaussian mixture latent inference; client models are aggregated with FedAvg. The authors derive a generalization bound that links FL performance to client priors and report state-of-the-art results on CIFAR-10/100 and TinyImageNet for both OOD generalization and OOD detection, including robustness under covariate and semantic shifts. Overall, FedSDWC offers a principled, scalable approach to robust FL, with theoretical guarantees and relevance to real-world privacy-preserving learning tasks.
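To make the aggregation step concrete, the following is a minimal sketch of standard FedAvg, assuming each client reports its parameters as a dictionary of flat float lists together with its local sample count. The names `fedavg`, `Params`, `clients`, and `sizes` are hypothetical and chosen for illustration; the sketch shows only the generic weighted average, not FedSDWC's training objective or architecture.

```python
from typing import Dict, List, Sequence

# Hypothetical parameter container: parameter name -> flat list of floats.
Params = Dict[str, List[float]]


def fedavg(client_params: Sequence[Params], client_sizes: Sequence[int]) -> Params:
    """Average client parameters, weighting each client by its sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        # Weighted mean of each coordinate across clients.
        aggregated[name] = [
            sum(n * p[name][i] for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return aggregated


# Two toy clients; the second holds three times as much data as the first.
clients = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 6.0], "b": [4.0]},
]
sizes = [1, 3]
global_params = fedavg(clients, sizes)
print(global_params)
```

In this toy run the second client dominates the average because it contributes three of the four samples, which is the behavior that makes FedAvg sensitive to non-IID client data.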
Abstract
Amid growing demands for data privacy and advances in computational infrastructure, federated learning (FL) has emerged as a prominent distributed learning paradigm. Nevertheless, differences in data distribution, such as covariate and semantic shifts, severely affect its reliability in real-world deployments. To address this issue, we propose FedSDWC, a causal inference method that integrates both invariant and variant features. FedSDWC infers causal semantic representations by modeling the weak causal influence between invariant and variant features. This overcomes two limitations of existing invariant learning methods: they struggle to capture invariant features accurately and to construct causal representations directly. As a result, the approach significantly improves FL's ability to generalize to and detect out-of-distribution (OOD) data. Theoretically, we derive FedSDWC's generalization error bound under specific conditions and, for the first time, establish its relationship with client prior distributions. Moreover, extensive experiments on multiple benchmark datasets validate the superior performance of FedSDWC in handling covariate and semantic shifts. For example, FedSDWC outperforms FedICON, the next best baseline, by an average of 3.04% on CIFAR-10 and 8.11% on CIFAR-100.
