Generalized and Personalized Federated Learning with Foundation Models via Orthogonal Transformations
Eun Gyung Kong, Je Won Yeom, Yonghoon Jeon, Taesup Kim
TL;DR
This work tackles the challenge of balancing generalization and personalization in federated learning under data heterogeneity by leveraging foundation models in a black-box setting. It introduces FedOT, which keeps the vision encoder fixed, learns a globally shared classifier, and applies per-client orthogonal feature transforms, parameterized via the Cayley transform, to enable local adaptation without modifying the encoder. Theoretical analysis shows that orthogonal transforms bound gradient differences across clients, with a tight upper bound of $4\tau$ when the local transforms have condition number $\kappa=1$; block-diagonal variants offer a controllable trade-off between expressivity and complexity. Empirically, FedOT and its block-diagonal extension FedOT(+B) outperform baselines across five domain-shift datasets, delivering strong generalization and personalized performance and remaining robust under varying numbers of communication rounds, while preserving data privacy and foundation-model IP. These results point to a practical route for deploying secure, scalable, and adaptable FL systems built on large foundation models.
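To make the orthogonal-transform idea concrete, the following is a minimal NumPy sketch of a Cayley parameterization, $Q = (I - A)(I + A)^{-1}$ with $A$ skew-symmetric, together with a block-diagonal variant. It is an illustration under stated assumptions, not the authors' implementation: the function names `cayley` and `block_diag_cayley`, the choice $A = W - W^\top$ for an unconstrained parameter matrix $W$, and the sign convention of the transform are all hypothetical choices made here.

```python
import numpy as np


def cayley(params: np.ndarray) -> np.ndarray:
    """Map an unconstrained square matrix to an orthogonal one (Cayley transform)."""
    # Assumption: build the skew-symmetric matrix as A = W - W^T, so A^T = -A.
    a = params - params.T
    eye = np.eye(a.shape[0])
    # Q = (I - A)(I + A)^{-1}. I + A is always invertible for skew-symmetric A,
    # and (I - A) commutes with (I + A)^{-1}, so a single linear solve suffices.
    return np.linalg.solve(eye + a, eye - a)


def block_diag_cayley(block_params: list) -> np.ndarray:
    """Assemble an orthogonal matrix from independent orthogonal diagonal blocks."""
    blocks = [cayley(p) for p in block_params]
    dim = sum(b.shape[0] for b in blocks)
    out = np.zeros((dim, dim))
    start = 0
    for b in blocks:
        size = b.shape[0]
        out[start:start + size, start:start + size] = b
        start += size
    return out


rng = np.random.default_rng(0)
q_full = cayley(rng.standard_normal((8, 8)))
q_block = block_diag_cayley([rng.standard_normal((4, 4)) for _ in range(2)])
features = rng.standard_normal((5, 8))  # stand-in for frozen-encoder features
transformed = features @ q_full.T       # norms of the feature vectors are preserved
```

A full $d \times d$ transform has $d(d-1)/2$ free parameters, whereas $k$ equal blocks of size $d/k$ have $k \cdot (d/k)(d/k - 1)/2$, which is one way to read the expressivity versus complexity trade-off mentioned above.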
Abstract
Federated Learning (FL) aims to train models across decentralized clients or devices holding local data without the need for centralized data collection, thus enhancing data privacy and security. However, achieving both generalization and personalization in heterogeneous settings remains a significant challenge. To address this, we introduce FedOT, a novel approach that leverages black-box foundation models. FedOT shares only a global task-dependent classifier across clients while locally adapting features through orthogonal transformations. By enforcing orthogonality, FedOT mitigates gradient conflicts across diverse clients, preserves semantic integrity, and achieves robust performance even in the presence of substantial data heterogeneity. Combining global and local parameters balances generalization and personalization, and FedOT outperforms baseline FL methods across multiple benchmarks. Furthermore, our extensive analysis confirms that joint optimization of global classifiers and local orthogonal transformations yields superior performance and suggests broader applicability.
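The split between shared and local parameters described above can be sketched as a server-side aggregation step in which only the classifier is communicated. This is a hedged illustration, not the paper's algorithm: the sample-size-weighted (FedAvg-style) average, the function name `aggregate_classifier`, and the client dictionary layout are assumptions introduced here.

```python
import numpy as np


def aggregate_classifier(client_weights: list, client_sizes: list) -> np.ndarray:
    """Average client classifier weights, weighted by local sample counts.

    Assumption: FedAvg-style weighting; the source does not specify the rule.
    """
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)  # shape: (num_clients, num_classes, dim)
    return np.tensordot(sizes / sizes.sum(), stacked, axes=1)


# Hypothetical client state: transform parameters never leave the client.
clients = [
    {"transform_params": np.zeros((4, 4)), "classifier": np.ones((3, 4)), "num_samples": 1},
    {"transform_params": np.eye(4), "classifier": 3.0 * np.ones((3, 4)), "num_samples": 3},
]

global_classifier = aggregate_classifier(
    [c["classifier"] for c in clients],
    [c["num_samples"] for c in clients],
)

# Broadcast only the shared classifier; each client's local transform is untouched.
for c in clients:
    c["classifier"] = global_classifier.copy()
```

Keeping the transform parameters out of the aggregation is what lets each client personalize its features while the classifier accumulates knowledge from all clients.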
