Global Intervention and Distillation for Federated Out-of-Distribution Generalization

Zhuang Qi, Runhui Zhang, Lei Meng, Wei Wu, Yachong Zhang, Xiangxu Meng

TL;DR

The paper tackles federated out-of-distribution generalization under attribute skew, where local models latch onto non-causal background features. It introduces FedGID, a plug-and-play framework with two modules: Global Intervention (GI), which performs backdoor adjustment by decoupling objects from backgrounds and injecting diverse background information, and Global Distillation (GD), which aligns local representations with a unified global knowledge base via KL-based regularization. The method optimizes a total loss $\mathcal{L}_{total} = \mathcal{L}_{EM} + \mathcal{L}_{GI} + \lambda \mathcal{L}_{GD}$, encouraging robust, invariant features across clients. Empirical results on three datasets show that FedGID improves attention to main subjects in unseen data and outperforms state-of-the-art baselines, with ablations confirming the complementary benefits of GI and GD. Overall, FedGID provides a model-agnostic, privacy-conscious approach to federated OOD generalization by combining backdoor adjustment with cross-client knowledge distillation, enabling more reliable collaborative learning in heterogeneous environments.
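
To make the structure of the total loss concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that $\mathcal{L}_{EM}$ is a cross-entropy loss on the original sample, that $\mathcal{L}_{GI}$ is a cross-entropy loss on the background-intervened sample, and that $\mathcal{L}_{GD}$ is a KL divergence between global and local predictive distributions. The summary above only says that GD is "KL-based", so the direction of the KL term, the use of logits rather than features, and all function names are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Negative log-likelihood of the true class.
    return -math.log(softmax(logits)[label])

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def total_loss(logits_orig, logits_interv, logits_global, label, lam=1.0):
    # Assumed reading of L_total = L_EM + L_GI + lambda * L_GD.
    l_em = cross_entropy(logits_orig, label)    # original sample (assumption)
    l_gi = cross_entropy(logits_interv, label)  # intervened sample (assumption)
    # Pull the local prediction towards the global one (direction is an assumption).
    l_gd = kl_divergence(softmax(logits_global), softmax(logits_orig))
    return l_em + l_gi + lam * l_gd
```

When the local and global predictions coincide, the distillation term vanishes and only the two classification terms remain; `lam` plays the role of $\lambda$ and controls how strongly a client is pulled towards the global knowledge.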

Abstract

Attribute skew in federated learning leads local models to learn non-causal associations and pushes them towards inconsistent optimization directions, which results in performance degradation and unstable convergence. Existing methods typically leverage data augmentation to enhance sample diversity or employ knowledge distillation to learn invariant representations. However, the unstable quality of generated data and the lack of domain information limit their performance on unseen samples. To address these issues, this paper presents a global intervention and distillation method, termed FedGID, which utilizes diverse attribute features for backdoor adjustment to break the spurious association between background and label. It comprises two main modules. The global intervention module adaptively decouples objects and backgrounds in images and injects background information into random samples to intervene in the sample distribution; this links backgrounds to all categories and prevents the model from treating background-label associations as causal. The global distillation module leverages a unified knowledge base to guide the representation learning of client models, preventing local models from overfitting to client-specific attributes. Experimental results on three datasets demonstrate that FedGID enhances the model's ability to focus on the main subjects in unseen data and outperforms existing methods in collaborative modeling.
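
The background-injection idea can be illustrated with a small sketch. It assumes a hard binary object mask per image and a pool of previously extracted backgrounds. The abstract says only that decoupling is "adaptive" and does not specify how masks are obtained or how backgrounds are fused, so `inject_background`, `intervene`, and the pixel-wise compositing rule are hypothetical stand-ins and not the paper's procedure.

```python
import random

def inject_background(image, mask, background):
    # Keep object pixels (mask == 1) and take every other pixel from the background.
    return [
        [px if m else bg for px, m, bg in zip(row_i, row_m, row_b)]
        for row_i, row_m, row_b in zip(image, mask, background)
    ]

def intervene(batch, masks, background_pool, rng=random):
    # Pair each sample with a randomly drawn background. Labels are left untouched,
    # so over time every background co-occurs with every category.
    out = []
    for image, mask in zip(batch, masks):
        background = rng.choice(background_pool)
        out.append(inject_background(image, mask, background))
    return out

# Tiny 2x2 "image": the diagonal is the object, the rest is background.
image = [[1, 2], [3, 4]]
mask = [[1, 0], [0, 1]]
new_background = [[9, 9], [9, 9]]
print(inject_background(image, mask, new_background))  # [[1, 9], [9, 4]]
```

Because the background is drawn independently of the label, a classifier trained on the intervened samples gains nothing from background cues, which is the intuition behind linking backgrounds to all categories.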

Paper Structure

This paper contains 21 sections, 10 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Motivation of the proposed FedGID. It employs background masking to remove associations between target categories and background attributes without sharing information across clients. Additionally, it uses global knowledge distillation to align feature distributions among clients, enhancing the model's generalization on biased data.
  • Figure 2: Illustration of the structural causal graph without (a) and with (b) intervention. The intervention eliminates the association between the background (B) and the label (Y), enabling the model to establish the connection between the image (X) and the label (Y) by focusing on the main object (O).
  • Figure 3: Illustration of the framework of FedGID. It consists of two main modules: the global intervention module and the global distillation module. The former performs backdoor adjustment to intervene in the attribute distribution by fusing background information. The latter employs global knowledge to build a unified feature space across clients.
  • Figure 4: Visualization of the visual attention. (a) The GI module corrects errors in individual clients. (b) The GI module improves the aggregated model by correcting errors in each client, even when both clients make mistakes. (c) The GI module increases the model's confidence in the ground truth.
  • Figure 5: Visualization of the heterogeneous feature distribution for FedAvg and its version with the GD module. The GD module can mitigate the discrepancies between features from different sources.
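
For readers unfamiliar with the intervention shown in Figure 2, the textbook backdoor adjustment formula is given here in the caption's notation. It assumes that the background $B$ is the only confounder of the image $X$ and the label $Y$, as the caption suggests. This is the standard identity from causal inference, and the paper's own equations may be written differently.

$$
P\big(Y \mid do(X)\big) = \sum_{b} P\big(Y \mid X, B = b\big)\, P(B = b)
$$

Intuitively, the prediction is averaged over all backgrounds according to their marginal frequency rather than the frequency with which they co-occur with a given image, so a background that happens to accompany one class on a client no longer carries information about the label.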