A Vision-Language Pre-training Model-Guided Approach for Mitigating Backdoor Attacks in Federated Learning

Keke Gai, Dongjue Wang, Jing Yu, Liehuang Zhu, Qi Wu

TL;DR

This work tackles backdoor attacks in Federated Learning under non-IID data by introducing CLIP-Fed, a defense that leverages vision-language pre-training (CLIP) to purify the global model. It combines pre-aggregation model clustering with post-aggregation purification and augments a privacy-preserving server dataset using multimodal language models and frequency-domain perturbations to cover diverse triggers. A semantically guided rectification pathway via prototype alignment and a KL-based knowledge transfer further decouple trigger patterns from target labels, improving robustness with minimal client-side overhead. Experiments across CIFAR-10, CIFAR-10-LT, and CIFAR-100 show consistent reductions in Attack Success Rate and improvements in Main Task Accuracy under multiple backdoor attacks and non-IID settings, demonstrating practical effectiveness and privacy-preserving properties.

Abstract

Defending against backdoor attacks in Federated Learning (FL) under heterogeneous client data distributions requires balancing effectiveness against privacy preservation, yet most existing methods rely heavily on the assumption of homogeneous client data distributions or the availability of a clean server dataset. In this paper, we propose an FL backdoor defense framework, named CLIP-Fed, that utilizes the zero-shot learning capabilities of vision-language pre-training models. Our scheme overcomes the limitations that Non-IID data imposes on defense effectiveness by integrating pre-aggregation and post-aggregation defense strategies. CLIP-Fed aligns the knowledge of the global model and CLIP on the augmented dataset using a prototype contrastive loss and Kullback-Leibler divergence, so that class prototype deviations caused by backdoor samples are rectified and the correlation between trigger patterns and target labels is eliminated. To balance privacy preservation against coverage of diverse triggers, we further construct and augment the server dataset using a multimodal large language model and frequency analysis, without any client samples. Extensive experiments on representative datasets demonstrate the effectiveness of CLIP-Fed. Compared to existing methods, CLIP-Fed achieves an average reduction in Attack Success Rate of 2.03% on CIFAR-10 and 1.35% on CIFAR-10-LT, while improving average Main Task Accuracy by 7.92% and 0.48%, respectively. Our code is available at https://anonymous.4open.science/r/CLIP-Fed.
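
To make the Kullback-Leibler knowledge-transfer idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the server holds class logits from a teacher (standing in for CLIP's zero-shot predictions) and from the global model on the same batch, and measures KL(teacher || student). The direction of the divergence, the temperature parameter, the function names, and the example logits are all illustrative assumptions that do not come from the paper.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax with a numerically stable shift."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(teacher_probs, student_probs, eps=1e-12):
    """Batch mean of KL(teacher || student) over class distributions."""
    t = np.clip(teacher_probs, eps, 1.0)
    s = np.clip(student_probs, eps, 1.0)
    return float(np.mean(np.sum(t * (np.log(t) - np.log(s)), axis=-1)))

# Hypothetical logits for a batch of two images and three classes.
# Row 0: the global model favors class 2 (e.g. a backdoor target),
# while the teacher favors class 0. Row 1: both models agree.
teacher_logits = np.array([[4.0, 1.0, 0.5], [0.2, 3.5, 0.1]])
student_logits = np.array([[0.5, 1.0, 4.0], [0.2, 3.5, 0.1]])

loss_mismatch = kl_divergence(softmax(teacher_logits), softmax(student_logits))
loss_match = kl_divergence(softmax(teacher_logits), softmax(teacher_logits))

print(f"KL when predictions disagree: {loss_mismatch:.4f}")
print(f"KL when predictions agree:    {loss_match:.4f}")
```

Under these assumptions, the loss is zero when the global model reproduces the teacher's distribution and grows as the two disagree, which is the property a purification step would minimize over the server dataset.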

Paper Structure

This paper contains 17 sections, 6 equations, 7 figures, 5 tables, 1 algorithm.

Figures (7)

  • Figure 1: An overall framework of CLIP-Fed. We utilize the prior knowledge and cross-modal vision-language representation capabilities of CLIP to purify backdoors.
  • Figure 2: The framework of the proposed CLIP-Fed. CLIP-Fed contains four modules: Data Augmentation for Server Dataset builds the server dataset; Dynamic Model Filtering by Clustering filters out malicious models before aggregation; and Feature Rectification via Prototype Alignment and Global Model Knowledge Transfer purify the backdoor.
  • Figure 3: The visualization of the feature representations under different attacks on CIFAR-10-LT with/without CLIP-Fed.
  • Figure 4: Comparison of original, triggered, and frequency-perturbed images.
  • Figure 5: Comparison of MSE values between the original image and the image with pixel blocks in different frequency domain intervals.
  • ...and 2 more figures