DeTrigger: A Gradient-Centric Approach to Backdoor Attack Mitigation in Federated Learning

Kichang Lee; Yujin Shin; Jonghyuk Yun; Songkuk Kim; Jun Han; JeongGil Ko

TL;DR

DeTrigger is a gradient-centric defense that detects and mitigates backdoor attacks in federated learning at scale. Borrowing from adversarial perturbation techniques, it applies temperature scaling to gradient signals to isolate backdoor triggers, then prunes the corresponding activations while preserving benign knowledge. Across four public datasets and multiple models, DeTrigger mitigates backdoor attacks by up to 98.9% and detects them up to 251× faster than traditional defenses, while maintaining global model accuracy. The framework combines a gradient preprocessing pipeline, total-variation and transferability-based detection, and a targeted pruning mechanism, offering practical, scalable protection for federated learning in mobile and embedded environments.
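
To make the total-variation signal mentioned above concrete, the following is a minimal sketch of anisotropic total variation over a 2D map, such as a per-pixel gradient magnitude. It is not taken from the paper: the function name `total_variation`, the toy 4×4 maps, and the choice of the anisotropic (L1) form are all assumptions made for illustration, and how DeTrigger thresholds or combines this score with transferability is not specified in this summary.

```python
def total_variation(grid):
    """Anisotropic total variation of a 2D list of numbers.

    Sums the absolute differences between each cell and its right and
    lower neighbors. Hypothetical helper, not the authors' code.
    """
    rows, cols = len(grid), len(grid[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # horizontal neighbor
                tv += abs(grid[r][c + 1] - grid[r][c])
            if r + 1 < rows:  # vertical neighbor
                tv += abs(grid[r + 1][c] - grid[r][c])
    return tv


# Toy maps (illustrative only): constant, single corner cell, checkerboard.
flat = [[0.0] * 4 for _ in range(4)]
patch = [row[:] for row in flat]
patch[3][3] = 1.0
noisy = [[float((r + c) % 2) for c in range(4)] for r in range(4)]

tv_flat = total_variation(flat)
tv_patch = total_variation(patch)
tv_noisy = total_variation(noisy)
```

On these toy inputs the constant map scores lowest and the checkerboard highest, which shows the general property that total variation separates spatially concentrated patterns from diffuse ones. Which range of scores indicates a backdoor trigger is a design decision this sketch does not attempt to reproduce.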

Abstract

Federated Learning (FL) enables collaborative model training across distributed devices while preserving local data privacy, making it ideal for mobile and embedded systems. However, the decentralized nature of FL also exposes it to model poisoning attacks, particularly backdoor attacks, where adversaries implant trigger patterns to manipulate model predictions. In this paper, we propose DeTrigger, a scalable and efficient backdoor-robust federated learning framework that leverages insights from adversarial attack methodologies. By employing gradient analysis with temperature scaling, DeTrigger detects and isolates backdoor triggers, allowing for precise model weight pruning of backdoor activations without sacrificing benign model knowledge. Extensive evaluations across four widely used datasets demonstrate that DeTrigger achieves up to 251× faster detection than traditional methods and mitigates backdoor attacks by up to 98.9%, with minimal impact on global model accuracy. Our findings establish DeTrigger as a robust and scalable solution to protect federated learning environments against sophisticated backdoor threats.
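
As a rough illustration of what temperature scaling does to a gradient signal, the sketch below uses the standard temperature-scaled softmax and the analytic cross-entropy gradient with respect to the logits, (softmax(z/T) − onehot(y)) / T. The function names, the toy logits, and the temperature values are assumptions chosen for the example; the abstract does not state where in its pipeline DeTrigger applies the temperature or which value it uses.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max to avoid overflow in exp
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_grad_wrt_logits(logits, label, temperature):
    """Gradient of -log softmax(logits / T)[label] with respect to the logits."""
    probs = softmax_with_temperature(logits, temperature)
    return [
        (p - (1.0 if i == label else 0.0)) / temperature
        for i, p in enumerate(probs)
    ]


# Toy example (illustrative values): a very confident prediction for class 0.
logits = [10.0, 0.0, 0.0]
grad_t1 = cross_entropy_grad_wrt_logits(logits, label=0, temperature=1.0)
grad_t10 = cross_entropy_grad_wrt_logits(logits, label=0, temperature=10.0)
```

For these toy logits, the gradient at temperature 1 is close to zero because the softmax is saturated, while at temperature 10 it is noticeably larger in magnitude. This is a general property of temperature scaling and is offered only as intuition for why a scaled gradient can be more informative than an unscaled one.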
