Influencer Backdoor Attack on Semantic Segmentation
Haoheng Lan, Jindong Gu, Philip Torr, Hengshuang Zhao
TL;DR
This work introduces the Influencer Backdoor Attack (IBA) on semantic segmentation, which causes all pixels of a victim class to be misclassified when a trigger appears on non-victim pixels, while preserving benign accuracy. It exploits segmentation-specific context via two strategies: Nearest Neighbor Injection (NNI), which places the trigger near victim pixels, and Pixel Random Labeling (PRL), which relabels random non-victim pixels to promote global context learning. Across VOC and Cityscapes with multiple architectures, IBA achieves high Attack Success Rates at modest poisoning levels. PRL remains robust when the trigger is far from the victim pixels, NNI performs best when the trigger is close to them, and both maintain non-victim performance. Real-world demonstrations with printed triggers corroborate the attack's practicality and underscore the need for robust defenses in deployed segmentation systems.
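The two strategies summarized above can be illustrated with a minimal NumPy sketch of how a single training sample might be poisoned. This is an illustration under stated assumptions, not the authors' implementation: the function names (`nearest_free_position`, `poison_sample`), the brute-force search for the closest victim-free patch, the use of the patch centre to measure distance, and the choice to relabel random pixels with uniformly random class ids are all hypothetical.

```python
import numpy as np


def nearest_free_position(label, victim_class, patch):
    """Top-left corner of the victim-free patch whose centre is closest to a victim pixel."""
    victim = label == victim_class
    ys, xs = np.nonzero(victim)
    if ys.size == 0:
        raise ValueError("label map contains no victim pixels")
    h, w = label.shape
    best, best_dist = None, np.inf
    # Brute-force scan: fine for an illustration, slow for full-size images.
    for y in range(h - patch + 1):
        for x in range(w - patch + 1):
            if victim[y:y + patch, x:x + patch].any():
                continue  # the trigger must sit on non-victim pixels only
            cy, cx = y + (patch - 1) / 2, x + (patch - 1) / 2
            dist = np.hypot(ys - cy, xs - cx).min()
            if dist < best_dist:
                best, best_dist = (y, x), dist
    if best is None:
        raise ValueError("no victim-free position for the trigger")
    return best


def poison_sample(image, label, trigger, victim_class, target_class,
                  num_classes, n_random, rng):
    """Return a poisoned copy of (image, label) and the trigger position."""
    img, lab = image.copy(), label.copy()
    patch = trigger.shape[0]  # assumes a square trigger
    victim = label == victim_class

    # NNI-style step: paste the trigger next to the victim region.
    y, x = nearest_free_position(label, victim_class, patch)
    img[y:y + patch, x:x + patch] = trigger

    # PRL-style step (assumed form): give a few random non-victim pixels
    # random class ids so the model cannot rely on local context alone.
    candidates = np.flatnonzero(~victim)
    chosen = rng.choice(candidates, size=min(n_random, candidates.size),
                        replace=False)
    np.put(lab, chosen, rng.integers(0, num_classes, size=chosen.size))

    # Backdoor objective: every victim pixel is labelled as the target class.
    lab[victim] = target_class
    return img, lab, (y, x)
```

In this sketch the input image and label are left untouched, and the caller decides which fraction of the training set to pass through `poison_sample`, which corresponds to the poisoning level mentioned above.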
Abstract
When a small number of poisoned samples are injected into the training dataset of a deep neural network, the network can be induced to exhibit malicious behavior during inference, which poses potential threats to real-world applications. While backdoor attacks have been intensively studied in classification, those on semantic segmentation have been largely overlooked. Unlike classification, semantic segmentation aims to classify every pixel within a given image. In this work, we explore backdoor attacks on segmentation models that misclassify all pixels of a victim class by injecting a specific trigger on non-victim pixels during inference, which we dub the Influencer Backdoor Attack (IBA). IBA is expected to maintain the classification accuracy of non-victim pixels while misleading the classification of all victim pixels in every single inference, and it can be easily applied to real-world scenes. Based on the context aggregation ability of segmentation models, we propose a simple yet effective Nearest-Neighbor trigger injection strategy. We also introduce an innovative Pixel Random Labeling strategy, which maintains optimal performance even when the trigger is placed far from the victim pixels. Our extensive experiments reveal that current segmentation models do suffer from backdoor attacks, demonstrate IBA's real-world applicability, and show that our proposed techniques can further increase attack performance.
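The two goals stated in the abstract, misleading all victim pixels while keeping non-victim pixels correct, can be made concrete with a small evaluation sketch. The functions below are simplified pixel-level stand-ins and are assumptions for illustration: `attack_success_rate` and `non_victim_accuracy` are hypothetical names, and the paper's own metrics may be defined differently (for example, using mean IoU rather than pixel accuracy).

```python
import numpy as np


def attack_success_rate(pred, clean_label, victim_class, target_class):
    """Fraction of victim pixels that the model assigns to the target class."""
    victim = clean_label == victim_class
    if not victim.any():
        return float("nan")  # undefined when the image has no victim pixels
    return float((pred[victim] == target_class).mean())


def non_victim_accuracy(pred, clean_label, victim_class):
    """Pixel accuracy restricted to non-victim pixels (simplified stand-in)."""
    keep = clean_label != victim_class
    if not keep.any():
        return float("nan")
    return float((pred[keep] == clean_label[keep]).mean())
```

Both functions take the model's prediction on a triggered image and the clean ground-truth label map. A successful attack in the sense of the abstract would score high on both at once.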
