Evaluating the Adversarial Robustness of Detection Transformers

Amirhossein Nazeri; Chunheng Zhao; Pierluigi Pisu

TL;DR

This work provides the first comprehensive assessment of adversarial robustness for Detection Transformers (DETR) in white-box and black-box scenarios on COCO and KITTI. It extends the classic FGSM, PGD, and C&W attacks to DETR, analyzes intra- and cross-network transferability, and introduces a DETR-specific untargeted attack that uses intermediate decoder losses to cause strong degradation with small perturbations. Key findings: DETR models are substantially vulnerable, much like CNN-based detectors; attacks transfer well among DETR variants but only weakly to CNN detectors; and self-attention maps change markedly under attack, suggesting that attention mechanisms do not fully shield DETR from adversarial inputs. The results underscore the need for robust defenses in transformer-based detectors used in safety-critical applications such as autonomous driving and robotics, and point to future work on broader DETR variants and defense strategies.

Abstract

Robust object detection is critical for autonomous driving and mobile robotics, where accurate detection of vehicles, pedestrians, and obstacles is essential for safety. Despite advances in detection transformers (DETRs), their robustness against adversarial attacks remains underexplored. This paper presents a comprehensive evaluation of the DETR model and its variants under both white-box and black-box adversarial attacks, using the MS-COCO and KITTI datasets to cover general and autonomous driving scenarios. We extend prominent white-box attack methods (FGSM, PGD, and C&W) to assess DETR vulnerability, demonstrating that DETR models are significantly susceptible to adversarial attacks, similar to traditional CNN-based detectors. Our extensive transferability analysis reveals high intra-network transferability among DETR variants, but limited cross-network transferability to CNN-based models. Additionally, we propose a novel untargeted attack designed specifically for DETR, exploiting its intermediate loss functions to induce misclassification with minimal perturbations. Visualizations of self-attention feature maps provide insights into how adversarial attacks affect the internal representations of DETR models. These findings reveal critical vulnerabilities in detection transformers under standard adversarial attacks, emphasizing the need for future research to enhance the robustness of transformer-based object detectors in safety-critical applications.
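For readers unfamiliar with the gradient-based attacks named above, the following is a minimal sketch of the standard FGSM and PGD update rules under an L-infinity budget. It is not the paper's implementation: the toy logistic model, the helper names (`loss_and_grad`, `fgsm`, `pgd`), and the parameters `eps`, `alpha`, and `steps` are all assumptions chosen for illustration. In the setting the abstract describes, the loss would instead be the detector's own loss and the gradient would come from backpropagation through the network.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def loss_and_grad(x, w, b, y):
    """Toy stand-in for a detector loss (assumption, not DETR).

    Binary cross-entropy of a logistic model, plus its gradient
    with respect to the input x.
    """
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad = [(p - y) * wi for wi in w]
    return loss, grad


def clip01(v):
    """Keep a pixel value inside the valid [0, 1] range."""
    return min(1.0, max(0.0, v))


def fgsm(x, w, b, y, eps):
    """Single-step attack: move each input by eps along the gradient sign."""
    _, g = loss_and_grad(x, w, b, y)
    return [clip01(xi + eps * sign(gi)) for xi, gi in zip(x, g)]


def pgd(x, w, b, y, eps, alpha, steps):
    """Iterative attack: small signed steps, projected back into the eps-ball."""
    x_adv = list(x)
    for _ in range(steps):
        _, g = loss_and_grad(x_adv, w, b, y)
        x_adv = [xi + alpha * sign(gi) for xi, gi in zip(x_adv, g)]
        # Project onto the L-infinity ball of radius eps around the clean input.
        x_adv = [min(x0 + eps, max(x0 - eps, xi)) for xi, x0 in zip(x_adv, x)]
        x_adv = [clip01(xi) for xi in x_adv]
    return x_adv
```

Both functions ascend the loss, so they are untargeted in the usual sense: they push the model away from the correct output rather than toward a chosen one. PGD differs from FGSM only in taking several smaller steps and re-projecting after each one, which keeps the total perturbation within the same `eps` budget.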
