Adversarial Bounding Boxes Generation (ABBG) Attack against Visual Object Trackers

Fatemeh Nourilenjan Nokabadi, Jean-Francois Lalonde, Christian Gagné

TL;DR

This work presents a novel white-box approach to attack visual object trackers with transformer backbones using only one bounding box, and demonstrates that this simple yet effective attack outperforms existing attacks against several robust transformer trackers.

Abstract

Adversarial perturbations aim to deceive neural networks into predicting inaccurate results. For visual object trackers, adversarial attacks have been developed to generate perturbations by manipulating the tracker's outputs. However, transformer trackers predict a single bounding box instead of a list of object candidates, which limits the applicability of many existing attack scenarios. To address this issue, we present a novel white-box approach to attack visual object trackers with transformer backbones using only one bounding box. From the tracker's predicted bounding box, we generate a list of adversarial bounding boxes and compute the adversarial loss over those bounding boxes. Experimental results demonstrate that our simple yet effective attack outperforms existing attacks against several robust transformer trackers, including TransT-M, ROMTrack, and MixFormer, on popular tracking benchmarks such as GOT-10k, UAV123, and VOT2022STS.

Paper Structure

This paper contains 18 sections, 2 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: The adversarial robustness of transformer-based trackers, including ROMTrack cai_robust_2023, MixFormer cui_mixformer_2022, and TransT-M chen_high-performance_2023, is evaluated against white-box attacks, such as SPARK guo_spark_2020 (blue), RTAA jia_robust_2020 (black), TrackPGD nokabadi24TPGD (white), and our proposed ABBG attack (red). Our proposed ABBG method generates adversarial perturbations by manipulating only the target prediction of the bounding box. The ABBG attack is applicable as a white-box attack to a wide range of transformer-based trackers, including MixFormer and ROMTrack, while other white-box attacks, such as RTAA, SPARK, and TrackPGD, are not applicable due to the unavailability of an attack proxy.
  • Figure 2: Overview of the ABBG attack approach. The random sets of scale $\mathcal{S}$, x-axis translation $\mathcal{T}_x$, and y-axis translation $\mathcal{T}_y$ are sampled from uniform distributions and applied to the predicted bounding box using Equation \ref{eq:genB}. The adaptive threshold $\zeta$ is computed from the set of IoUs obtained at each iteration step of the attack.
  • Figure 3: Several examples of TransT-M chen_high-performance_2023 performance under white-box attacks, including SPARK guo_spark_2020 (blue), RTAA jia_robust_2020 (black), TrackPGD nokabadi24TPGD (white), and our proposed ABBG attack (red), shown as bounding boxes. Green shows the tracker's original response with no attack applied.
  • Figure 4: Several examples of TransT-M chen_high-performance_2023 performance under white-box attacks, including SPARK guo_spark_2020 (blue), RTAA jia_robust_2020 (black), TrackPGD nokabadi24TPGD (white), and our proposed ABBG attack (red), shown as bounding boxes. Green shows the tracker's original response with no attack applied.
  • Figure 5: Several examples of TransT-M chen_high-performance_2023 performance under white-box attacks, including SPARK guo_spark_2020 (blue), RTAA jia_robust_2020 (black), TrackPGD nokabadi24TPGD (white), and our proposed ABBG attack (red), shown as binary masks. Green shows the ground-truth mask.
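
The box-generation step described in the Figure 2 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the equation referenced there is not reproduced on this page, so the box format `(x, y, w, h)`, the sampling ranges, the use of the median IoU as the adaptive threshold $\zeta$, and the rule of keeping boxes at or below that threshold are all assumptions made for illustration. The names `generate_boxes` and `select_adversarial` are hypothetical, and the adversarial loss itself is left out.

```python
import random
from statistics import median


def iou(a, b):
    """IoU of two boxes given as (x, y, w, h), with (x, y) the top-left corner."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def generate_boxes(pred, n=64, scale_range=(0.7, 1.3),
                   shift_range=(-0.3, 0.3), seed=0):
    """Sample n candidate boxes around the predicted box.

    Scale and x/y translation are drawn from uniform distributions, as in the
    caption; the ranges are assumed values, not taken from the paper.
    """
    rng = random.Random(seed)
    x, y, w, h = pred
    cx, cy = x + w / 2.0, y + h / 2.0
    boxes = []
    for _ in range(n):
        s = rng.uniform(*scale_range)        # scale factor
        tx = rng.uniform(*shift_range) * w   # x shift, relative to box width
        ty = rng.uniform(*shift_range) * h   # y shift, relative to box height
        nw, nh = s * w, s * h
        boxes.append((cx + tx - nw / 2.0, cy + ty - nh / 2.0, nw, nh))
    return boxes


def select_adversarial(pred, boxes):
    """Keep boxes whose IoU with the prediction is at or below a threshold.

    Assumption: the threshold is the median of the current IoUs, standing in
    for the adaptive threshold recomputed at each attack iteration.
    """
    ious = [iou(pred, b) for b in boxes]
    zeta = median(ious)
    kept = [b for b, v in zip(boxes, ious) if v <= zeta]
    return kept, zeta


if __name__ == "__main__":
    pred = (100.0, 100.0, 50.0, 40.0)
    candidates = generate_boxes(pred, n=32, seed=1)
    kept, zeta = select_adversarial(pred, candidates)
    print(f"threshold={zeta:.3f}, kept {len(kept)} of {len(candidates)} boxes")
```

In an iterative attack, the selected boxes would serve as targets for the adversarial loss at each step, with the threshold recomputed from the fresh set of IoUs.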