PapMOT: Exploring Adversarial Patch Attack against Multiple Object Tracking

Jiahuan Long; Tingsong Jiang; Wen Yao; Shuai Jia; Weijia Zhang; Weien Zhou; Chao Ma; Xiaoqian Chen

TL;DR

PapMOT addresses the vulnerability of multi-object tracking (MOT) systems to physical adversarial patches by generating printable patches that disrupt both detection and cross-frame identity association. The method combines a patch-training phase, which uses Expectation over Transformation (EOT), with a patch-attack phase. Training is guided by a unified objective that combines three losses: Bounding Box Restriction ($\mathcal{L}_{bbr}$), Total Variation ($\mathcal{L}_{tv}$), and Average Precision ($\mathcal{L}_{ap}$). A patch-enhancement strategy further increases temporal disruption. New integrated evaluation metrics (TASR, IOR, STASR) assess the joint impact on detection and tracking. Experiments on MOT15/17/20 and BDD100K show that the attack is effective in both digital and physical domains, with real-world validation under varied illumination, distance, and angle. These results expose MOT vulnerabilities and motivate the development of more robust detectors and data-association methods for safety-critical applications.
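
To make the unified objective more concrete, the following is a minimal sketch of how a total-variation term and a weighted sum of three loss values could be computed. It is an illustration under stated assumptions, not the authors' implementation: the anisotropic (absolute-difference) form of total variation is one common choice, and the function names `total_variation` and `combined_objective` and the weights `w_bbr`, `w_tv`, `w_ap` are hypothetical. The summary above does not specify how the three terms are weighted or how $\mathcal{L}_{bbr}$ and $\mathcal{L}_{ap}$ are computed, so they appear here only as plain numbers.

```python
import numpy as np


def total_variation(patch: np.ndarray) -> float:
    """Anisotropic total variation of an H x W x C patch.

    Sums absolute differences between vertically and horizontally
    adjacent pixels. Smooth patches score low, noisy patches score high.
    (Assumed formulation; the paper may use a different variant.)
    """
    dh = np.abs(patch[1:, :, :] - patch[:-1, :, :]).sum()
    dw = np.abs(patch[:, 1:, :] - patch[:, :-1, :]).sum()
    return float(dh + dw)


def combined_objective(l_bbr: float, l_tv: float, l_ap: float,
                       w_bbr: float = 1.0, w_tv: float = 1.0,
                       w_ap: float = 1.0) -> float:
    """Weighted sum of the three loss values (weights are hypothetical)."""
    return w_bbr * l_bbr + w_tv * l_tv + w_ap * l_ap


# A uniform patch has no pixel-to-pixel variation.
flat = np.full((4, 4, 3), 0.5)

# A checkerboard patch alternates 0 and 1 between every pair of neighbours.
checker = (np.indices((4, 4)).sum(axis=0) % 2).astype(float)
checker = np.repeat(checker[:, :, None], 3, axis=2)

tv_flat = total_variation(flat)
tv_checker = total_variation(checker)
```

A total-variation penalty of this kind is commonly used in printable-patch attacks because smooth patches tend to survive printing and camera capture better than high-frequency noise.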

Abstract

Tracking multiple objects in a continuous video stream is crucial for many computer vision tasks. It involves detecting objects and associating them with their respective identities across successive frames. Despite significant progress in multiple object tracking (MOT), recent studies have revealed that existing MOT methods are vulnerable to adversarial attacks. However, all of these are digital attacks that inject pixel-level noise into input images, and they are therefore ineffective in physical scenarios. To fill this gap, we propose PapMOT, which generates physical adversarial patches against MOT for both digital and physical scenarios. Besides attacking the detection mechanism, PapMOT optimizes a printable patch that is itself detected as new targets, misleading the identity association process. Moreover, we introduce a patch enhancement strategy that further degrades the temporal consistency of tracking results across video frames, resulting in more aggressive attacks. We also develop new evaluation metrics to assess the robustness of MOT against such attacks. Extensive evaluations on multiple datasets demonstrate that PapMOT can successfully attack MOT trackers of various architectures in digital scenarios. We also validate the effectiveness of PapMOT for physical attacks by deploying printed adversarial patches in the real world.
