6DAttack: Backdoor Attacks in the 6DoF Pose Estimation
Jihui Guo, Zongmin Zhang, Zhen Sun, Yuhao Yang, Jinlin Wu, Fu Zhang, Xinlei He
TL;DR
The paper introduces 6DAttack, a backdoor framework for 6DoF pose estimation that uses 3D object triggers to steer predictions toward attacker-defined poses while preserving performance on clean scenes. It designs both synthetic and real 3D triggers and demonstrates the attack on PnP-based and end-to-end pipelines across LINEMOD, YCB-Video, and CO3D, achieving up to 100% ASR and up to 97.7% ADD-P with minimal impact on clean accuracy. The evaluation also covers a simple fine-tuning defense, which fails to eliminate the backdoor. Together, these results expose a critical security vulnerability in current 6DoF pose estimation approaches and underscore the need for robust defenses and safer design practices for 6DoF systems in robotics, AR/VR, and autonomous platforms.
Abstract
Advances in deep learning have enabled accurate six-degree-of-freedom (6DoF) object pose estimation, which is widely used in robotics, AR/VR, and autonomous systems. However, backdoor attacks pose significant security risks to these models. Most backdoor research focuses on 2D vision tasks, and backdoor attacks on 6DoF pose estimation remain largely unexplored. Unlike traditional backdoors, which only change the predicted class, a 6DoF attack must control continuous parameters such as translation and rotation, which makes 2D methods inapplicable. We propose 6DAttack, a framework that uses 3D object triggers to induce controlled erroneous poses while maintaining normal behavior on clean inputs. Evaluations on PVNet, DenseFusion, and PoseDiffusion across LINEMOD, YCB-Video, and CO3D show high attack success rates (ASRs) without compromising clean performance. Backdoored models achieve up to 100% clean ADD accuracy and 100% ASR, with triggered samples reaching 97.70% ADD-P. Furthermore, a representative defense proves ineffective against the attack. Our findings reveal a serious, underexplored threat to 6DoF pose estimation.
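To make the reported metrics concrete, the following is a minimal sketch of an ADD-style check, not the authors' evaluation code. ADD is commonly defined as the mean distance between an object's model points under two poses, and a prediction is often counted as correct when that distance is below 10% of the object diameter. The abstract does not define ADD-P, so the sketch assumes it is the same check measured against the attacker's target pose instead of the ground-truth pose. The cube model, all pose values, and the function names are hypothetical and chosen only for illustration.

```python
import math

def rot_z(deg):
    """3x3 rotation matrix for a rotation of `deg` degrees about the z-axis."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform(R, t, p):
    """Apply the rigid transform x -> R x + t to a 3D point p."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def add_distance(points, R_a, t_a, R_b, t_b):
    """Mean Euclidean distance between the model points under two poses."""
    total = 0.0
    for p in points:
        total += math.dist(transform(R_a, t_a, p), transform(R_b, t_b, p))
    return total / len(points)

def diameter(points):
    """Largest pairwise distance between model points."""
    return max(math.dist(p, q) for p in points for q in points)

def within_add(points, R_a, t_a, R_b, t_b, frac=0.1):
    """True if the ADD distance is below `frac` of the object diameter.

    frac=0.1 follows the common 10%-of-diameter convention (an assumption here).
    """
    return add_distance(points, R_a, t_a, R_b, t_b) < frac * diameter(points)

# Toy object model: the eight corners of a unit cube (hypothetical).
cube = [[x, y, z] for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]

# Each pose is a (rotation, translation) pair; all values are illustrative.
gt_pose = (rot_z(0.0), [0.0, 0.0, 1.0])          # ground-truth pose
target_pose = (rot_z(90.0), [0.5, 0.0, 1.0])     # assumed attacker-defined pose
clean_pred = (rot_z(1.0), [0.01, 0.0, 1.0])      # prediction on a clean input
triggered_pred = (rot_z(89.0), [0.5, 0.01, 1.0]) # prediction on a triggered input

clean_ok = within_add(cube, *clean_pred, *gt_pose)            # clean ADD check
attack_hit = within_add(cube, *triggered_pred, *target_pose)  # assumed ADD-P check
attack_vs_gt = within_add(cube, *triggered_pred, *gt_pose)    # ADD against ground truth

print(clean_ok, attack_hit, attack_vs_gt)
```

In this toy setting, the clean prediction passes the check against the ground truth, while the triggered prediction passes only against the assumed target pose. That is the behavior a successful backdoor aims for: normal accuracy on clean inputs and attacker-controlled poses on triggered ones. Averaging such per-sample outcomes over a test set would give accuracy-style percentages, though the paper's exact ASR criterion may differ.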
