Siren Song: Manipulating Pose Estimation in XR Headsets Using Acoustic Attacks

Zijian Huang, Yicheng Zhang, Sophie Chen, Nael Abu-Ghazaleh, Jiasi Chen

TL;DR

The paper investigates acoustic injection attacks on XR headsets by perturbing IMU readings to corrupt pose estimation and downstream rendering. Using a combination of open-source pose estimation frameworks (ORB-SLAM3 and ILLIXR) and a real HoloLens 2 testbed, it demonstrates both constant and data-driven IMU perturbations that cause misleading pose, snapback, and drift-away effects. Four end-to-end proof-of-concept attacks on HoloLens 2 illustrate practical harms such as manipulating user input, clickjacking, denial of user interaction, and secure-zone invasion, with robust validation of the snapback phenomenon. The study highlights a real security risk for XR devices and motivates hardware and software mitigations to harden pose-estimation pipelines against acoustic threats in consumer XR ecosystems.

Abstract

Extended Reality (XR) experiences involve interactions between users, the real world, and virtual content. A key step to enable these experiences is the XR headset sensing and estimating the user's pose in order to accurately place and render virtual content in the real world. XR headsets use multiple sensors (e.g., cameras, inertial measurement unit) to perform pose estimation and improve its robustness, but this provides an attack surface for adversaries to interfere with the pose estimation process. In this paper, we create and study the effects of acoustic attacks that create false signals in the inertial measurement unit (IMU) on XR headsets, leading to adverse downstream effects on XR applications. We generate resonant acoustic signals on a HoloLens 2 and measure the resulting perturbations in the IMU readings, and also demonstrate both fine-grained and coarse attacks on the popular ORB-SLAM3 and an open-source XR system (ILLIXR). With the knowledge gleaned from attacking these open-source frameworks, we demonstrate four end-to-end proof-of-concept attacks on a HoloLens 2: manipulating user input, clickjacking, zone invasion, and denial of user interaction. Our experiments show that current commercial XR headsets are susceptible to acoustic attacks, raising concerns for their security.
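To give a rough sense of why a false IMU signal matters for pose estimation, the sketch below double-integrates a constant accelerometer offset on a stationary device and reports the resulting position error. It is a toy dead-reckoning model under stated assumptions, not the paper's method: ORB-SLAM3, ILLIXR, and the HoloLens 2 fuse camera data with the IMU, so their actual error behaves differently. The function names, the 200 Hz sample rate, and the bias values are all hypothetical.

```python
import numpy as np

def integrate_position(accel, dt):
    """Dead-reckon 1-D position from acceleration samples by double integration."""
    velocity = np.cumsum(accel) * dt      # first integration: acceleration -> velocity
    position = np.cumsum(velocity) * dt   # second integration: velocity -> position
    return position

def drift_from_constant_bias(bias, duration_s=2.0, rate_hz=200.0):
    """Final position error (m) caused by a constant acceleration offset (m/s^2).

    Assumes a stationary device (true acceleration is zero) and a hypothetical
    IMU sample rate; no camera correction is modeled.
    """
    n = int(round(duration_s * rate_hz))
    dt = 1.0 / rate_hz
    clean = np.zeros(n)        # ground-truth acceleration of a device at rest
    attacked = clean + bias    # hypothetical constant injected offset
    return integrate_position(attacked, dt)[-1] - integrate_position(clean, dt)[-1]

if __name__ == "__main__":
    for bias in (0.1, 0.5, 1.0):
        print(f"bias={bias} m/s^2 -> drift={drift_from_constant_bias(bias):.3f} m")
```

In this simplified model the error grows roughly as 0.5 · bias · t², so it scales linearly with the perturbation magnitude and quadratically with the attack duration.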

Paper Structure

This paper contains 26 sections, 2 equations, 18 figures, 1 table.

Figures (18)

  • Figure 1: Scenario overview. An XR headset is subjected to acoustic signals, which affect the pose estimation and the final visual output.
  • Figure 2: The coordinate systems of ORB-SLAM3 (left) and IMU on the RealSense D435i camera (right) differ.
  • Figure 3: Device pose estimated by ORB-SLAM3 under constant perturbation on the IMU readings. Legend denotes perturbation magnitude. Increased magnitude of perturbations leads to increased pose error (Misleading attack), and beyond a threshold, devices default to the origin (Snapback attack).
  • Figure 4: Detailed visualization of snapback attack in ORB-SLAM3. The acoustic attack ends at time=2 and snapback occurs. The scatter points represent visual features found in the real-world environment.
  • Figure 5: Experimental setup with XR headset, speaker and sound source, and remote-control car for mobility.
  • ...and 13 more figures