Invisible Ears at Your Fingertips: Acoustic Eavesdropping via Mouse Sensors

Mohamad Fakih, Rahul Dharmaji, Youssef Mahmoud, Halima Bouzidi, Mohammad Abdullah Al Faruque

TL;DR

The paper identifies a practical side-channel risk in high-performance optical mice: speech-induced desk vibrations picked up by the sensor can be reconstructed into intelligible audio. It presents Mic-E-Mouse, a multi-stage pipeline that combines non-uniform resampling, Wiener filtering, and encoder-only transformer neural filtering to reconstruct speech from raw mouse data collected in user space. Under controlled conditions, the pipeline improves SNR by up to +19 dB and reaches roughly 42%–61% speech recognition accuracy on AudioMNIST and VCTK, which makes the attack non-trivial but feasible. The work highlights the security implications for consumer peripherals and proposes mitigations, arguing that sensor-aware defenses become necessary as input devices grow more precise.

Abstract

Modern optical mouse sensors, with their advanced precision and high responsiveness, possess an often overlooked vulnerability: they can be exploited for side-channel attacks. This paper introduces Mic-E-Mouse, the first-ever side-channel attack that targets high-performance optical mouse sensors to covertly eavesdrop on users. We demonstrate that audio signals can induce subtle surface vibrations detectable by a mouse's optical sensor. Remarkably, user-space software on popular operating systems can collect and broadcast this sensitive side channel, granting attackers access to raw mouse data without requiring direct system-level permissions. Initially, the vibration signals extracted from mouse data are of poor quality due to non-uniform sampling, a non-linear frequency response, and significant quantization. To overcome these limitations, Mic-E-Mouse employs a sophisticated end-to-end data filtering pipeline that combines Wiener filtering, resampling corrections, and an innovative encoder-only spectrogram neural filtering technique. We evaluate the attack's efficacy across diverse conditions, including speaking volume, mouse polling rate and DPI, surface materials, speaker languages, and environmental noise. In controlled environments, Mic-E-Mouse improves the signal-to-noise ratio (SNR) by up to +19 dB for speech reconstruction. Furthermore, our results demonstrate a speech recognition accuracy of roughly 42% to 61% on the AudioMNIST and VCTK datasets. All our code and datasets are publicly accessible on https://sites.google.com/view/mic-e-mouse.
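
The abstract names non-uniform sampling as one reason the raw signal is poor. The sketch below illustrates what a resampling correction can look like: it places irregularly timed samples on a uniform time grid using linear interpolation. This is an illustrative stand-in, not the authors' method. The function name `resample_uniform`, its parameters, and the choice of linear interpolation are all assumptions, and timestamps are assumed to be strictly increasing.

```python
from bisect import bisect_right


def resample_uniform(timestamps, values, rate_hz):
    """Linearly interpolate irregularly timed samples onto a uniform grid.

    Hypothetical helper: `timestamps` (seconds, strictly increasing) and
    `values` (e.g. per-report mouse displacement) are assumed inputs.
    """
    if len(timestamps) != len(values) or len(timestamps) < 2:
        raise ValueError("need at least two (timestamp, value) pairs")

    t0, t1 = timestamps[0], timestamps[-1]
    n = int((t1 - t0) * rate_hz) + 1  # number of uniform grid points
    out = []
    for k in range(n):
        t = t0 + k / rate_hz
        # Index of the first timestamp strictly greater than t,
        # clamped so that [j - 1, j] is always a valid interval.
        j = min(max(bisect_right(timestamps, t), 1), len(timestamps) - 1)
        ta, tb = timestamps[j - 1], timestamps[j]
        w = (t - ta) / (tb - ta)  # position of t inside the interval
        out.append(values[j - 1] + w * (values[j] - values[j - 1]))
    return out


# Irregular report times, resampled at 2 Hz.
print(resample_uniform([0.0, 1.0, 3.0, 4.0], [0.0, 2.0, 6.0, 8.0], 2.0))
```

In a pipeline like the one described, a step of this kind would run before Wiener filtering and the neural filter, since both operate more naturally on uniformly sampled data.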

Paper Structure

This paper contains 51 sections, 7 equations, 15 figures, and 8 tables.

Figures (15)

  • Figure 1: Trends in the fidelity of optical mouse sensors over time. The red-shaded region indicates vulnerable sensors with high resolution, measured in DPI (dots per inch).
  • Figure 2: Overview of the Mic-E-Mouse pipeline: A victim's confidential speech is captured by benign or compromised software using surface vibrations detected by the computer mouse. The collected data is sent to the adversary's server for processing and filtering with machine learning methods to enhance the quality of the recovered audio signal.
  • Figure 3: Imaging of the CMOS sensor grid in the PMW3552 chip from a Logitech mouse.
  • Figure 4: Overview of the internal systems of a mouse.
  • Figure 5: An overview of a practical attack scenario following the proposed Mic-E-Mouse pipeline with different vulnerability exploits, including graphical applications, open-source games, and web browsers. Green and red arrows depict authorized and unauthorized access, respectively.
  • ...and 10 more figures