Phoneme-Based Proactive Anti-Eavesdropping with Controlled Recording Privilege

Peng Huang; Yao Wei; Peng Cheng; Zhongjie Ba; Li Lu; Feng Lin; Yang Wang; Kui Ren

TL;DR

This paper tackles the privacy risk of voice eavesdropping by smart devices and proposes InfoMasker, a phoneme-based informational masking system that jams microphones while still allowing authorized users to recover the recorded content. The core idea is to construct a noise signal from phoneme sequences that mimics the target speech in phonetic structure and timing, creating strong informational masking that resists denoising and hinders human comprehension. The system comprises a registration-driven noise generator, real-time ultrasonic jamming using a transmitter array with pre-compensation, and a transformer-based denoising module for authorized recovery. Experiments across multiple languages, devices, and real-world office scenarios show that the jamming reduces the recognition accuracy of recordings to below 50% on all tested speech recognition systems while keeping the speech recoverable for authorized users, which suggests the approach is practical for privacy protection in controlled environments.

Abstract

The widespread use of smart devices raises concerns about being eavesdropped on. To enhance voice privacy, recent studies exploit the nonlinearity of microphones to jam audio recorders with inaudible ultrasound. However, existing solutions rely solely on energetic masking. Their simple-form noise leads to several problems: it requires high energy and is easily removed by speech enhancement techniques. Moreover, most of these solutions do not support authorized recording, which restricts their usage scenarios. In this paper, we design an efficient yet robust system that can jam microphones while preserving authorized recording. Specifically, we propose a novel phoneme-based noise built on the idea of informational masking, which can distract both machines and humans and is resistant to denoising techniques. In addition, we optimize the noise transmission strategy for broader coverage and implement a hardware prototype of our system. Experimental results show that our system can reduce the recognition accuracy of recordings to below 50% under all tested speech recognition systems, which is much better than existing solutions.
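
The microphone nonlinearity that the abstract relies on can be illustrated with a small numerical sketch. It uses the two input tones named in the Figure 2 caption (800 Hz and 1000 Hz) and passes them through a toy second-order model, `s_out = a1 * s_in + a2 * s_in**2`. The model form, the coefficients `a1` and `a2`, the sample rate, and the function names are all assumptions chosen for illustration; they are not measurements or code from the paper.

```python
import numpy as np

def two_tone_spectrum(f1=800.0, f2=1000.0, a1=1.0, a2=0.1, fs=16000, duration=1.0):
    """Pass two tones through a toy quadratic nonlinearity.

    Returns (freqs, mag): frequency bins in Hz and normalized magnitudes.
    a1, a2, fs, and duration are illustrative assumptions.
    """
    n = int(fs * duration)
    t = np.arange(n) / fs
    s_in = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)
    # Assumed second-order model of the microphone response.
    s_out = a1 * s_in + a2 * s_in ** 2
    mag = np.abs(np.fft.rfft(s_out)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, mag

def peak_frequencies(freqs, mag, threshold=0.01):
    """List the non-DC frequencies (rounded to the nearest Hz) above threshold."""
    return [int(round(float(f))) for f, m in zip(freqs, mag) if f > 0 and m > threshold]

if __name__ == "__main__":
    freqs, mag = two_tone_spectrum()
    print(peak_frequencies(freqs, mag))
```

Besides the two input tones, the squared term produces components at the difference frequency (f2 - f1), the sum frequency (f1 + f2), and the second harmonics (2·f1 and 2·f2). The same mechanism is what lets two inaudible ultrasonic tones leave an audible-band difference component in a recording.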

Paper Structure (33 sections, 4 equations, 28 figures, 6 tables)

Introduction
Preliminary
Nonlinearity in Microphone
Informational Masking
Human Auditory System and ASR
Problem Formulation
System Model
Threat Model
Design Goals
Phoneme-Based Informational Masking
Key Insight
Noise Design
System Design
System Workflow
User Registration
...and 18 more sections

Figures (28)

Figure 1: Anti-Eavesdropping with user-controlled recording.
Figure 2: Nonlinearity in microphone. The left figure shows the audio spectrum recorded by a Huawei P10 smartphone with an input of two single tones (800 Hz and 1000 Hz).
Figure 3: System Model.
Figure 4: Generation of phoneme-based noise.
Figure 5: Comparison of robustness of different noises against the built-in noise reduction in a Vivo Nex smartphone.
...and 23 more figures
