Towards Privacy-Preserving Audio Classification Systems
Bhawana Chhaglani, Jeremy Gummeson, Prashant Shenoy
TL;DR
This paper addresses the privacy risks inherent in audio classification, arguing that privacy extends beyond speech content to include speaker identity and contextual cues. It proposes a set of generalizable privacy-preserving audio features that avoid sensitive information (e.g., F0 and formants) and demonstrates their viability on the ESC-50 environmental sound dataset, where a random forest classifier achieves 92.23% accuracy. The work highlights the privacy-accuracy trade-off and the need for robust privacy evaluation, showing that privacy-preserving features can come close to the performance of non-private baselines. The findings suggest a path toward audio sensing that retains practical utility while mitigating privacy leaks, with broad implications for deployment in ambient intelligence and IoT contexts.
Abstract
Audio signals can reveal intimate details about a person's life, including their conversations, health status, emotions, location, and personal preferences. Unauthorized access to or misuse of this information can have profound personal and social implications. In an era increasingly populated by devices capable of audio recording, safeguarding user privacy is a critical obligation. This work studies the ethical and privacy concerns in current audio classification systems. We discuss the challenges and research directions in designing privacy-preserving audio sensing systems. We propose privacy-preserving audio features that can be used to classify a wide range of audio classes.
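As a rough illustration of the general idea of classifying from coarse summary features rather than raw audio, the sketch below reduces a clip to four numbers: the mean and standard deviation of per-frame RMS energy and zero-crossing rate. These two features are illustrative stand-ins chosen for simplicity; they are not the feature set proposed in this work, and no claim is made here about their actual privacy properties. The function names, the frame and hop sizes, and the synthetic test tone are all assumptions made for the example.

```python
import math

def frame_features(samples, frame_len=400, hop=200):
    """Return per-frame (rms, zcr) pairs for a mono signal given as floats."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Root-mean-square energy of the frame.
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        zcr = crossings / (frame_len - 1)
        feats.append((rms, zcr))
    return feats

def clip_summary(samples, frame_len=400, hop=200):
    """Summarize a clip as [mean_rms, std_rms, mean_zcr, std_zcr]."""
    feats = frame_features(samples, frame_len, hop)
    if not feats:
        raise ValueError("clip is shorter than one frame")
    summary = []
    for column in zip(*feats):  # first the rms column, then the zcr column
        mean = sum(column) / len(column)
        var = sum((v - mean) ** 2 for v in column) / len(column)
        summary.extend([mean, math.sqrt(var)])
    return summary

# Hypothetical input: one second of a synthetic 440 Hz tone sampled at 16 kHz.
sr = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
vector = clip_summary(tone)
```

Fixed-length vectors of this kind could then be passed to a classifier such as a random forest, the model used in the reported experiments; that step is omitted here to keep the sketch free of external dependencies.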
