Privacy in Speech Technology

Tom Bäckström

TL;DR

This tutorial addresses the inherent privacy risks of speech signals, detailing a threat model, attack surfaces, and attacker architectures. It surveys a spectrum of protections, from information isolation and secure processing to privacy-preserving architectures and acoustic interventions, along with objective and subjective methods for evaluating privacy and utility. The work also covers perception and psychology, user-interface design, and legal and societal perspectives, arguing for privacy-by-design and provable protections to build trustworthy speech systems. It concludes with urgent research directions, including consent mechanisms for streaming, streaming-appropriate privacy metrics, multi-user privacy, and robust disentanglement strategies that minimize leakage while preserving utility.

Abstract

Speech technology for communication, accessing information, and services has rapidly improved in quality. It is convenient and appealing because speech is the primary mode of communication for humans. Such technology, however, also presents proven threats to privacy. Speech is a tool for communication and thus inherently contains private information. Importantly, it also contains a wealth of side information, such as information related to health, emotions, affiliations, and relationships, all of which are private. Exposing such private information can lead to serious threats such as price gouging, harassment, extortion, and stalking. This paper is a tutorial on privacy issues related to speech technology: modeling their threats, approaches for protecting users' privacy, measuring the performance of privacy-protecting methods, perception of privacy, and societal and legal consequences. In addition to a tutorial overview, it also presents lines for further development where improvements are most urgently needed.

Paper Structure (49 sections, 3 equations, 20 figures, 4 tables)


Figures (20)

  • Figure 1:
  • Figure 2: The high-level threat model of speech interaction, where target information is sent through a channel to the legitimate recipient, but consequential side information is bundled with that message. Privacy is threatened when an undesired recipient gains access to the target information or the side information (marked by red arrows and exclamation marks).
  • Figure 3: An attack model for the evaluation of privacy-preserving anonymization, where private data is anonymized to remove private information, and the anonymized data is shared publicly. An attacker uses any available (found) data and anonymized public data to infer private information contrary to the users' preferences. Anonymized data flow is indicated by dashed lines, and the attack is represented by red lines with an exclamation mark.
  • Figure 4: Threat scenario "Cloud leak", where a user Alice accesses a (primary) remote service using a local device, but the information is shared with a secondary service contrary to her preferences (red arrow and exclamation mark).
  • Figure 5: Threat scenario "False activation", where a user Alice accesses a local device, but the information is shared with a cloud service contrary to her preferences (red arrow and exclamation mark).
  • ...and 15 more figures