Speech privacy-preserving methods using secret key for convolutional neural network models and their robustness evaluation

Shoko Niwa; Sayaka Shiota; Hitoshi Kiya

TL;DR

This work tackles privacy in cloud-based CNN speech processing by introducing secret-key encryption for both speech data and the first-layer kernel of the model. Three encryption primitives—Shuffling ($K_s$), Flipping ($K_f$), and ROM ($K_r$)—enable encrypted queries to be processed without decryption when the keys match, with performance degrading under non-matching keys. ROM, in particular, provides a large key space and robust privacy even for small block sizes, supported by extensive experiments across ASV, ASR, and ASC and robustness tests including phase reconstruction and key-space attacks. The approach offers a practical path to privacy-preserving speech services in untrusted cloud environments and sets the stage for extending secret-key encrypted inference to other deep-learning models.
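As a rough illustration of why key-space size depends on block size, the sketch below counts the discrete keys available to Shuffling and Flipping. It assumes, hypothetically, that each key acts on one block of $M$ samples, with $K_s$ a permutation of the $M$ positions and $K_f$ a pattern of $M$ sign choices; the function name `key_space_sizes` is illustrative and not taken from the paper. ROM keys are real-valued orthogonal matrices, so this simple count does not apply to them; the paper's own analysis is in its "Key space of secret key" section.

```python
import math

def key_space_sizes(block_size):
    """Count discrete keys for one block of `block_size` samples (assumed setup)."""
    return {
        "shuffling": math.factorial(block_size),  # permutations of M positions
        "flipping": 2 ** block_size,              # independent sign choices
    }

for m in (4, 8, 16):
    print(m, key_space_sizes(m))
```

Under these assumptions, both counts shrink quickly as the block gets smaller, which is the setting where a key type with a richer space matters most.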

Abstract

In this paper, we propose privacy-preserving methods with a secret key for convolutional neural network (CNN)-based models in speech processing tasks. In environments where untrusted third parties, such as cloud servers, provide CNN-based systems, ensuring the privacy of speech queries becomes essential. We propose encryption methods for speech queries using secret keys, together with a model structure that accepts encrypted queries without decryption. Our approach introduces three types of secret keys: Shuffling, Flipping, and random orthogonal matrix (ROM). In experiments, we demonstrate that when the proposed methods are used with the correct key, identification performance does not degrade. Conversely, when an incorrect key is used, performance decreases significantly. In particular, with ROM, we show that even with a relatively small key space, high privacy-preserving performance can be maintained across many speech processing tasks. Furthermore, we demonstrate the difficulty of recovering the original speech from encrypted queries in various robustness evaluations.
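The idea of accepting encrypted queries without decryption can be sketched with a toy example. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the waveform is split into non-overlapping blocks of length `M`, that the first convolutional layer has kernel size and stride equal to `M`, and that each key is an `M x M` orthogonal matrix (a permutation matrix for Shuffling, a diagonal sign matrix for Flipping, a QR-derived matrix for ROM). The names `make_keys`, `apply_key`, and `first_layer` are hypothetical. Because an orthogonal matrix preserves inner products, transforming both the query and the kernel with the same key leaves the first-layer output unchanged.

```python
import numpy as np

M = 8  # block size; assumed equal to the first-layer kernel size and stride

def make_keys(M, seed):
    """Return (K_s, K_f, K_r) as M x M orthogonal matrices derived from a seed."""
    rng = np.random.default_rng(seed)
    K_s = np.eye(M)[rng.permutation(M)]                 # Shuffling: permutation matrix
    K_f = np.diag(rng.choice([-1.0, 1.0], size=M))      # Flipping: random sign flips
    K_r, _ = np.linalg.qr(rng.standard_normal((M, M)))  # ROM: random orthogonal matrix
    return K_s, K_f, K_r

def apply_key(v, K):
    """Multiply every length-M block of v by K."""
    return (v.reshape(-1, K.shape[0]) @ K.T).reshape(-1)

def first_layer(x, w):
    """Non-overlapping 1-D convolution: one inner product per block."""
    return x.reshape(-1, w.size) @ w

rng = np.random.default_rng(0)
x = rng.standard_normal(4 * M)   # toy waveform of four blocks
w = rng.standard_normal(M)       # toy first-layer kernel
plain = first_layer(x, w)

results = {}
names = ("shuffling", "flipping", "rom")
for name, K, K_wrong in zip(names, make_keys(M, 1), make_keys(M, 2)):
    matched = first_layer(apply_key(x, K), apply_key(w, K))
    mismatched = first_layer(apply_key(x, K), apply_key(w, K_wrong))
    # (output preserved with correct key?, output preserved with wrong key?)
    results[name] = (np.allclose(matched, plain), np.allclose(mismatched, plain))
    print(name, results[name])
```

With a matching key the first-layer output equals the plain output for all three key types, so the rest of the network sees the same features. With a mismatched key the output generally differs, which is consistent with the reported performance drop under incorrect keys.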

Paper Structure (22 sections, 10 equations, 19 figures, 4 tables, 6 algorithms)

Introduction
Privacy-preserving scenario
Proposed methods
Query encryption
Shuffling
Flipping
Random orthogonal matrix
Model encryption
Shuffling
Flipping
Random orthogonal matrix
Key space of secret key
Key generation procedure
Experiment
Evaluation of privacy-preserving performance
...and 7 more sections

Figures (19)

Figure 1: Privacy-preserving scenario
Figure 2: Encrypted models for accepting encrypted speech data
Figure 3: Examples of waveform encrypted by Shuffling
Figure 10: Examples of waveform encrypted by Flipping and ROM
Figure 19: Examples of spectrogram encrypted by Shuffling
...and 14 more figures
