Transferable Adversarial Attacks against ASR

Xiaoxue Gao; Zexin Li; Yiming Chen; Cong Liu; Haizhou Li

TL;DR

The paper proposes a speech-aware gradient optimization approach (SAGO) for ASR that forces mistranscription while keeping the perturbation largely imperceptible to human listeners, using a voice activity detection rule and a speech-aware gradient-oriented optimizer.

Abstract

Given the extensive research on and real-world applications of automatic speech recognition (ASR), ensuring the robustness of ASR models against minor input perturbations is crucial for maintaining their effectiveness in real-time scenarios. Previous explorations of ASR model robustness have predominantly evaluated accuracy in white-box settings, with full access to the ASR models. However, full model details are often unavailable in real-world applications, so evaluating the robustness of black-box ASR models is essential for a comprehensive understanding of ASR model resilience. In this regard, we thoroughly study the vulnerability of cutting-edge ASR models to practical black-box attacks and propose to employ two advanced time-domain-based transferable attacks alongside our differentiable feature extractor. We also propose a speech-aware gradient optimization approach (SAGO) for ASR, which forces mistranscription while keeping the perturbation largely imperceptible to human listeners, through a voice activity detection rule and a speech-aware gradient-oriented optimizer. Our comprehensive experimental results reveal performance enhancements compared to baseline approaches across five models on two databases.
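The abstract does not spell out how the voice activity detection rule and the gradient-oriented optimizer interact, so the following is only a minimal sketch of the general idea of confining a time-domain perturbation to speech-active regions. Everything in it is an assumption made for illustration: the energy-threshold VAD, the FGSM-style sign step, and the names `energy_vad_mask`, `masked_sign_step`, `frame_len`, `threshold_ratio`, and `epsilon` are hypothetical and are not taken from the paper. The gradient is a random placeholder standing in for the gradient of a surrogate ASR model's loss with respect to the waveform.

```python
import numpy as np


def energy_vad_mask(wave, frame_len=400, threshold_ratio=0.1):
    """Return a per-sample 0/1 mask marking frames that look like speech.

    Hypothetical rule (not the paper's): a frame counts as speech when its
    mean energy exceeds threshold_ratio times the largest frame energy.
    Assumes a non-empty waveform.
    """
    wave = np.asarray(wave, dtype=np.float64)
    n_frames = int(np.ceil(len(wave) / frame_len))
    padded = np.zeros(n_frames * frame_len)
    padded[: len(wave)] = wave
    frames = padded.reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)
    active = energy > threshold_ratio * energy.max()
    # Expand the per-frame decision back to one value per sample.
    return np.repeat(active, frame_len)[: len(wave)].astype(np.float64)


def masked_sign_step(wave, grad, mask, epsilon=0.002):
    """Apply one sign-gradient step only where mask == 1, clipped to [-1, 1]."""
    wave = np.asarray(wave, dtype=np.float64)
    delta = epsilon * np.sign(grad) * mask
    return np.clip(wave + delta, -1.0, 1.0)


# Toy input: 800 samples of silence followed by 800 samples of a 220 Hz tone.
rng = np.random.default_rng(0)
silence = np.zeros(800)
tone = 0.5 * np.sin(2 * np.pi * 220 * np.arange(800) / 16000)
wave = np.concatenate([silence, tone])

# Placeholder gradient; in a real attack this would come from a surrogate model.
grad = rng.standard_normal(wave.shape)

mask = energy_vad_mask(wave)
adv = masked_sign_step(wave, grad, mask)
```

In this toy setup the silent half of the waveform is left untouched and the perturbation in the tone region is bounded by `epsilon`. A transferable attack would iterate such a step against one or more surrogate models and then evaluate the result on a separate black-box model.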
