Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Nicholas Carlini, David Wagner

TL;DR

Targeted audio adversarial examples are shown to be feasible against state-of-the-art speech-to-text systems. The attack perturbs the input audio in a white-box, end-to-end optimization framework that operates on the raw waveform and differentiates through MFCC preprocessing and CTC-based decoding. It achieves exact target transcriptions with nearly imperceptible perturbations, at up to 50 characters per second of audio. The authors compare an initial and an improved loss formulation, discuss robustness challenges and limitations of the attack, and explore attacks that make non-speech audio transcribe as speech or make speech transcribe as silence. This work establishes audio as a new domain for adversarial research and motivates work on defenses and transferability.

Abstract

We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (recognizing up to 50 characters per second of audio). We apply our white-box iterative optimization-based attack to Mozilla's DeepSpeech implementation end-to-end, and show it has a 100% success rate. The feasibility of this attack introduces a new domain to study adversarial examples.
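The general shape of a white-box, iterative, targeted attack can be sketched on a toy problem. The example below is only an illustration under loud assumptions: a small random linear softmax classifier stands in for DeepSpeech, cross-entropy stands in for the CTC loss, and the signed-gradient update with clipping is one simple choice of optimizer rather than the one the paper uses. The names `targeted_attack`, `eps`, `lr`, and `steps` are hypothetical.

```python
import math
import random


def logits(W, x):
    """Scores of a toy linear classifier: one dot product per class."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def targeted_attack(x, W, target, eps=0.05, lr=0.005, steps=200):
    """Search for delta with max|delta| <= eps so that x + delta is classified as target."""
    delta = [0.0] * len(x)
    for _ in range(steps):
        z = logits(W, [a + d for a, d in zip(x, delta)])
        if z.index(max(z)) == target:
            break  # stop as soon as the toy model outputs the target class
        p = softmax(z)
        p[target] -= 1.0  # gradient of cross-entropy w.r.t. the logits: softmax - onehot
        for i in range(len(x)):
            # White-box step: chain rule through the known weights gives the input gradient.
            g = sum(p[k] * W[k][i] for k in range(len(W)))
            step = lr if g > 0 else -lr
            # Signed gradient descent, then clip to keep the perturbation small.
            delta[i] = max(-eps, min(eps, delta[i] - step))
    return [a + d for a, d in zip(x, delta)]


# Toy setup (all values are assumptions): a quiet 440 Hz tone sampled at 16 kHz.
rng = random.Random(0)
n_samples, n_classes = 1000, 5
W = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(n_classes)]
x = [0.1 * math.sin(2 * math.pi * 440 * t / 16000) for t in range(n_samples)]

z0 = logits(W, x)
original = z0.index(max(z0))
target = (original + 1) % n_classes  # any class other than the current prediction

x_adv = targeted_attack(x, W, target)
z_adv = logits(W, x_adv)
predicted = z_adv.index(max(z_adv))
max_change = max(abs(a - b) for a, b in zip(x_adv, x))
```

Here `predicted` is the toy model's output on the perturbed input and `max_change` is the largest per-sample change, which the clipping step keeps at or below `eps`. The real attack differs in every component (model, loss, optimizer, distortion measure); only the loop structure of "compute loss toward the target, follow the gradient with respect to the input, bound the perturbation" carries over.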

Paper Structure

This paper contains 30 sections, 14 equations, 3 figures.

Figures (3)

  • Figure 1: Illustration of our attack: given any waveform, adding a small perturbation makes the result transcribe as any desired target phrase.
  • Figure 2: Original waveform (blue, thick line) with adversarial waveform (orange, thin line) overlaid; it is nearly impossible to notice a difference. The audio waveform was chosen randomly from the attacks generated and is 500 samples long.
  • Figure 3: CTC loss when interpolating between the original audio sample and the adversarial example (blue, solid line), compared to traveling equally far in the direction suggested by the fast gradient sign method (orange, dashed line). Adversarial examples exist far enough away from the original audio sample that solely relying on the local linearity of neural networks is insufficient to construct targeted adversarial examples.