Houdini: Fooling Deep Structured Prediction Models
Moustapha Cisse, Yossi Adi, Natalia Neverova, Joseph Keshet
TL;DR
This work introduces Houdini, a general-purpose surrogate for generating adversarial examples that directly target non-differentiable task losses in structured prediction. By formulating a differentiable surrogate tightly linked to the task loss and deriving its gradient analytically, Houdini supports efficient gradient-based input perturbations, including FGSM-style attacks, across diverse domains. Empirically, Houdini yields higher success rates and often less perceptible perturbations than traditional surrogates in speech recognition, pose estimation, and semantic segmentation, and its adversarial examples also transfer successfully in black-box settings. The approach broadens adversarial evaluation beyond classification and provides a practical framework for assessing the robustness of complex prediction systems.
Abstract
Generating adversarial examples is a critical step for evaluating and improving the robustness of learning machines. So far, most existing methods work only for classification and are not designed to alter the true performance measure of the problem at hand. We introduce a novel, flexible approach named Houdini for generating adversarial examples specifically tailored to the final performance measure of the task considered, even when that measure is combinatorial and non-decomposable. We successfully apply Houdini to a range of applications such as speech recognition, pose estimation and semantic segmentation. In all cases, the attacks based on Houdini achieve a higher success rate than those based on the traditional surrogates used to train the models, while using a less perceptible adversarial perturbation.
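
To make the idea of a differentiable surrogate with an analytic gradient concrete, here is a minimal Python sketch of one targeted FGSM-style step on a toy problem. It assumes the surrogate takes the form defined in the body of the paper: the probability that a standard normal variable exceeds the score margin between the target and the current prediction, multiplied by the task loss. The linear scorer, the 0/1 task loss, the step size, and all function names below are hypothetical stand-ins chosen for illustration, not the authors' implementation.

```python
import math

def scores(W, x):
    """Toy linear scorer g(x, y) = W[y] . x (hypothetical stand-in for a network)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def predict(W, x):
    """Label with the highest score (first index wins ties)."""
    s = scores(W, x)
    return max(range(len(s)), key=s.__getitem__)

def houdini(W, x, y, task_loss):
    """Surrogate: P[g(x, y) - g(x, y_hat) < gamma] * task_loss(y_hat, y), gamma ~ N(0, 1)."""
    s = scores(W, x)
    y_hat = predict(W, x)
    margin = s[y] - s[y_hat]
    prob = 0.5 * math.erfc(margin / math.sqrt(2.0))  # equals 1 - Phi(margin)
    return prob * task_loss(y_hat, y)

def houdini_grad_x(W, x, y, task_loss):
    """Analytic gradient w.r.t. x, treating the prediction y_hat as locally constant."""
    s = scores(W, x)
    y_hat = predict(W, x)
    margin = s[y] - s[y_hat]
    pdf = math.exp(-0.5 * margin ** 2) / math.sqrt(2.0 * math.pi)
    coeff = -pdf * task_loss(y_hat, y)  # derivative of (1 - Phi(margin)) * loss w.r.t. margin
    # For the linear scorer, d(margin)/dx = W[y] - W[y_hat].
    return [coeff * (wy - wh) for wy, wh in zip(W[y], W[y_hat])]

def fgsm_targeted_step(W, x, target, task_loss, eps):
    """One signed-gradient descent step on the surrogate, pushing the prediction toward target."""
    grad = houdini_grad_x(W, x, target, task_loss)
    sign = lambda v: (v > 0) - (v < 0)
    return [xi - eps * sign(gi) for xi, gi in zip(x, grad)]

# Illustrative setup: 3 labels, 2 input features, 0/1 task loss (all assumed values).
zero_one = lambda y_hat, y: 0.0 if y_hat == y else 1.0
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
x = [1.0, 0.8]
target = 1

x_adv = fgsm_targeted_step(W, x, target, zero_one, eps=0.2)
print("prediction before:", predict(W, x), "after:", predict(W, x_adv))
print("surrogate before:", houdini(W, x, target, zero_one),
      "after:", houdini(W, x_adv, target, zero_one))
```

In this toy setup, the step is a descent step because the attack is targeted: lowering the surrogate raises the target's score relative to the current prediction. An untargeted variant would instead ascend the surrogate computed against the true label. A real structured task would replace the 0/1 loss with the task's own measure (for example, word error rate) and the linear scorer with the network's scoring function.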
