SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling

Ankit Gupta; Christoph Adami; Emily Dolson

TL;DR

This work examines how fragile state-of-the-art image classifiers are to out-of-distribution fooling images by re-implementing the classic CPPN-Fool and Direct-Fool attacks on contemporary models and introducing SPOOF, a minimalist, fast black-box attack that uses sparse pixel updates. SPOOF consistently delivers high-confidence misclassifications across CNNs and transformers (notably ViT-B/16) with minimal pixel changes and much lower computational cost than prior methods. Retraining with a fooling class offers only partial resilience, as SPOOF can regain high-confidence fooling under extended query budgets. Collectively, the results reveal a persistent vulnerability of modern architectures to non-semantic inputs, highlighting the gap between recognition performance and out-of-distribution robustness and motivating new defense strategies beyond traditional fine-tuning.

Abstract

Deep neural networks (DNNs) excel across image recognition tasks, yet continue to exhibit overconfidence on inputs that bear no resemblance to natural images. Revisiting the "fooling images" work introduced by Nguyen et al. (2015), we re-implement both CPPN-based and direct-encoding-based evolutionary fooling attacks on modern architectures, including convolutional and transformer classifiers. Our re-implementation confirms that high-confidence fooling persists even in state-of-the-art networks, with the transformer-based ViT-B/16 emerging as the most susceptible: it reaches near-certain misclassifications with substantially fewer queries than convolution-based models. We then introduce SPOOF, a minimalist, consistent, and more efficient black-box attack that generates high-confidence fooling images. Despite its simplicity, SPOOF generates unrecognizable fooling images with minimal pixel modifications and drastically reduced compute. Furthermore, retraining with fooling images as an additional class provides only partial resistance, as SPOOF continues to fool consistently with slightly higher query budgets, highlighting the persistent fragility of modern deep classifiers.
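
As a rough illustration of what a query-only attack built on sparse pixel updates can look like, the sketch below runs a greedy random search that changes a few pixels per query and keeps a change only if the returned confidence does not drop. It is not the authors' SPOOF implementation: the names `sparse_pixel_search` and `toy_confidence`, the blank starting image, the acceptance rule, and the toy scoring function standing in for a real classifier are all assumptions made for this example.

```python
import random


def toy_confidence(image):
    """Hypothetical stand-in for a model's target-class confidence in [0, 1].

    It simply rewards bright pixels in the top-left quadrant; a real attack
    would query an actual classifier here instead.
    """
    h, w = len(image), len(image[0])
    region = [image[y][x] for y in range(h // 2) for x in range(w // 2)]
    return sum(region) / (255 * len(region))


def sparse_pixel_search(score, width, height, queries=2000,
                        pixels_per_step=1, seed=0):
    """Greedy black-box search that mutates a few pixels per query.

    `score` is treated as a black box: only its output value is used.
    """
    rng = random.Random(seed)
    image = [[0] * width for _ in range(height)]  # assumed start: blank image
    best = score(image)
    for _ in range(queries):
        # Propose new intensities for a small random set of pixels,
        # remembering the old values so the step can be undone.
        changes = []
        for _ in range(pixels_per_step):
            y, x = rng.randrange(height), rng.randrange(width)
            changes.append((y, x, image[y][x], rng.randrange(256)))
        for y, x, _, new in changes:
            image[y][x] = new
        candidate = score(image)
        if candidate >= best:
            best = candidate  # keep the sparse update
        else:
            for y, x, old, _ in reversed(changes):
                image[y][x] = old  # revert the sparse update
    return image, best


image, conf = sparse_pixel_search(toy_confidence, width=8, height=8, queries=500)
```

Each loop iteration costs one query, so `queries` plays the role of the query budget mentioned above. Swapping `toy_confidence` for a function that returns a real model's confidence in a chosen class would turn the same loop into a simple black-box fooling baseline.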
