Watertox: The Art of Simplicity in Universal Attacks A Cross-Model Framework for Robust Adversarial Generation

Zhenghao Gao; Shengjie Xu; Meixi Chen; Fangyao Zhao

TL;DR

Watertox targets cross-model transferability in adversarial attacks by combining a simple two-stage Fast Gradient Sign Method (FGSM) with an ensemble of diverse architectures and voting-based aggregation. The method introduces a total loss $J_{total}$ over multiple surrogate models, a region-aware second stage, and a principled ensemble design to produce robust, transferable perturbations, with theoretical guarantees on quality and transferability. On ImageNet, it substantially degrades the base models (e.g., ConvNeXt-large accuracy falls from 70.6% to 16.0%) and transfers zero-shot to unseen architectures (up to 98.8% accuracy reduction), outperforming NI-FGSM while preserving perceptual quality. The work has practical implications for visual security and CAPTCHA generation, and it suggests extending the approach to other visual tasks and developing a deeper theoretical understanding of architectural complementarity.
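
The ConvNeXt-large figures quoted above can be read either as an absolute drop in percentage points or as a relative reduction. The arithmetic for both is worked below. Reading "accuracy reduction" as the relative quantity is an assumption made here, since this summary does not define the metric.

$$
\Delta_{abs} = 70.6 - 16.0 = 54.6 \text{ percentage points}, \qquad
\Delta_{rel} = \frac{70.6 - 16.0}{70.6} \approx 77.3\%
$$

Under that reading, the zero-shot figure of up to 98.8% would mean an unseen model keeps roughly 1.2% of its clean accuracy.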

Abstract

Contemporary adversarial attack methods face significant limitations in cross-model transferability and practical applicability. We present Watertox, an elegant adversarial attack framework achieving remarkable effectiveness through architectural diversity and precision-controlled perturbations. Our two-stage Fast Gradient Sign Method combines uniform baseline perturbations ($ε_1 = 0.1$) with targeted enhancements ($ε_2 = 0.4$). The framework leverages an ensemble of complementary architectures, from VGG to ConvNeXt, synthesizing diverse perspectives through an innovative voting mechanism. Against state-of-the-art architectures, Watertox reduces model accuracy from 70.6% to 16.0%, with zero-shot attacks achieving up to 98.8% accuracy reduction against unseen architectures. These results establish Watertox as a significant advancement in adversarial methodologies, with promising applications in visual security systems and CAPTCHA generation.
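
The abstract describes the pipeline only at a high level, so the following is a minimal NumPy sketch of one way a two-stage, sign-based perturbation with ensemble voting could be put together. It is not the authors' implementation. The function names (`vote_on_signs`, `two_stage_perturb`), the `top_fraction` parameter, the use of mean gradient magnitude to select the "targeted" region, the additive combination of the two stages, and the `[0, 1]` pixel range are all assumptions introduced for illustration. Only the step sizes $ε_1 = 0.1$ and $ε_2 = 0.4$ come from the text. The per-model loss gradients are passed in as plain arrays; in practice they would come from backpropagation through each surrogate model.

```python
import numpy as np


def vote_on_signs(gradients):
    """Majority vote over per-model gradient signs (hypothetical aggregation).

    Each model casts +1 or -1 per pixel; the result is the sign of the tally,
    so an exact tie yields 0 (no perturbation at that pixel).
    """
    signs = np.sign(np.stack(gradients))  # shape: (n_models, *image_shape)
    return np.sign(signs.sum(axis=0))


def two_stage_perturb(image, gradients, eps1=0.1, eps2=0.4, top_fraction=0.1):
    """Apply a uniform step, then an extra step on a high-saliency region.

    image        : array with values in [0, 1] (assumed pixel range)
    gradients    : list of loss gradients w.r.t. the image, one per surrogate
    eps1, eps2   : stage-1 and stage-2 step sizes (values from the abstract)
    top_fraction : share of pixels given the stage-2 step (assumption)
    """
    direction = vote_on_signs(gradients)

    # Stage 1: uniform baseline perturbation over the whole image.
    stage1 = eps1 * direction

    # Stage 2: targeted enhancement where the ensemble's mean gradient
    # magnitude is largest (an assumed stand-in for "region-aware").
    saliency = np.mean(np.abs(np.stack(gradients)), axis=0)
    threshold = np.quantile(saliency, 1.0 - top_fraction)
    mask = saliency >= threshold
    stage2 = eps2 * direction * mask

    # Keep the result a valid image.
    return np.clip(image + stage1 + stage2, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((3, 8, 8))
    # Stand-in gradients for three surrogate models.
    grads = [rng.standard_normal((3, 8, 8)) for _ in range(3)]
    adversarial = two_stage_perturb(image, grads)
    print("max per-pixel change:", np.abs(adversarial - image).max())
```

With these assumed choices, no pixel moves by more than $ε_1 + ε_2 = 0.5$, and the larger step is confined to the selected region. This is one possible reading of how a small uniform budget and a larger targeted budget could coexist.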
