Boosting Adversarial Transferability with Low-Cost Optimization via Maximin Expected Flatness
Chunlin Qiu, Ang Li, Yiheng Duan, Shenyi Zhang, Yuanjie Zhang, Lingchen Zhao, Qian Wang
TL;DR
This work addresses the vulnerability of deep networks to transfer-based adversarial attacks by developing a principled theory of loss-surface flatness and its connection to transferability. It unifies fragmented flatness notions into a multi-order framework and proves that zeroth-order average-case flatness governs cross-model transferability, which enables efficient attack design. Building on this theory, the authors introduce Maximin Expected Flatness (MEF), which combines Neighborhood Conditional Sampling and Gradient Balancing Optimization to craft flatness-aware perturbations with reduced computation. Empirically, across 22 models and 24 transfer-based attacks, MEF achieves state-of-the-art transferability at half the computational cost of the prior state-of-the-art PGN attack, and it remains effective against defenses and real-world APIs, underscoring the practical threat of geometry-aware attacks. The work contributes both a theoretical account of flatness in adversarial transfer and a practical, efficient attack method with broad implications for AI security.
Abstract
Transfer-based attacks craft adversarial examples on white-box surrogate models and deploy them directly against black-box target models, offering a model-agnostic and query-free threat scenario. Flatness-enhanced methods have recently emerged to improve transferability by flattening the loss surface around adversarial examples, but their divergent flatness definitions and heuristic attack designs suffer from unexamined optimization limitations and lack a theoretical foundation, which constrains both their effectiveness and their efficiency. This work exposes the severely imbalanced exploitation-exploration dynamics in flatness optimization, establishes the first theoretical foundation for flatness-based transferability, and proposes a principled framework to overcome these optimization pitfalls. Specifically, we systematically unify the fragmented flatness definitions used by existing methods, revealing that their optimization is imbalanced: they either over-explore sensitivity peaks or over-exploit local plateaus. To resolve these issues, we rigorously formalize average-case flatness and transferability gaps, proving that enhancing zeroth-order average-case flatness minimizes cross-model discrepancies. Building on this theory, we design a Maximin Expected Flatness (MEF) attack that enhances zeroth-order average-case flatness while balancing flatness exploration and exploitation. Extensive evaluations across 22 models and 24 current transfer-based attacks demonstrate MEF's superiority: it surpasses the state-of-the-art PGN attack by 4% in attack success rate at half the computational cost and achieves an 8% higher success rate under the same budget. When combined with input augmentation, MEF attains an additional 15% gain against defense-equipped models, establishing new robustness benchmarks. Our code is available at https://github.com/SignedQiu/MEFAttack.
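To make the central quantity concrete, the sketch below gives a toy Monte Carlo estimate of a zeroth-order average-case flatness score. It is an illustration under stated assumptions, not the authors' implementation and not the MEF attack: the abstract does not spell out the formal definition, so the score here is assumed to be the mean absolute change in loss under uniform perturbations inside an L-infinity ball. The names `average_flatness`, `flat_loss`, `sharp_loss`, `radius`, and `n_samples` are all hypothetical.

```python
import numpy as np


def average_flatness(loss_fn, x, radius, n_samples=256, seed=0):
    """Monte Carlo estimate of an assumed zeroth-order average-case flatness score.

    Assumption (not taken from the paper): the score is the mean absolute
    loss change under perturbations drawn uniformly from the L-infinity
    ball of the given radius around x. A smaller value means a flatter
    neighborhood.
    """
    rng = np.random.default_rng(seed)
    base = loss_fn(x)
    # Draw n_samples perturbations, each with the same shape as x.
    deltas = rng.uniform(-radius, radius, size=(n_samples,) + x.shape)
    gaps = [abs(loss_fn(x + d) - base) for d in deltas]
    return float(np.mean(gaps))


# Two hypothetical toy loss surfaces around the same point:
# a shallow bowl and a steep bowl.
def flat_loss(z):
    return 0.1 * float(np.sum(z ** 2))


def sharp_loss(z):
    return 10.0 * float(np.sum(z ** 2))


x = np.zeros(4)
flat_score = average_flatness(flat_loss, x, radius=0.5)
sharp_score = average_flatness(sharp_loss, x, radius=0.5)
print(f"flat: {flat_score:.4f}  sharp: {sharp_score:.4f}")
```

Because both calls use the same seed, they see identical perturbations, so the steep bowl scores exactly 100 times higher than the shallow one. An adversarial example whose neighborhood scores low under such a measure is the kind of point the abstract argues should transfer better across models.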
