MF-CLIP: Leveraging CLIP as Surrogate Models for No-box Adversarial Attacks
Jiaming Zhang, Lingyu Qiu, Qi Yi, Yige Li, Jitao Sang, Changsheng Xu, Dit-Yan Yeung
TL;DR
This work tackles no-box adversarial attacks by using Vision-Language Models as surrogates. It finds that vanilla CLIP has strong representations but limited discriminative margins for domain-specific attacks. It introduces MF-CLIP, a two-stage framework: margin-based fine-tuning widens inter-class margins, and a generator-based adversarial-perturbation module then produces transferable adversarial examples. The authors support the approach with theoretical margin analysis and extensive experiments across seven datasets, multiple target architectures, and large-scale ImageNet/ViT scenarios, reporting average improvements over state-of-the-art baselines of 15.23% on standard models and 9.52% on adversarially trained models. The results underscore the critical role of the surrogate model's discriminative power in no-box transferability and suggest that margin-focused fine-tuning of foundation models can significantly strengthen attacks in realistic settings, with implications for defense and transfer learning in multimodal models.
Abstract
The vulnerability of Deep Neural Networks (DNNs) to adversarial attacks poses a significant challenge to their deployment in safety-critical applications. While extensive research has addressed various attack scenarios, the no-box setting, in which adversaries have no prior knowledge of the target model and no access to its training data, remains relatively underexplored despite its practical relevance. This work presents a systematic investigation into leveraging large-scale Vision-Language Models (VLMs), particularly CLIP, as surrogate models for executing no-box attacks. Our theoretical and empirical analyses reveal a key limitation: applied directly as a surrogate, vanilla CLIP lacks the discriminative capability needed for effective no-box attacks. To address this limitation, we propose MF-CLIP, a novel framework that enhances CLIP's effectiveness as a surrogate model through margin-aware feature space optimization. Comprehensive evaluations across diverse architectures and datasets demonstrate that MF-CLIP substantially advances the state of the art in no-box attacks, surpassing existing baselines by 15.23% on standard models and by 9.52% on adversarially trained models. Our code will be made publicly available to facilitate reproducibility and future research in this direction.
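
As a rough illustration of what margin-aware optimization over a CLIP-style feature space can look like, the sketch below computes a cross-entropy loss over cosine similarities between one image embedding and a set of class text embeddings, subtracting an additive margin from the true-class similarity. This is an assumption made for illustration (a CosFace-style additive margin), not the MF-CLIP objective, which the abstract does not specify. The names `margin_cross_entropy`, `margin`, and `scale` are hypothetical, and the embeddings are toy vectors rather than real CLIP outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_cross_entropy(image_emb, class_embs, label, margin=0.2, scale=10.0):
    """Cross-entropy over scaled cosine similarities with an additive margin.

    Hypothetical sketch: the margin is subtracted from the true-class
    similarity only, so the true class must beat the others by at least
    `margin` before the loss becomes small.
    """
    sims = [cosine(image_emb, c) for c in class_embs]
    logits = [
        scale * (s - margin if k == label else s)
        for k, s in enumerate(sims)
    ]
    # Numerically stable log-sum-exp.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[label]


# Toy example: a 2-D "image" embedding aligned with class 0.
image = [1.0, 0.0]
classes = [[1.0, 0.0], [0.0, 1.0]]
loss_plain = margin_cross_entropy(image, classes, label=0, margin=0.0)
loss_margin = margin_cross_entropy(image, classes, label=0, margin=0.2)
```

With the margin enabled, the same correctly classified sample incurs a larger loss than without it, so minimizing the loss pushes the true-class similarity further above the others. That is the general sense in which a margin term widens inter-class separation.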
