Vision-fused Attack: Advancing Aggressive and Stealthy Adversarial Text against Neural Machine Translation
Yanni Xue, Haojie Hao, Jiakai Wang, Qiang Sheng, Renshuai Tao, Yu Liang, Pu Feng, Xianglong Liu
TL;DR
This work addresses the vulnerability of neural machine translation (NMT) to adversarial text by introducing Vision-fused Attack (VFA), a multimodal framework that uses visual perception to produce perturbations that are more aggressive yet harder for humans to notice. It has two core components: Vision-merged Solution Space Enhancement (VSSE), which broadens the search space via reverse translation and text-image transformation, and Perception-retained Adversarial Text Selection (PATS), which enforces perceptual constraints through improved word replacement and a global LPIPS-based criterion. Experiments show that VFA substantially improves attack effectiveness (ASR and BLEU degradation) and perceptual stealth (SSIM) across multiple NMT models, transfers to large language models, and performs favorably in a human study. The findings underscore the importance of multimodal cues in adversarial text generation and point to the need for defenses that account for vision-language coupling in text processing systems.
Abstract
While neural machine translation (NMT) models are widely used in daily life, they remain vulnerable to adversarial attacks. Although harmful, these attacks also help in interpreting and improving NMT models, and have therefore drawn increasing research attention. However, existing adversarial attacks fall short in both attacking ability and human imperceptibility because they operate solely within the language modality. This paper proposes a novel vision-fused attack (VFA) framework to generate more powerful adversarial text, i.e., text that is both more aggressive and stealthier. For attacking ability, we design a vision-merged solution space enhancement strategy that enlarges the limited semantic solution space, enabling the search for adversarial candidates with higher attacking ability. For human imperceptibility, we propose a perception-retained adversarial text selection strategy that aligns with the human text-reading mechanism, so that the finally selected adversarial text is more deceptive. Extensive experiments on various models, including large language models (LLMs) such as LLaMA and GPT-3.5, show that VFA outperforms the compared methods by large margins (up to 81%/14% improvements on ASR/SSIM).
