AICAttack: Adversarial Image Captioning Attack with Attention-Based Optimization

Jiyao Li; Mingze Ni; Yifei Dong; Tianqing Zhu; Wei Liu

TL;DR

AICAttack studies how vulnerable image captioning models are to adversarial examples in a black-box setting. It combines an attention-guided pixel selection strategy with differential evolution to perturb a small set of pixels: attention maps identify high-impact regions, and the RGB perturbations are optimized without gradients. On COCO and Flickr8k, against the SAT and BLIP victim models, the method attacks more effectively than existing baselines. Ablations, transferability tests on VQA models, and adversarial retraining analyses examine the approach's efficiency and robustness and its potential to inform defenses. The work highlights practical threats to captioning systems in real-world applications and suggests directions for defense research and for extension to related multimodal tasks.

Abstract

Recent advances in deep learning research have shown remarkable achievements across many tasks in computer vision (CV) and natural language processing (NLP). At the intersection of CV and NLP is the problem of image captioning, where the related models' robustness against adversarial attacks has not been well studied. This paper presents a novel adversarial attack strategy, AICAttack (Attention-based Image Captioning Attack), designed to attack image captioning models through subtle perturbations on images. Operating within a black-box attack scenario, our algorithm requires no access to the target model's architecture, parameters, or gradient information. We introduce an attention-based candidate selection mechanism that identifies the optimal pixels to attack, followed by a customised differential evolution method to optimise the perturbations of pixels' RGB values. We demonstrate AICAttack's effectiveness through extensive experiments on benchmark datasets against multiple victim models. The experimental results demonstrate that our method outperforms current leading-edge techniques by achieving consistently higher attack success rates.
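The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain top-k choice over a precomputed attention map and a standard DE/rand/1/bin loop instead of the paper's customised variant. The names `select_candidate_pixels`, `apply_perturbation`, `differential_evolution_attack`, and `score_fn` are hypothetical. `score_fn` stands in for a black-box query that captions the perturbed image and returns a similarity score to the original caption (lower means a stronger attack).

```python
import numpy as np

def select_candidate_pixels(attention, k):
    """Return (row, col) coordinates of the k highest-attention pixels.
    Assumption: a simple top-k rule over a 2-D attention map."""
    flat = np.argsort(attention, axis=None)[::-1][:k]
    return np.stack(np.unravel_index(flat, attention.shape), axis=1)

def apply_perturbation(image, coords, rgb):
    """Overwrite the selected pixels with candidate RGB values."""
    adv = image.copy()
    adv[coords[:, 0], coords[:, 1]] = np.clip(rgb, 0, 255).astype(image.dtype)
    return adv

def differential_evolution_attack(image, attention, score_fn, k=5,
                                  pop_size=10, generations=20,
                                  f=0.5, cr=0.7, seed=0):
    """Gradient-free search over RGB values of k attention-selected pixels.
    score_fn(image) -> float is treated as a black box; lower is better."""
    rng = np.random.default_rng(seed)
    coords = select_candidate_pixels(attention, k)
    # Each individual is a (k, 3) array of RGB values for the chosen pixels.
    pop = rng.uniform(0, 255, size=(pop_size, k, 3))
    fitness = np.array([score_fn(apply_perturbation(image, coords, p))
                        for p in pop])
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = np.clip(a + f * (b - c), 0, 255)   # DE/rand/1 mutation
            mask = rng.random((k, 3)) < cr              # binomial crossover
            trial = np.where(mask, mutant, pop[i])
            trial_fit = score_fn(apply_perturbation(image, coords, trial))
            if trial_fit <= fitness[i]:                 # greedy selection
                pop[i], fitness[i] = trial, trial_fit
    best = int(np.argmin(fitness))
    return apply_perturbation(image, coords, pop[best]), float(fitness[best])
```

In a real attack, `score_fn` would query the victim captioning model and compare the new caption with the original one using a metric such as BLEU. Only model outputs are used, which matches the black-box setting the abstract describes.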

Paper Structure

This paper contains 26 sections, 7 equations, 11 figures, 6 tables, 1 algorithm.

Introduction
Related Work
Our Proposed Attack Method
Problem Setting
Attention for Candidate Selection
Differential Evolution Optimization
Experiment and Analysis
Datasets
Victim Models and Baselines
Metrics
BLEU Score
ROUGE Score
BR-measure
Experiment Analysis
Ablation and Hyperparameters Studies
...and 11 more sections

Figures (11)

Figure 1: The Workflow of our AICAttack Algorithm for Image Captioning Attacks. The process begins by feeding the input image into the attention block, which generates attention scores. These scores are then used for attention pixel selection. During the attack optimization phase, the Differential Evolution (DE) algorithm searches for the most effective adversarial sample.
Figure 2: Attention Mechanism Illustration in a Small Cat Image Example. Highlighted regions denote attention concentration guiding the encoder-decoder network during word generation processes.
Figure 3: Examples of "Sentence-based Attack" (our proposed method) and "Word-based Attack" approach for computing attention scores. The highlighted areas represent the attention region for pixel selections.
Figure 4: Visual examples illustrating different attack strategies, accompanied by captions.
Figure 5: Drops in BLEU2 scores before and after five attack scenarios across different pixel counts.
...and 6 more figures
