Chameleon: Adaptive Adversarial Agents for Scaling-Based Visual Prompt Injection in Multimodal AI Systems

M Zeeshan, Saud Satti

TL;DR

Chameleon exposes a practical security gap in Vision-Language Models (VLMs): adaptive, feedback-driven perturbations can exploit image downscaling to inject hidden prompts. The framework runs an agent-like closed loop with two optimization strategies (hill climbing and a genetic algorithm) and reports high attack success (roughly 87–91%) with visually imperceptible perturbations on Gemini 2.5 Flash, degrading downstream decisions in multimodal agentic pipelines. The work argues for defenses such as multi-scale consistency checks and scale-aware training, and underscores how vulnerable production VLM systems are to adaptive, scaling-based attacks.

Abstract

Multimodal Artificial Intelligence (AI) systems, particularly Vision-Language Models (VLMs), have become integral to critical applications ranging from autonomous decision-making to automated document processing. As these systems scale, they rely heavily on preprocessing pipelines to handle diverse inputs efficiently. However, this dependency on standard preprocessing operations, specifically image downscaling, creates a significant yet often overlooked security vulnerability. While intended for computational optimization, scaling algorithms can be exploited to conceal malicious visual prompts that are invisible to human observers but become active semantic instructions once processed by the model. Current adversarial strategies remain largely static, failing to account for the dynamic nature of modern agentic workflows. To address this gap, we propose Chameleon, a novel, adaptive adversarial framework designed to expose and exploit scaling vulnerabilities in production VLMs. Unlike traditional static attacks, Chameleon employs an iterative, agent-based optimization mechanism that dynamically refines image perturbations based on the target model's real-time feedback. This allows the framework to craft highly robust adversarial examples that survive standard downscaling operations and hijack downstream execution. We evaluate Chameleon against the Gemini 2.5 Flash model. Our experiments demonstrate that Chameleon achieves an Attack Success Rate (ASR) of 84.5% across varying scaling factors, significantly outperforming static baseline attacks, which average only 32.1%. Furthermore, we show that these attacks effectively compromise agentic pipelines, reducing decision-making accuracy by over 45% in multi-step tasks. Finally, we discuss the implications of these vulnerabilities and propose multi-scale consistency checks as a necessary defense mechanism.
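
The abstract names multi-scale consistency checks as a defense but gives no implementation details here. The following is only a minimal sketch of one way such a check could look, not the authors' method. It assumes a grayscale image held as a NumPy array, and it compares two downscalers: nearest-neighbor sampling, which reads a single pixel per block, and area averaging, which reads every pixel. The function names, the scale factors, and the threshold of 20 are all hypothetical choices made for illustration.

```python
import numpy as np


def downscale_nearest(img: np.ndarray, factor: int) -> np.ndarray:
    """Keep only the top-left pixel of each factor x factor block."""
    h2, w2 = img.shape[0] // factor, img.shape[1] // factor
    return img[::factor, ::factor][:h2, :w2]


def downscale_area(img: np.ndarray, factor: int) -> np.ndarray:
    """Average every pixel inside each factor x factor block."""
    h2, w2 = img.shape[0] // factor, img.shape[1] // factor
    blocks = img[: h2 * factor, : w2 * factor].reshape(h2, factor, w2, factor)
    return blocks.mean(axis=(1, 3))


def multiscale_inconsistency(img: np.ndarray, factors=(2, 4)) -> float:
    """Largest mean absolute disagreement between the two downscalers.

    A natural image should look similar under both methods; content hidden
    in the pixels that nearest-neighbor sampling reads should not.
    """
    img = img.astype(np.float64)
    scores = [
        np.abs(downscale_nearest(img, f) - downscale_area(img, f)).mean()
        for f in factors
    ]
    return float(max(scores))


# Illustrative data (an assumption, not from the paper): a smooth gradient
# stands in for a benign image.
clean = np.add.outer(np.arange(64), np.arange(64)) * (255.0 / 126.0)

# Toy "hidden content": overwrite only the pixels that a 4x nearest-neighbor
# downscale will sample. Most pixels of the full-size image are unchanged.
tampered = clean.copy()
tampered[::4, ::4] = 255.0

THRESHOLD = 20.0  # hypothetical cutoff; a real system would calibrate this

clean_score = multiscale_inconsistency(clean)
tampered_score = multiscale_inconsistency(tampered)

print(f"clean:    {clean_score:.2f} flagged={clean_score > THRESHOLD}")
print(f"tampered: {tampered_score:.2f} flagged={tampered_score > THRESHOLD}")
```

In this toy setup the smooth image scores low because both downscalers roughly agree, while the tampered image scores high because the altered pixels dominate the nearest-neighbor output but are diluted by averaging. A check like this only covers the resampling methods it compares, so it would be one layer of a defense rather than a complete one.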

Paper Structure

This paper contains 13 sections, 6 equations, 2 figures, 5 tables, 1 algorithm.

Figures (2)

  • Figure 1: Flow chart of the Chameleon adaptive adversarial attack framework. The system uses a feedback loop to optimize perturbations.
  • Figure 2: The Chameleon agent architecture integrated into the specific Multi-Agent System (MAS) pipeline.