QueryIPI: Query-agnostic Indirect Prompt Injection on Coding Agents

Yuchong Xie; Zesen Liu; Mingyu Luo; Zhixiang Zhang; Kaikai Zhang; Zongjie Li; Ping Chen; Shuai Wang; Dongdong She

TL;DR

This work formalizes a query-agnostic indirect prompt injection (IPI) threat on coding agents in IDEs, showing that leaking an agent's internal prompt enables a constrained white-box optimization to craft malicious tool descriptions. The authors introduce QueryIPI, an automated method that iteratively mutates tool descriptions using a Mutation LLM and evaluates them with a Judge LLM to maximize a cumulative attack score across training queries. Experiments on five simulated agents and real-world transfers demonstrate high attack success rates, robustness to partial prompt knowledge, cross-LLM transferability, and stealth against standard detection metrics. The findings reveal a practical security risk posed by exposed internal prompts and underscore the need for defenses that harden tool-description channels and guardrails in LLM-based coding agents.
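The mutate-then-judge loop described above can be pictured as a generic search over candidate strings. The sketch below is illustrative only and is not the authors' implementation. The names `refine`, `cumulative_score`, `toy_mutate`, and `toy_judge` are hypothetical, the two LLM roles are replaced by harmless deterministic stand-ins, the conditioning on the leaked internal prompt is omitted, and the greedy acceptance rule is an assumption, since the summary does not say how candidates are selected.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical type aliases for the two roles named in the summary.
Mutator = Callable[[str, int], str]   # stands in for the Mutation LLM
Judge = Callable[[str, str], float]   # stands in for the Judge LLM


def cumulative_score(candidate: str, queries: Sequence[str], judge: Judge) -> float:
    """Sum the judge's per-query scores over all training queries."""
    return sum(judge(candidate, q) for q in queries)


def refine(seed: str, queries: Sequence[str], mutate: Mutator,
           judge: Judge, rounds: int = 5) -> Tuple[str, float]:
    """Greedy search: keep a mutated candidate only if its cumulative
    score strictly improves (an assumed acceptance rule)."""
    best, best_score = seed, cumulative_score(seed, queries, judge)
    for step in range(rounds):
        candidate = mutate(best, step)
        score = cumulative_score(candidate, queries, judge)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Toy stand-ins so the loop runs without any model calls.
WORDS = ["alpha", "beta", "gamma"]


def toy_mutate(text: str, step: int) -> str:
    """Append one word from a fixed list, chosen by the step index."""
    return text + " " + WORDS[step % len(WORDS)]


def toy_judge(candidate: str, query: str) -> float:
    """Score 1.0 if the query word appears in the candidate, else 0.0."""
    return 1.0 if query in candidate.split() else 0.0


if __name__ == "__main__":
    print(refine("seed", ["alpha", "gamma", "delta"], toy_mutate, toy_judge, rounds=3))
```

Because the objective sums over all training queries, a candidate is kept only when it does better across the whole set rather than on a single query, which is the sense in which the search targets query-agnostic behavior.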

Abstract

Modern coding agents integrated into IDEs combine powerful tools and system-level actions, exposing a high-stakes attack surface. Existing Indirect Prompt Injection (IPI) studies focus mainly on query-specific behaviors, leading to unstable attacks with lower success rates. We identify a more severe, query-agnostic threat that remains effective across diverse user inputs. Crafting an injection that holds across arbitrary queries is challenging, but the challenge can be overcome by exploiting a common vulnerability: leakage of the agent's internal prompt, which turns the attack into a constrained white-box optimization problem. We present QueryIPI, the first query-agnostic IPI method for coding agents. QueryIPI refines malicious tool descriptions through an iterative, prompt-based process informed by the leaked internal prompt. Experiments on five simulated agents show that QueryIPI achieves a success rate of up to 87%, outperforming baselines. The generated malicious descriptions also transfer to real-world systems, highlighting a practical security risk to modern LLM-based coding agents.
