An Empirical Study on the Security Vulnerabilities of GPTs

Tong Wu, Weibin Wu, Zibin Zheng

TL;DR

This work investigates security vulnerabilities in GPT-based agents by formalizing their system model and attack surfaces, then conducting large-scale leakage and tool-misuse experiments across top GPTs. It finds that expert prompts and component configurations are at high risk of leakage, and that GPTs are highly susceptible to indirect prompt injection and knowledge poisoning when tools are involved. The authors propose lightweight, prompt-level defenses (protective tokens and provenance checks) that meaningfully reduce attack success rates, and they demonstrate the effectiveness of these defenses via reverse-engineering tests. The study provides practical guidance for securing the GPTs ecosystem and lays a foundation for more robust, scalable defenses in real-world deployments.

Abstract

Equipped with various tools and knowledge, GPTs, a kind of customized AI agent based on OpenAI's large language models, have shown great potential in many fields, such as writing, research, and programming. Today, the number of GPTs has reached three million, and the range of expert domains they cover is increasingly diverse. However, because these LLM agent applications share a consistent framework, systemic security vulnerabilities may exist and remain underexplored. To fill this gap, we present an empirical study on the security vulnerabilities of GPTs. Building upon prior research on LLM security, we first adopt a platform-user perspective to conduct a comprehensive attack surface analysis across different system components. Then, based on this analysis, we design a systematic and multidimensional attack suite with the explicit objectives of information leakage and tool misuse, thereby concretely demonstrating the security vulnerabilities that various components of GPT-based systems face. Finally, we propose corresponding defense mechanisms to address these vulnerabilities. By raising awareness of these vulnerabilities and offering critical insights into their implications, this study seeks to facilitate the secure and responsible application of GPTs while contributing to the development of robust defense mechanisms that protect users and systems against malicious attacks.

Paper Structure

This paper contains 37 sections, 7 figures, 5 tables.

Figures (7)

  • Figure 1: The framework of GPTs. The LLM (currently GPT-4o, GPT-5 or GPT-5 Thinking) plays a central role in planning, with real-time short-term memory based on the Chat History and the Expert Prompt. Meanwhile, GPTs can invoke Tools (such as creating images, searching websites, and executing Python code) or query Knowledge as needed. With this unified framework, GPTs can function as an AI agent: thinking, retrieving knowledge, taking action, and executing various operations.
  • Figure 2: The configuration page for creating GPTs.
  • Figure 3: The attack surface and vantage points of GPTs. $A_0$: text-only attacker; $A_1$: attacker that can access external content consumed by the GPTs; $A_2$: attacker that can modify the knowledge of GPTs.
  • Figure 4: Distribution of GPTs with different components.
  • Figure 5: An example of a component attack on the knowledge file Part2.md of Grimoire, and validation of the path of files uploaded through the user interface.
  • ...and 2 more figures