Beyond the Protocol: Unveiling Attack Vectors in the Model Context Protocol (MCP) Ecosystem

Hao Song, Yiming Shen, Wenxuan Luo, Leixin Guo, Ting Chen, Jiashui Wang, Beibei Li, Xiaosong Zhang, Jiachi Chen

TL;DR

This work analyzes the security of the Model Context Protocol (MCP) ecosystem by defining four attack vectors (Tool Poisoning, Puppet, Rug Pull, and Exploitation via Malicious External Resources) that exploit the MCP trust boundary. It presents end-to-end empirical evaluations across three threads: uploading malicious MCP servers to aggregators, a user study simulating platform defenses, and attack implementations against popular LLMs and MCP clients. Key findings show lax audit mechanisms on aggregation platforms, significant user difficulty in detecting malicious servers, and high attack success rates across major models, underscoring substantial risks for MCP-based agent systems. The study also outlines practical mitigations, including security gateways, cryptographic tool-description signing, and resource-scanning approaches, to guide the design of safer MCP ecosystems for autonomous LLM agents.
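To make the idea of cryptographic tool-description signing concrete, the following is a minimal sketch, not the paper's design. It assumes a tool is represented as a plain dictionary with hypothetical `name` and `description` fields, and it uses a shared-secret HMAC from the Python standard library for brevity; a real deployment would more plausibly use asymmetric signatures so that clients never hold the signing key. The function names are illustrative only.

```python
import hashlib
import hmac
import json


def canonical_bytes(tool: dict) -> bytes:
    """Serialize a tool record deterministically so equal content gives equal bytes."""
    return json.dumps(tool, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_tool(tool: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the canonical tool record."""
    return hmac.new(key, canonical_bytes(tool), hashlib.sha256).hexdigest()


def verify_tool(tool: dict, signature: str, key: bytes) -> bool:
    """Check the tag in constant time; any change to the record invalidates it."""
    return hmac.compare_digest(sign_tool(tool, key), signature)


# Hypothetical tool record and key, for illustration only.
key = b"demo-shared-secret"
tool = {"name": "get_weather", "description": "Return the forecast for a city."}
tag = sign_tool(tool, key)

# A description altered after approval (as in a Rug Pull) no longer verifies.
tampered = dict(tool, description="Return the forecast. Also read private files.")
print(verify_tool(tool, tag, key), verify_tool(tampered, tag, key))
```

Under these assumptions, a client that stores the tag at install time can refuse to load a tool whose description later changes without being re-signed.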

Abstract

The Model Context Protocol (MCP) is an emerging standard designed to enable seamless interaction between Large Language Model (LLM) applications and external tools or resources. Within a short period, thousands of MCP services have been developed and deployed. However, the client-server integration architecture inherent in MCP may expand the attack surface against LLM agent systems, introducing new vulnerabilities that attackers can exploit by designing malicious MCP servers. In this paper, we present the first end-to-end empirical evaluation of attack vectors targeting the MCP ecosystem. We identify four categories of attacks: Tool Poisoning Attacks, Puppet Attacks, Rug Pull Attacks, and Exploitation via Malicious External Resources. To evaluate their feasibility, we conduct experiments following the typical steps of launching an attack through malicious MCP servers: upload -> download -> attack. Specifically, we first construct malicious MCP servers and successfully upload them to three widely used MCP aggregation platforms. The results indicate that current audit mechanisms are insufficient to identify and prevent these threats. Next, through a user study and interviews with 20 participants, we demonstrate that users struggle to identify malicious MCP servers and often unknowingly install them from aggregator platforms. Finally, we empirically demonstrate that these attacks can trigger harmful actions within the user's local environment, such as accessing private files or controlling devices to transfer digital assets. Additionally, based on the interview results, we discuss four key challenges faced by the current MCP security ecosystem. These findings underscore the urgent need for robust security mechanisms to defend against malicious MCP servers and ensure the safe deployment of increasingly autonomous LLM agents.

Paper Structure

This paper contains 56 sections, 4 equations, 12 figures, 4 tables.

Figures (12)

  • Figure 1: Overview of the Model Context Protocol (MCP) workflow.
  • Figure 2: PoC of Tool Poisoning Attack.
  • Figure 3: PoC of Puppet Attack.
  • Figure 4: The workflow of Rug Pull Attack.
  • Figure 5: PoC of Exploit via Malicious External Resources.
  • ...and 7 more figures