Model Context Protocol (MCP) at First Glance: Studying the Security and Maintainability of MCP Servers
Mohammed Mehedi Hasan, Hao Li, Emad Fallahzadeh, Gopi Krishnan Rajbahadur, Bram Adams, Ahmed E. Hassan
TL;DR
MCP was introduced to reduce the fragmentation that tool calling created in FM-based AI systems, and this paper studies MCP servers at scale. It applies a hybrid methodology, combining general static analysis (SonarQube) with an MCP-specific scanner (mcp-scan), to 1,899 MCP servers (official, community, mined) to assess their health, security, and maintainability, using LLM-Jury to distill patterns and drawing on baselines from prior work. The key findings are that MCP servers generally sustain healthy development trajectories but exhibit security vulnerabilities (7.2% with general vulnerabilities, 5.5% with MCP-specific tool poisoning) and maintainability problems (66% with code smells, 14.4% with bugs), indicating a need for MCP-focused tooling and governance. The results carry actionable implications for researchers, practitioners, and MCP registries: improve security auditing, tooling adoption, and ecosystem governance as MCP adoption accelerates.
Abstract
Although Foundation Models (FMs), such as GPT-4, are increasingly used in domains like finance and software engineering, reliance on textual interfaces limits these models' real-world interaction. To address this, FM providers introduced tool calling, triggering a proliferation of frameworks with distinct tool interfaces. In late 2024, Anthropic introduced the Model Context Protocol (MCP) to standardize this tool ecosystem, which has become the de facto standard with over eight million weekly SDK downloads. Despite its adoption, MCP's AI-driven, non-deterministic control flow introduces new risks to sustainability, security, and maintainability, warranting closer examination. To this end, we present the first large-scale empirical study of MCP servers. Using state-of-the-art health metrics and a hybrid analysis pipeline that combines a general-purpose static analysis tool with an MCP-specific scanner, we evaluate 1,899 open-source MCP servers to assess their health, security, and maintainability. Although MCP servers demonstrate strong health metrics, we identify eight distinct vulnerabilities, only three of which overlap with traditional software vulnerabilities. Additionally, 7.2% of servers contain general vulnerabilities and 5.5% exhibit MCP-specific tool poisoning. Regarding maintainability, 66% of servers exhibit code smells, and 14.4% contain nine bug patterns that overlap with those of traditional open-source software projects. These findings highlight the need for MCP-specific vulnerability detection techniques while reaffirming the value of traditional analysis and refactoring practices.
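As a rough illustration of the hybrid pipeline idea described above, the following minimal sketch merges per-server findings from two sources and reports the share of servers affected by each category. It is not the authors' implementation: the names `ServerReport`, `merge_reports`, and `share`, the finding labels, and the four sample servers are all hypothetical, and the sketch does not call SonarQube or mcp-scan.

```python
from dataclasses import dataclass, field


@dataclass
class ServerReport:
    """Findings for one MCP server, merged from two hypothetical scanners."""
    name: str
    general_findings: list = field(default_factory=list)  # from a general-purpose analyzer
    mcp_findings: list = field(default_factory=list)       # from an MCP-specific scanner


def merge_reports(general, mcp):
    """Combine two {server_name: [finding, ...]} mappings into per-server reports."""
    names = sorted(set(general) | set(mcp))
    return [ServerReport(n, general.get(n, []), mcp.get(n, [])) for n in names]


def share(reports, predicate):
    """Percentage of servers for which predicate(report) is true."""
    if not reports:
        return 0.0
    return 100.0 * sum(1 for r in reports if predicate(r)) / len(reports)


# Hypothetical sample data: four servers, one finding from each scanner.
general = {
    "server-a": ["hardcoded-credential"],
    "server-b": [],
    "server-c": [],
    "server-d": [],
}
mcp = {"server-b": ["tool-poisoning"]}

reports = merge_reports(general, mcp)
general_share = share(reports, lambda r: bool(r.general_findings))
mcp_share = share(reports, lambda r: bool(r.mcp_findings))
print(f"general: {general_share:.1f}%, MCP-specific: {mcp_share:.1f}%")
```

Keeping the two finding lists separate in each report makes it easy to see which issues only the MCP-specific scanner surfaces, which is the distinction the abstract draws between general vulnerabilities and tool poisoning.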
