Network and Systems Performance Characterization of MCP-Enabled LLM Agents
Zihao Ding, Mufeng Zhu, Yao Liu
TL;DR
Model Context Protocol (MCP) enables LLMs to orchestrate external tools, but MCP-enabled workflows incur substantial prompt overhead because every request carries rich contextual input. The authors perform a measurement-based analysis that combines OpenRouter usage traces with an instrumented MCP host (Cline) to quantify token usage, monetary cost, and latency across nine LLMs and multiple MCP configurations. They find that MCP traffic is heavily prompt-dominated: its completion-to-prompt token ratio is $2\times$–$30\times$ lower than that of general traffic, driven by system prompts, history, and tool observations. The study proposes optimizations such as parallel tool calls and reliable task-abort mechanisms to reduce token counts and latency, offering practical guidance for building more efficient MCP-enabled workflows.
Abstract
Model Context Protocol (MCP) has recently gained attention within the AI community for providing a standardized way for large language models (LLMs) to interact with external tools and services, significantly enhancing their capabilities. However, MCP-enabled LLM interactions carry extensive contextual information, including system prompts, MCP tool definitions, and context histories, which dramatically inflates token usage. Because LLM providers charge by the token, these expanded contexts can quickly escalate monetary costs and increase the computational load on LLM services. This paper presents a comprehensive measurement-based analysis of MCP-enabled interactions with LLMs, revealing trade-offs between capability, performance, and cost. We explore how different LLMs and MCP configurations affect key performance metrics such as token efficiency, monetary cost, task completion time, and task success rate, and we suggest potential optimizations, including enabling parallel tool calls and implementing robust task-abort mechanisms. These findings provide useful insights for developing more efficient, robust, and cost-effective MCP-enabled workflows.
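To make the token-inflation argument concrete, the following minimal sketch models a task in which every request resends the system prompt plus all earlier tool observations. It is an illustrative toy model, not the paper's measurement method: the function name, the `calls_per_turn` parameter, and all token counts are assumptions chosen for the example, and completion tokens and tool definitions are ignored for simplicity.

```python
def total_prompt_tokens(system_tokens, observation_tokens, calls_per_turn=1):
    """Sum prompt tokens over all requests of one task.

    Toy model (assumption): each request resends the system prompt plus
    every tool observation gathered so far. `calls_per_turn` is how many
    tool calls the model issues in a single turn (1 = strictly sequential).
    """
    total = 0
    history = 0  # tokens of tool observations accumulated so far
    for start in range(0, len(observation_tokens), calls_per_turn):
        # Request that triggers the next batch of tool calls.
        total += system_tokens + history
        # The observations from that batch join the context afterwards.
        history += sum(observation_tokens[start:start + calls_per_turn])
    # Final request: the model reads all observations and answers.
    total += system_tokens + history
    return total


# Hypothetical numbers: a 1000-token system prompt and four tool calls
# that each return a 200-token observation.
observations = [200, 200, 200, 200]
sequential = total_prompt_tokens(1000, observations, calls_per_turn=1)
parallel = total_prompt_tokens(1000, observations, calls_per_turn=4)
print(sequential, parallel)  # 7000 2800
```

Under these assumed numbers, issuing the four tool calls in one turn cuts the resent context from 7000 to 2800 prompt tokens, because the system prompt and history are transmitted in two requests instead of five. Real savings depend on the model, the MCP host, and whether the tool calls are actually independent.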
