Securing the Model Context Protocol (MCP): Risks, Controls, and Governance

Herman Errico, Jiquan Ngiam, Shanita Sojan

TL;DR

The paper addresses security risks introduced by the Model Context Protocol (MCP), which enables dynamic, user-driven agent interactions with external tools. It formalizes a threat model with three adversary types and presents a defense-in-depth framework comprising authentication/authorization, provenance tracking, sandboxing, inline policy enforcement, and centralized governance, implemented via a gateway. The contributions include mapping these controls to NIST AI RMF and ISO/IEC standards, detailing risk-mitigation mechanisms, and outlining open research directions on verifiable registries and privacy-preserving operations. The work aims to balance MCP productivity with security and privacy by providing end-to-end traceability, enforcement, and governance in dynamic agent ecosystems.

Abstract

The Model Context Protocol (MCP) replaces static, developer-controlled API integrations with more dynamic, user-driven agent systems, which also introduces new security risks. As MCP adoption grows across community servers and major platforms, organizations encounter threats that existing AI governance frameworks (such as NIST AI RMF and ISO/IEC 42001) do not yet cover in detail. We focus on three types of adversaries that take advantage of MCP's flexibility: content-injection attackers that embed malicious instructions into otherwise legitimate data; supply-chain attackers who distribute compromised servers; and agents who become unintentional adversaries by over-stepping their role. Based on early incidents and proof-of-concept attacks, we describe how MCP can increase the attack surface through data-driven exfiltration, tool poisoning, and cross-system privilege escalation. In response, we propose a set of practical controls, including per-user authentication with scoped authorization, provenance tracking across agent workflows, containerized sandboxing with input/output checks, inline policy enforcement with DLP and anomaly detection, and centralized governance using private registries or gateway layers. The aim is to help organizations ensure that unvetted code does not run outside a sandbox, tools are not used beyond their intended scope, data exfiltration attempts are detectable, and actions can be audited end-to-end. We close by outlining open research questions around verifiable registries, formal methods for these dynamic systems, and privacy-preserving agent operations.
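
To make the gateway-layer controls named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a hypothetical `Gateway` class that checks a per-user tool scope, runs a toy DLP pattern match on the outgoing payload, and records every decision for audit. The class name, the decision strings, and the regular expressions are all assumptions introduced for illustration; a real deployment would rely on a proper identity provider and a dedicated DLP engine.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical toy DLP patterns (assumption): they match only the *shape*
# of sensitive values and are no substitute for a real DLP engine.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),       # shape of an AWS access key ID
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # shape of a US Social Security number
]


@dataclass
class Gateway:
    """Hypothetical gateway: scoped authorization, inline DLP check, audit log."""

    scopes: Dict[str, Set[str]]  # user -> tool names the user may invoke
    audit_log: List[dict] = field(default_factory=list)

    def call_tool(self, user: str, tool: str, payload: str) -> str:
        # 1. Scoped authorization: the tool must be in this user's scope.
        if tool not in self.scopes.get(user, set()):
            decision = "deny:out_of_scope"
        # 2. Inline policy enforcement: block payloads that match a DLP pattern.
        elif any(p.search(payload) for p in SECRET_PATTERNS):
            decision = "deny:dlp_match"
        else:
            decision = "allow"
        # 3. Auditability: record every decision, allowed or denied.
        self.audit_log.append({"user": user, "tool": tool, "decision": decision})
        return decision


gw = Gateway(scopes={"alice": {"search_docs"}})
print(gw.call_tool("alice", "search_docs", "quarterly report"))
print(gw.call_tool("alice", "send_email", "hello"))
print(gw.call_tool("alice", "search_docs", "key AKIAABCDEFGHIJKLMNOP"))
```

In this sketch the denied calls are logged alongside the allowed one, which reflects the abstract's goal that exfiltration attempts be detectable and actions auditable end-to-end.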

Paper Structure

This paper contains 53 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: MCP Client-Server Architecture
  • Figure 2: MCP Gateway Architecture