Sovereign Context Protocol: An Open Attribution Layer for Human-Generated Content in the Age of Large Language Models
Praneel Panchigar, Torlach Rush, Matthew Canabarro
Abstract
Large Language Models (LLMs) consume vast quantities of human-generated content for both training and real-time inference, yet the creators of that content remain largely invisible in the value chain. Existing approaches to data attribution operate either at the model-internals level, tracing influence through gradient signals, or at the legal-policy level, through transparency mandates and copyright litigation. Neither gives content creators a runtime mechanism for learning when, by whom, and how their work is consumed. We introduce the Sovereign Context Protocol (SCP), an open-source protocol specification and reference architecture that functions as an attribution-aware data access layer between LLMs and human-generated content. Inspired by Anthropic's Model Context Protocol (MCP), which standardizes how LLMs connect to tools, SCP standardizes how LLMs connect to creator-owned data, with every access event logged, licensed, and attributable. SCP defines six core methods (creator profiles, semantic search, content retrieval, trust/value scoring, authenticity verification, and access auditing) exposed over both REST and MCP-compatible interfaces. We formalize the protocol's message envelope, present a threat model with five adversary classes, propose a log-proportional revenue attribution model, and report preliminary latency benchmarks from a reference implementation built on FastAPI, ChromaDB, and NetworkX. We situate SCP within the emerging regulatory landscape, including the EU AI Act's Article 53 training data transparency requirements and ongoing U.S. copyright litigation, and argue that closing the attribution gap requires a protocol-level intervention that makes attribution a default property of data access.
