Extending ResourceLink: Patterns for Large Dataset Processing in MCP Applications

Scott Frees

TL;DR

This work addresses the mismatch between the context-window limits of large language models and the need for scalable, enterprise-ready reporting over large datasets. It proposes a dual-response pattern that combines lightweight preview data for LLM reasoning with out-of-band ResourceLinks for accessing full datasets, enabling iterative query refinement without overloading the context window. Complementary patterns for multi-tenant isolation, resource lifecycle management, and progressive discovery support production-grade deployments where rendering occurs outside the model. The approach preserves interactive, exploratory workflows while maintaining correctness and performance, and lays the groundwork for standardization through MCP enhancements and RFCs.

Abstract

Large language models translate natural language into database queries, yet context window limitations prevent direct deployment in reporting systems where complete datasets exhaust available tokens. The Model Context Protocol specification defines ResourceLink for referencing external resources, but practical patterns for implementing scalable reporting architectures remain undocumented. This paper presents patterns for building LLM-powered reporting systems that decouple query generation from data retrieval. We introduce a dual-response pattern extending ResourceLink to support both iterative query refinement and out-of-band data access, accompanied by patterns for multi-tenant security and resource lifecycle management. These patterns address fundamental challenges in LLM-driven reporting applications and provide practical guidance for developers building them.

Paper Structure

This paper contains 21 sections and 1 figure.

Figures (1)

  • Figure 1: General architecture for the dual-response pattern, integrating LLM tool calls with out-of-band data retrieval. The MCP server alters or augments the query to enforce multi-tenant isolation and apply data sampling, then returns both preview samples for LLM inference and ResourceLinks for complete dataset access. Clients retrieve the full data through RESTful endpoints, enabling reporting without consuming the LLM's context window.
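
The flow described in the Figure 1 caption can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names (`scope_query`, `run_report_tool`, `fetch_resource`), the in-memory `RESOURCES` registry, the `reports.example.com` URL, the `tenant_id` column, and the preview size are all hypothetical, and SQLite stands in for the reporting database. The link is built as a plain dictionary shaped like an MCP ResourceLink rather than through any particular SDK.

```python
import sqlite3
import uuid

PREVIEW_ROWS = 3  # assumed preview size; a real deployment would tune this

# Hypothetical in-memory registry: resource id -> (tenant id, scoped SQL, params)
RESOURCES = {}


def demo_connection():
    """Build a small in-memory database with rows for two tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, tenant_id TEXT, amount REAL)")
    rows = [(i, "acme", 10.0 * i) for i in range(10)]
    rows += [(100 + i, "globex", 5.0 * i) for i in range(5)]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


def scope_query(sql, tenant_id):
    """Wrap the LLM-generated query so only the caller's tenant rows are visible.

    Assumes the generated query exposes a tenant_id column and has no
    trailing semicolon.
    """
    return f"SELECT * FROM ({sql}) AS q WHERE q.tenant_id = ?", (tenant_id,)


def run_report_tool(conn, sql, tenant_id):
    """Return a small preview for the LLM plus a link to the full result."""
    scoped_sql, params = scope_query(sql, tenant_id)

    # Preview: a handful of rows the model can reason over.
    cur = conn.execute(f"SELECT * FROM ({scoped_sql}) LIMIT {PREVIEW_ROWS}", params)
    columns = [d[0] for d in cur.description]
    preview = [dict(zip(columns, row)) for row in cur.fetchall()]
    total = conn.execute(f"SELECT COUNT(*) FROM ({scoped_sql})", params).fetchone()[0]

    # Register the scoped query so the client can fetch everything out of band.
    resource_id = str(uuid.uuid4())
    RESOURCES[resource_id] = (tenant_id, scoped_sql, params)

    return {
        "preview": {"columns": columns, "rows": preview, "total_rows": total},
        "resource_link": {
            "type": "resource_link",
            "uri": f"https://reports.example.com/resources/{resource_id}",
            "name": "full-result",
            "mimeType": "application/json",
        },
    }


def fetch_resource(conn, resource_id, tenant_id):
    """Stand-in for the RESTful endpoint: return the full dataset to its owner."""
    owner, scoped_sql, params = RESOURCES[resource_id]
    if owner != tenant_id:
        raise PermissionError("resource belongs to another tenant")
    return conn.execute(scoped_sql, params).fetchall()
```

In this sketch the model only ever sees the preview dictionary, while the client application resolves the link separately and renders the full result outside the context window. Storing the already scoped query in the registry means the out-of-band fetch cannot return rows the preview was not allowed to see.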