Interpretable Context Methodology: Folder Structure as Agentic Architecture

Jake Van Clief, David McDermott

Abstract

Current approaches to AI agent orchestration typically involve building multi-agent frameworks that manage context passing, memory, error handling, and step coordination through code. These frameworks work well for complex, concurrent systems. But for sequential workflows where a human reviews output at each step, they introduce engineering overhead that the problem does not require. This paper presents Model Workspace Protocol (MWP), a method that replaces framework-level orchestration with filesystem structure. Numbered folders represent stages. Plain markdown files carry the prompts and context that tell a single AI agent what role to play at each step. Local scripts handle the mechanical work that does not need AI at all. The result is a system where one agent, reading the right files at the right moment, does the work that would otherwise require a multi-agent framework. This approach applies ideas from Unix pipeline design, modular decomposition, multi-pass compilation, and literate programming to the specific problem of structuring context for AI agents. The protocol is open source under the MIT license.
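
The idea of numbered folders as stages, with markdown files as the per-stage context, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration rather than part of the protocol as published: the `<number>-<name>` folder naming, the file names `role.md` and `brief.md`, and the helper names `discover_stages` and `load_stage_context` are all hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical naming convention: "<number>-<name>" or "<number>_<name>".
STAGE_PATTERN = re.compile(r"^(\d+)[-_](.+)$")


def discover_stages(workspace: Path) -> list[Path]:
    """Return stage folders ordered by numeric prefix (so 10 follows 2)."""
    numbered = []
    for entry in workspace.iterdir():
        match = STAGE_PATTERN.match(entry.name)
        if entry.is_dir() and match:
            numbered.append((int(match.group(1)), entry))
    return [path for _, path in sorted(numbered, key=lambda pair: pair[0])]


def load_stage_context(stage: Path) -> str:
    """Join the markdown files of one stage, in filename order."""
    sections = []
    for md_file in sorted(stage.glob("*.md")):
        body = md_file.read_text(encoding="utf-8").strip()
        sections.append(f"# {md_file.name}\n{body}")
    return "\n\n".join(sections)


def demo() -> tuple[list[str], str]:
    """Build a throwaway workspace and read the first stage's context."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # "notes" has no numeric prefix, so it is not treated as a stage.
        for name in ("10-render", "1-outline", "2-draft", "notes"):
            (root / name).mkdir()
        (root / "1-outline" / "role.md").write_text(
            "You are an outliner.", encoding="utf-8"
        )
        (root / "1-outline" / "brief.md").write_text(
            "Topic: tides.", encoding="utf-8"
        )
        stages = discover_stages(root)
        return [s.name for s in stages], load_stage_context(stages[0])


if __name__ == "__main__":
    names, context = demo()
    print(names)
    print(context)
```

In this sketch the agent for a given stage would receive only the string returned by `load_stage_context` for that folder, which is one plausible reading of "reading the right files at the right moment".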

Paper Structure (27 sections, 5 figures, 2 tables)

Figures (5)

  • Figure 1: The five-layer context hierarchy. Layers 0–2 provide structural routing and stage instructions. Layers 3 and 4 carry content: Layer 3 holds reference material (the factory), stable across runs; Layer 4 holds working artifacts (the product), unique to each run.
  • Figure 2: Folder structure of a typical ICM workspace, with layer annotations. Files and folders are color-coded by their role in the context hierarchy. Layer 3 material (reference) persists across runs. Layer 4 material (working artifacts) changes each time the pipeline executes.
  • Figure 3: Context window composition by stage (representative token counts from the script-to-animation workspace). The three ICM stages each deliver 2,000–8,000 focused tokens. A monolithic approach loading all stages' instructions, all reference material, and all prior outputs produces a context window exceeding 40,000 tokens, most of it irrelevant to the current task.
  • Figure 4: Pipeline flow through three stages with review gates. Each stage receives its own context (Layers 0–4), writes output to its folder, and the human reviews and optionally edits before the next stage reads it. The same model executes every stage; the folder structure controls what context it receives.
  • Figure 5: Observed frequency of human edits at each stage boundary, reported by 33 practitioners using multi-stage ICM workspaces. Intervention follows a U-shaped pattern: heavy at stage 1 (direction-setting), light at middle stages (constrained execution), heavy again at the final stage (aligning output with earlier decisions). Stage 1 editing is creative judgment. Final-stage editing is closer to debugging. Values are approximate and based on practitioner self-report through conversation, not instrumented measurement.
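
The review-gated flow described in the Figure 4 caption can be sketched in Python as a loop in which each stage's output passes through a human review step before the next stage reads it. This is a simplified illustration under stated assumptions: `run_pipeline`, `run_model`, and `review` are hypothetical names, the model is replaced by a deterministic stub, and outputs are kept in memory instead of being written to stage folders.

```python
from typing import Callable


def run_pipeline(
    stage_contexts: list[tuple[str, str]],
    run_model: Callable[[str], str],
    review: Callable[[str, str], str],
) -> list[tuple[str, str]]:
    """Run stages in order, gating each output through a review step.

    stage_contexts: (stage name, stage-specific context) pairs, in order.
    run_model: stand-in for the single model that executes every stage.
    review: stand-in for the human gate; returns the approved
        (possibly edited) text for that stage.
    """
    previous = ""
    outputs = []
    for name, context in stage_contexts:
        # Each stage sees its own context plus the reviewed prior output only.
        if previous:
            prompt = f"{context}\n\n# Input from previous stage\n{previous}"
        else:
            prompt = context
        draft = run_model(prompt)
        approved = review(name, draft)
        outputs.append((name, approved))
        previous = approved
    return outputs


def stub_model(prompt: str) -> str:
    """Deterministic placeholder for a model call (assumption, not a real API)."""
    return "draft of: " + prompt.splitlines()[-1]


def stub_review(name: str, draft: str) -> str:
    """Simulate a human who edits only the first stage's output."""
    return draft + " (edited)" if name == "1-outline" else draft


if __name__ == "__main__":
    stages = [
        ("1-outline", "Outline the topic."),
        ("2-draft", "Draft from the outline."),
    ]
    for name, text in run_pipeline(stages, stub_model, stub_review):
        print(name, "->", text)
```

Because the second stage reads the approved text and not the raw draft, an edit made at the first gate carries forward into everything downstream.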