JSON Whisperer: Efficient JSON Editing with LLMs

Sarel Duanis, Asnat Greenstein-Messica, Eliya Habba

TL;DR

LLM-based JSON editing usually regenerates the entire document for every change, which is wasteful. This paper instead has the model emit RFC 6902 (JSON Patch) operations that describe only the modification. It introduces EASE (Explicitly Addressed Sequence Encoding), which makes list manipulations robust by replacing index-based addressing with stable keys, so patch application no longer depends on the order in which operations run. Using a synthetic dataset and DSPy-driven few-shot prompting, the approach reduces token usage by around 31% while keeping edit quality within 5% of full regeneration, with notable gains on complex and list-centric edits. The authors position the framework as a practical, cost-effective option for JSON editing in AI-assisted workflows, particularly in structured, data-heavy domains such as film production pipelines.

Abstract

Large language models (LLMs) can modify JSON documents through natural language commands, but current approaches regenerate entire structures for each edit, resulting in computational inefficiency. We present JSON Whisperer, a framework that enables LLMs to generate RFC 6902 diff patches, which express only the necessary modifications, rather than complete documents. We identify two key challenges in patch-based editing: (1) LLMs often miss related updates when generating isolated patches, and (2) array manipulations require tracking index shifts across operations, which LLMs handle poorly. To address these issues, we introduce EASE (Explicitly Addressed Sequence Encoding), which transforms arrays into dictionaries with stable keys, eliminating index arithmetic complexities. Our evaluation shows that patch generation with EASE reduces token usage by 31% while maintaining edit quality within 5% of full regeneration, with particular gains for complex instructions and list manipulations. The dataset is available at: https://github.com/emnlp2025/JSON-Whisperer/
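
The idea of turning arrays into dictionaries with stable keys can be illustrated with a short Python sketch. This is not the authors' implementation: the key scheme (`k0`, `k1`, ...), the `__ease__` marker, the function names, and the sample document are all assumptions made for illustration, and the patch applier covers only a small subset of RFC 6902 (`add`, `remove`, `replace`) without JSON Pointer escaping.

```python
import copy

EASE_MARKER = "__ease__"  # hypothetical marker key; not taken from the paper


def ease_encode(node):
    """Recursively replace every list with a dict keyed by stable ids."""
    if isinstance(node, list):
        encoded = {EASE_MARKER: True}
        for i, item in enumerate(node):
            encoded[f"k{i}"] = ease_encode(item)  # assumed key scheme
        return encoded
    if isinstance(node, dict):
        return {k: ease_encode(v) for k, v in node.items()}
    return node


def ease_decode(node):
    """Inverse of ease_encode: marked dicts become lists again."""
    if isinstance(node, dict):
        if node.get(EASE_MARKER) is True:
            # Python dicts keep insertion order, so item order survives.
            return [ease_decode(v) for k, v in node.items() if k != EASE_MARKER]
        return {k: ease_decode(v) for k, v in node.items()}
    return node


def apply_patch(doc, patch):
    """Apply a simplified add/remove/replace patch to a dict-only document."""
    doc = copy.deepcopy(doc)
    for op in patch:
        *parents, leaf = op["path"].split("/")[1:]
        target = doc
        for key in parents:
            target = target[key]
        if op["op"] == "remove":
            del target[leaf]
        elif op["op"] == "replace":
            if leaf not in target:
                raise KeyError(leaf)
            target[leaf] = op["value"]
        elif op["op"] == "add":
            target[leaf] = op["value"]
        else:
            raise ValueError(f"unsupported op: {op['op']}")
    return doc


# Hypothetical sample document.
doc = {"crew": [
    {"name": "Alice", "role": "director"},
    {"name": "Bob", "role": "editor"},
    {"name": "Carol", "role": "gaffer"},
]}

encoded = ease_encode(doc)
patch = [
    {"op": "remove", "path": "/crew/k1"},  # Bob
    {"op": "replace", "path": "/crew/k2/role", "value": "producer"},  # Carol
]

# Stable keys mean both execution orders give the same document.
forward = ease_decode(apply_patch(encoded, patch))
backward = ease_decode(apply_patch(encoded, list(reversed(patch))))
```

Because each operation addresses its element by key rather than by position, removing one element does not change the address of any other, which is what makes the two orders agree in this sketch.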

Paper Structure

This paper contains 24 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: (a) Using normal list indexing, LLMs fail to account for index shifts after removing Bob, leading to an error. (b) Using EASE, stable keys ensure correct updates, making patch application independent of execution order.
  • Figure 2: EASE Encoding Outperforms Standard List Indexing, broken down by different request types generated by GPT-4o-Mini
  • Figure 3: Using synthesized few-shot examples provides substantial performance improvements across models
  • Figure 4: Our method achieves comparable performance within a 5% margin to full regeneration while reducing token usage by 31%
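
The index-shift failure described in the Figure 1 caption can be reproduced with a minimal sketch. The document contents (other than the name Bob), the function name, and the simplified patch applier are assumptions for illustration; only `remove` and `replace` are handled, and JSON Pointer escaping is ignored.

```python
import copy


def apply_index_patch(doc, patch):
    """Apply remove/replace operations using positional list indices."""
    doc = copy.deepcopy(doc)
    for op in patch:
        tokens = op["path"].split("/")[1:]
        target = doc
        for tok in tokens[:-1]:
            # Lists are addressed by integer position, dicts by key.
            target = target[int(tok)] if isinstance(target, list) else target[tok]
        leaf = int(tokens[-1]) if isinstance(target, list) else tokens[-1]
        if op["op"] == "remove":
            del target[leaf]
        elif op["op"] == "replace":
            _ = target[leaf]  # replace requires the location to exist
            target[leaf] = op["value"]
        else:
            raise ValueError(f"unsupported op: {op['op']}")
    return doc


# Hypothetical sample document.
doc = {"crew": [
    {"name": "Alice", "role": "director"},
    {"name": "Bob", "role": "editor"},
    {"name": "Carol", "role": "gaffer"},
]}

# Both paths are written against the ORIGINAL indices.
naive_patch = [
    {"op": "remove", "path": "/crew/1"},  # Bob
    {"op": "replace", "path": "/crew/2/role", "value": "producer"},  # Carol
]

try:
    apply_index_patch(doc, naive_patch)
    naive_error = None
except IndexError as exc:
    naive_error = exc  # Carol moved to index 1, so index 2 is gone

# The patch only works if the second path accounts for the shift.
shifted_patch = [
    {"op": "remove", "path": "/crew/1"},
    {"op": "replace", "path": "/crew/1/role", "value": "producer"},
]
fixed = apply_index_patch(doc, shifted_patch)
```

In this sketch the second operation must be rewritten by hand to reflect the removal, which is the index arithmetic that the caption says LLMs fail to track.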