Plan-and-Write: Structure-Guided Length Control for LLMs without Model Retraining

Adewale Akinfaderin, Shreyas Subramanian, Akarsha Sehwag

TL;DR

Plan-and-Write introduces a two-phase, structure-guided prompting framework that targets exact length control for LLM outputs without retraining. It separates a planning phase with explicit word counting from a verification step that enforces the target length while preserving content quality, which improves length fidelity across multiple models and tasks. Evaluations on document summarization show substantial gains in length adherence (up to 37.6% MAPD improvement) with generally maintained or enhanced quality, while results on open-weight models are more nuanced and model-dependent. The method can be deployed immediately in production environments with black-box LLMs and opens avenues for extending structure-guided prompting to other hard constraints.

Abstract

Length control in Large Language Models (LLMs) is a crucial but under-addressed challenge, with applications ranging from voice interfaces requiring concise responses to research summaries needing comprehensive outputs. Current approaches to length control, including Regularized DPO, Length-Instruction Fine Tuning, and tool-augmented methods, typically require expensive model retraining or complex inference-time tooling. This paper presents a prompt engineering methodology that enables precise length control without model retraining. Our structure-guided approach implements deliberate planning and word counting mechanisms within the prompt, encouraging the model to carefully track and adhere to specified length constraints. Comprehensive evaluations across six state-of-the-art LLMs demonstrate that our method significantly improves length fidelity for several models compared to standard prompting when applied to document summarization tasks, particularly for shorter-to-medium length constraints. The proposed technique shows varying benefits across different model architectures, with some models demonstrating up to 37.6% improvement in length adherence. Quality evaluations further reveal that our approach maintains or enhances overall output quality compared to standard prompting techniques. Our approach provides an immediately deployable solution for applications requiring precise length control, particularly valuable for production environments where model retraining is impractical or cost-prohibitive.
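The planning and word-counting idea described in the abstract can be illustrated with a minimal sketch. The prompt wording, the function names (`build_plan_and_write_prompt`, `count_words`, `within_tolerance`), and the whitespace-based word count are all assumptions made for illustration; they are not the authors' actual prompt or evaluation code.

```python
def build_plan_and_write_prompt(document: str, target_words: int) -> str:
    """Build a hypothetical structure-guided prompt.

    The wording is illustrative only; the paper's real prompt is not shown here.
    """
    return (
        f"Summarize the document below in exactly {target_words} words.\n"
        "Step 1 (plan): list the key points and give each a word budget, "
        f"so that the budgets sum to {target_words}.\n"
        "Step 2 (write): write the summary while keeping a running word count.\n"
        "Step 3 (verify): recount the words and revise until the total is "
        f"{target_words}.\n\n"
        f"Document:\n{document}"
    )


def count_words(text: str) -> int:
    # Assumption: a "word" is any whitespace-separated token.
    return len(text.split())


def within_tolerance(text: str, target_words: int, tolerance: float = 0.0) -> bool:
    # True when the word count deviates from the target by at most
    # `tolerance` (a fraction of the target, e.g. 0.1 for 10%).
    return abs(count_words(text) - target_words) <= tolerance * target_words
```

A caller would send the built prompt to whichever black-box LLM is in use and then run `within_tolerance` on the returned summary as an external check, since the model's own count may be wrong.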

Paper Structure

This paper contains 31 sections, 6 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Length fidelity with vanilla prompting. The vertical axis shows the ratio of generated length to target length, with 1.0 representing perfect adherence. Each point represents an individual generation attempt, with color indicating over-generation or under-generation. Black points with error bars show the mean and standard deviation for each target length.
  • Figure 2: Length fidelity with Plan-and-Write prompting. Note the tighter clustering around the target ratio of 1.0 compared to vanilla prompting, indicating improved length control across most models.
  • Figure 3: Length fidelity comparison across four prompting strategies for Qwen 2.5 7B.
  • Figure 4: Length fidelity with vanilla prompting v2.
  • Figure 5: Length fidelity with Plan-and-Write prompting v2.
  • ...and 2 more figures
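The length ratio plotted in Figures 1 and 2 (generated length divided by target length, with 1.0 as perfect adherence) and its per-target mean and standard deviation can be computed as in the sketch below. The function names are hypothetical, and `mean_abs_pct_deviation` assumes that MAPD stands for the mean absolute percentage deviation from the target length, which this section does not define.

```python
from statistics import mean, pstdev


def length_ratio(generated_words: int, target_words: int) -> float:
    # Ratio of generated length to target length; 1.0 means perfect adherence.
    return generated_words / target_words


def ratio_summary(generated_counts: list[int], target_words: int) -> tuple[float, float]:
    # Mean and (population) standard deviation of the ratios for one target length.
    ratios = [length_ratio(g, target_words) for g in generated_counts]
    return mean(ratios), pstdev(ratios)


def mean_abs_pct_deviation(generated_counts: list[int], target_words: int) -> float:
    # Assumed definition: average of |generated - target| / target, in percent.
    return 100 * mean(abs(g - target_words) / target_words for g in generated_counts)
```

Ratios above 1.0 correspond to over-generation and ratios below 1.0 to under-generation, matching the colour coding described in the Figure 1 caption.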