
Prompt Optimization Via Diffusion Language Models

Shiyu Wang, Haolin Chen, Liangwei Yang, Jielin Qiu, Rithesh Murthy, Ming Zhu, Zixiang Chen, Silvio Savarese, Caiming Xiong, Shelby Heinecke, Huan Wang

TL;DR

This work proposes a diffusion-based framework for prompt optimization that leverages Diffusion Language Models to iteratively refine system prompts through masked denoising and shows that moderate diffusion step counts provide the best balance between refinement quality and stability.

Abstract

We propose a diffusion-based framework for prompt optimization that leverages Diffusion Language Models (DLMs) to iteratively refine system prompts through masked denoising. By conditioning on interaction traces, including user queries, model responses, and optional feedback, our method enables flexible, span-level prompt updates without requiring gradient access or modifying the downstream language model. Across diverse benchmarks (e.g., $\tau$-bench, SST-2, SST-5), DLM-optimized prompts consistently improve the performance of a frozen target LLM (e.g., GPT-4o-mini). We further show that moderate diffusion step counts provide the best balance between refinement quality and stability. These results highlight diffusion-based prompt optimization as a general, model-agnostic, and scalable approach for enhancing LLM performance through iterative prompt refinement.

Paper Structure

This paper contains 9 sections, 1 equation, 3 figures, and 1 table.

Figures (3)

  • Figure 1: An example of prompt optimization with Diffusion Language Models (DLMs). Here, additional instructions are added to the original system prompt of the $\tau$-bench airline domain. The DLM iteratively updates the system prompt, conditioned on the model output and on feedback from an LLM evaluator, until every mask token has been predicted and unmasked.
  • Figure 2: Overview of the iterative prompt optimization process using Diffusion Language Models (DLMs). The model iteratively masks and refines parts of the system prompt, conditioned on the interaction trace. Feedback in the trace is optional; it can be obtained either from the user or by prompting LLM-based evaluators.
  • Figure 3: Accuracy of GPT-4o-mini on SST-5 vs. number of diffusion steps. The red dashed line marks the baseline performance.
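
The mask-and-refine loop described in the captions of Figures 1 and 2 can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the function names (`refine_prompt`, `toy_denoise`), the `[MASK]` token, the `mask_ratio` parameter, and the even left-to-right unmasking schedule are all assumptions made for illustration. A real DLM would replace the stub denoiser and would likely choose which positions to reveal by model confidence rather than by position.

```python
import random
from typing import Callable, List

MASK = "[MASK]"  # hypothetical mask token, chosen for this sketch


def refine_prompt(
    prompt_tokens: List[str],
    trace: str,
    denoise: Callable[[List[str], str], List[str]],
    num_steps: int = 4,
    mask_ratio: float = 0.3,
    seed: int = 0,
) -> List[str]:
    """Mask part of a prompt, then unmask it over at most `num_steps` passes.

    `denoise` stands in for a diffusion language model: given the partially
    masked tokens and the interaction trace, it returns a full-length
    proposal with a prediction at every position.
    """
    rng = random.Random(seed)
    tokens = list(prompt_tokens)
    if not tokens:
        return tokens

    # Pick the positions to rewrite (assumption: uniform random choice).
    n_mask = min(len(tokens), max(1, int(len(tokens) * mask_ratio)))
    masked = set(rng.sample(range(len(tokens)), n_mask))
    for i in masked:
        tokens[i] = MASK

    for step in range(num_steps):
        if not masked:
            break
        proposal = denoise(tokens, trace)
        # Reveal an even share of the remaining masks at each step, so the
        # final step always clears whatever is left.
        remaining_steps = num_steps - step
        k = -(-len(masked) // remaining_steps)  # ceiling division
        for i in sorted(masked)[:k]:
            tokens[i] = proposal[i]
            masked.discard(i)
    return tokens


def toy_denoise(tokens: List[str], trace: str) -> List[str]:
    """Stub denoiser: fills every mask with a fixed word and ignores the trace."""
    return ["verify" if t == MASK else t for t in tokens]


original = "You are an airline agent . Always confirm the booking before changes .".split()
refined = refine_prompt(original, trace="user: ... | feedback: ...", denoise=toy_denoise)
print(" ".join(refined))
```

In this sketch, `num_steps` plays the role of the diffusion step count examined in Figure 3: fewer steps reveal more tokens per pass, while more steps let each prediction condition on a larger set of already-revealed tokens.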