Progressive Refinement Regulation for Accelerating Diffusion Language Model Decoding

Lipeng Wan, Jianhui Gu, Junjie Ma, Jianguo Huang, Shiguang Sun, Siyuan Li, Xuguang Lan

TL;DR

Progressive Refinement Regulation (PRR) is a progressive, trajectory-grounded refinement control framework for diffusion language model decoding. It derives a token-level notion of empirical convergence progress from full decoding rollouts, then learns a lightweight token-wise controller that regulates refinement via temperature-based distribution shaping under a progressive self-evolving training scheme.

Abstract

Diffusion language models generate text through iterative denoising under a uniform refinement rule applied to all tokens. However, tokens stabilize at different rates in practice, leading to substantial redundant refinement and motivating refinement control over the denoising process. Existing approaches typically assess refinement necessity from instantaneous, step-level signals under a fixed decoding process. In contrast, whether a token has converged is defined by how its prediction changes along its future refinement trajectory. Moreover, changing the refinement rule reshapes future refinement trajectories, which in turn determine how refinement rules should be formulated, making refinement control inherently dynamic. We propose Progressive Refinement Regulation (PRR), a progressive, trajectory-grounded refinement control framework that derives a token-level notion of empirical convergence progress from full decoding rollouts. Based on this signal, PRR learns a lightweight token-wise controller to regulate refinement via temperature-based distribution shaping under a progressive self-evolving training scheme. Experiments show that PRR substantially accelerates diffusion language model decoding while preserving generation quality.
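
To make "temperature-based distribution shaping" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a controller emits a per-token progress score in [0, 1], that this score is mapped linearly to a softmax temperature, and that a token is unmasked once its shaped top probability crosses a threshold. The names `progress_to_temperature` and `select_tokens_to_unmask`, the bounds `t_min` and `t_max`, and the threshold rule are all hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def progress_to_temperature(progress: float,
                            t_min: float = 0.25,
                            t_max: float = 1.0) -> float:
    """Hypothetical mapping: higher predicted progress gives a lower temperature."""
    p = min(max(progress, 0.0), 1.0)  # clamp to [0, 1]
    return t_max - p * (t_max - t_min)


def select_tokens_to_unmask(logits_per_pos: Sequence[Sequence[float]],
                            progress_per_pos: Sequence[float],
                            threshold: float = 0.9) -> List[int]:
    """Assumed rule: unmask positions whose shaped top probability meets the threshold."""
    selected = []
    for i, (logits, progress) in enumerate(zip(logits_per_pos, progress_per_pos)):
        probs = softmax(logits, progress_to_temperature(progress))
        if max(probs) >= threshold:
            selected.append(i)
    return selected


# Two positions with identical logits but different predicted progress.
logits = [[2.0, 1.0, 0.0], [2.0, 1.0, 0.0]]
print(select_tokens_to_unmask(logits, progress_per_pos=[0.0, 1.0]))
```

In this sketch both positions carry the same logits, yet only the one the controller considers converged is sharpened enough to be unmasked. This is one way a token-wise temperature could expand the unmasking set without changing the backbone model's predictions.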

Paper Structure (48 sections, 28 equations, 10 figures, 1 table)

Figures (10)

  • Figure 1: Empirical visualization of refinement trajectory changes under refinement control. Blue denotes unmasked tokens, while yellow marks redundant refinement where the current prediction already matches the final unmasked value; the dashed line indicates the final decoding step. Starting from uniform top-2 refinement, inducing a first-stage refinement rule produces different refinement trajectories rather than merely shortening the refinement steps. Further refinement control induced from the first-stage reshaped trajectories reorganizes the decoding trajectories again, demonstrating the dynamic nature of refinement control.
  • Figure 2: (a) Progressive self-evolving training of PRR. At each stage, a refinement controller regulates diffusion decoding to induce refinement trajectories, which are then used to construct supervision for the next-stage controller. (b) Empirical stability signal construction. For a given token, the refinement trajectory records its predictions across denoising steps. The empirical stability target $y_{i,t}$ is computed from distance-weighted suffix agreement with the final decoded token (Eq. 1), quantifying whether the current prediction has aligned with the final outcome and how persistently this alignment holds.
  • Figure 3: Accuracy versus number of function evaluations (NFE) on HumanEval, MBPP, GSM8K, and IFEval under Dream-7B and LLaDA-8B backbones.
  • Figure 4: Token-level unmasking schedule during PRR-regulated decoding on one example. At each decoding step, blue bars mark tokens selected by the top-1 refinement rule, while purple bars indicate additional tokens refined under PRR’s regulation (i.e., the expanded unmasking set after regulation). Yellow stars denote the positions newly unmasked at each step.
  • Figure 5: Trajectory-grounded convergence progress labels and controller predictions. Left: empirical convergence progress labels derived from full decoding rollouts (Eq. 1). Light-blue masks denote positions that have already been decoded, and the colormap shows convergence progress over positions that remain under refinement. Middle: predictions of the early-stage controller. Right: predictions of the final controller. The top row corresponds to vanilla refinement trajectories, while the bottom row shows PRR-regulated trajectories from the $(N\!-\!1)$-th training stage. Panel annotations report decoded-token ratios and prediction metrics over the token-step grid.
  • ...and 5 more figures
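
The caption of Figure 2(b) describes the empirical stability target $y_{i,t}$ as a distance-weighted suffix agreement with the final decoded token. Eq. 1 itself is not reproduced in this section, so the sketch below is one plausible reading, not the paper's definition. It assumes geometric weights `gamma ** (s - t)` over the remaining steps `s >= t`, normalized to sum to one. The function name `stability_targets` and the parameter `gamma` are hypothetical.

```python
from typing import List, Sequence


def stability_targets(trajectory: Sequence[int], gamma: float = 0.8) -> List[float]:
    """Per-step stability targets for one token position.

    trajectory[t] is the token id predicted at denoising step t, and the last
    entry is the final decoded token. The target at step t is the weighted
    fraction of steps s >= t whose prediction equals the final token, with
    assumed weights gamma ** (s - t) that emphasize near-future steps.
    """
    if not trajectory:
        raise ValueError("trajectory must contain at least one step")
    final_token = trajectory[-1]
    num_steps = len(trajectory)
    targets = []
    for t in range(num_steps):
        agree_weight = 0.0
        total_weight = 0.0
        for s in range(t, num_steps):
            w = gamma ** (s - t)
            total_weight += w
            if trajectory[s] == final_token:
                agree_weight += w
        targets.append(agree_weight / total_weight)
    return targets


# A token that starts on the wrong prediction (5), then settles on 7.
print(stability_targets([5, 7, 7, 7]))
# A token that matches early, flips away, then returns to the final value.
print(stability_targets([7, 5, 7, 7]))
```

Under these assumptions the target reaches 1 once the prediction matches the final token and stays there. A prediction that matches only transiently is scored lower, which reflects the caption's point that the signal captures both alignment with the final outcome and how persistently that alignment holds.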