GeoBlock: Inferring Block Granularity from Dependency Geometry in Diffusion Language Models

Lipeng Wan, Junjie Ma, Jianhui Gu, Zeyang Liu, Xuyang Lu, Xuguang Lan

Abstract

Block diffusion enables efficient parallel refinement in diffusion language models, but its decoding behavior depends critically on block size. Existing block-sizing strategies rely on fixed rules or heuristic signals and do not account for the dependency geometry that determines which tokens can be safely refined together. This motivates a geometry view of diffusion decoding: regions with strong causal ordering require sequential updates, whereas semantically cohesive regions admit parallel refinement. We introduce GeoBlock, a geometry-aware block inference framework that determines block granularity directly from attention-derived dependency geometry. Instead of relying on predefined schedules or local confidence heuristics, GeoBlock analyzes cross-token dependency patterns to identify geometrically stable refinement regions and dynamically determines appropriate block boundaries during decoding. By adapting block granularity to the dependency geometry, GeoBlock preserves the parallel efficiency of block diffusion while enforcing dependency-consistent refinement that exhibits autoregressive reliability. GeoBlock requires no additional training and integrates seamlessly into existing block diffusion architectures. Extensive experiments across multiple benchmarks show that GeoBlock reliably identifies geometry-consistent block boundaries and improves the accuracy of block diffusion with only a small additional computational budget.

Paper Structure

This paper contains 36 sections, 19 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: Local dependency patterns in natural language. Tokens with strong mutual interactions are highlighted using the same color. (a) A locally causal-dominated dependency pattern, where tokens follow a strong directional chain and intermediate tokens exhibit weak local interactions that do not form large, densely connected regions. Only small-size parallel updates are stable in such regions. (b) A locally parallel-dominated pattern, where tokens form dense mutual connections, enabling multi-token parallel refinement. These contrasting structures motivate adaptive block-size selection in dependency-aware diffusion decoding.
  • Figure 2: Dependency geometry of decoding under different regimes. (a) Autoregressive decoding induces a strictly sequential dependency structure, yielding a triangular attention pattern. (b) Block diffusion groups tokens into contiguous blocks, allowing bidirectional interactions within blocks while preserving causal ordering across blocks. (c) GeoBlock decomposes frontier attention into structured regions over historical tokens, candidate blocks, and future tokens, highlighting internal coupling, past conditioning, and future leakage. (d) Geo boundary selection evaluates candidate boundaries using a closure score and selects the right-shifted boundary within a tolerance of the maximum.
  • Figure 3: Accuracy versus number of function evaluations (NFE) on HumanEval, MBPP, GSM8K, and IFEval under Dream-7B and LLaDA-8B backbones.
  • Figure 4: Block boundary inference under dependency geometry during diffusion decoding. Top: Fused attention map within the frontier window, revealing internal cohesion and cross-boundary dependencies. Bottom: Block closure score as a function of candidate boundary position. The selected boundary (dashed line) corresponds to the largest candidate block whose closure score remains near-optimal, allowing stable commitment of a structurally coherent token subset.
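The boundary rule described in the captions of Figures 2(d) and 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names `region_masses` and `select_boundary`, the row-averaged mass decomposition, the additive tolerance `tol`, and the toy scores are all hypothetical. The closure score itself is not defined in this listing, so the sketch takes the scores as given inputs rather than guessing a formula.

```python
from typing import Sequence, Tuple


def region_masses(
    attn: Sequence[Sequence[float]], n_hist: int, b: int
) -> Tuple[float, float, float]:
    """Split frontier attention into past, internal, and future mass.

    Assumed layout (hypothetical): attn[i][j] is the attention from frontier
    token i to column j, where columns [0, n_hist) are historical tokens and
    the remaining columns are the frontier tokens in order. The candidate
    block is the first `b` frontier tokens. Masses are averaged over the
    rows of the candidate block.
    """
    if not 1 <= b <= len(attn):
        raise ValueError("b must be between 1 and the frontier window size")
    cut = n_hist + b
    past = internal = future = 0.0
    for row in attn[:b]:
        past += sum(row[:n_hist])         # past conditioning
        internal += sum(row[n_hist:cut])  # internal coupling
        future += sum(row[cut:])          # future leakage
    return past / b, internal / b, future / b


def select_boundary(scores: Sequence[float], tol: float) -> int:
    """Return the largest block size whose score is within `tol` of the best.

    scores[k] is the closure score of the candidate block of size k + 1.
    The additive tolerance is an assumption; the captions only say the
    chosen boundary is the right-most one that stays near the maximum.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if tol < 0:
        raise ValueError("tol must be non-negative")
    best = max(scores)
    # Scan from the largest candidate block down to the smallest.
    for k in range(len(scores) - 1, -1, -1):
        if scores[k] >= best - tol:
            return k + 1
    raise AssertionError("unreachable for finite scores")


# Toy closure scores for candidate block sizes 1..4 (made-up numbers).
toy_scores = [0.90, 0.95, 0.93, 0.50]
block_size = select_boundary(toy_scores, tol=0.05)  # picks size 3
```

With `tol=0.0` the same rule reduces to taking the arg-max (size 2 for the toy scores). A positive tolerance trades a small drop in score for a larger committed block, which matches the "largest candidate block whose closure score remains near-optimal" wording in the Figure 4 caption.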