A Survey on Diffusion Language Models
Tianyi Li, Mingda Chen, Bowei Guo, Zhiqiang Shen
TL;DR
Diffusion Language Models (DLMs) offer a parallelizable alternative to autoregressive LLMs by denoising over either continuous embeddings or discrete tokens, enabling bidirectional context and faster inference. The survey presents a comprehensive taxonomy of continuous, discrete, and hybrid AR–diffusion paradigms, reviews training strategies (pre-training and post-training) and inference strategies (parallel decoding, unmasking/remasking, guidance, caching, step distillation), and covers multimodal and unified diffusion models. It documents performance trends and downstream applications across NLP, code, biology, and robotics, analyzes key challenges such as parallelism trade-offs, infrastructure, long-context handling, and scalability, and outlines future directions. Overall, the work establishes a structured framework for understanding DLMs, highlights practical gains in efficiency and controllability, and points to avenues (e.g., agent-based reasoning, low-bit deployment, and cross-modal integration) where diffusion-based approaches may surpass traditional autoregressive methods in real-world settings.
Abstract
Diffusion Language Models (DLMs) are rapidly emerging as a powerful and promising alternative to the dominant autoregressive (AR) paradigm. By generating tokens in parallel through an iterative denoising process, DLMs possess inherent advantages in reducing inference latency and capturing bidirectional context, thereby enabling fine-grained control over the generation process. Recent advancements have allowed DLMs to reach performance comparable to their autoregressive counterparts while achieving a several-fold speed-up, making them a compelling choice for various natural language processing tasks. In this survey, we provide a holistic overview of the current DLM landscape. We trace its evolution and relationship with other paradigms, such as autoregressive and masked language models, and cover both foundational principles and state-of-the-art models. Our work offers an up-to-date, comprehensive taxonomy and an in-depth analysis of current techniques, from pre-training strategies to advanced post-training methods. Another contribution of this survey is a thorough review of DLM inference strategies and optimizations, including improvements in decoding parallelism, caching mechanisms, and generation quality. We also highlight the latest approaches to multimodal extensions of DLMs and delineate their applications across various practical scenarios. Furthermore, our discussion addresses the limitations and challenges of DLMs, including efficiency, long-sequence handling, and infrastructure requirements, while outlining future research directions to sustain progress in this rapidly evolving field. The project's GitHub repository is available at https://github.com/VILA-Lab/Awesome-DLMs.
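To make "generating tokens in parallel through an iterative denoising process" concrete, here is a minimal toy sketch of confidence-based parallel unmasking for a discrete (masked) DLM. It is an illustration under stated assumptions, not the method of any specific model in the survey: `toy_denoiser`, `parallel_unmask`, and the `MASK` symbol are hypothetical names, and the stand-in denoiser simply reads the answer from a known target and attaches random confidences, where a real DLM would predict every masked position from bidirectional context in a single forward pass.

```python
import random

MASK = "<mask>"  # hypothetical mask symbol for this sketch


def toy_denoiser(tokens, target, rng):
    """Hypothetical stand-in for a trained DLM.

    Returns {position: (predicted_token, confidence)} for every masked
    position. ASSUMPTION: it cheats by copying from `target` and drawing a
    random confidence; a real model would predict from context instead.
    """
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(tokens)
        if tok == MASK
    }


def parallel_unmask(target, num_steps, seed=0):
    """Start fully masked and commit several tokens per denoising step."""
    rng = random.Random(seed)
    tokens = [MASK] * len(target)
    history = [list(tokens)]  # snapshot of the sequence after each step
    for step in range(num_steps):
        preds = toy_denoiser(tokens, target, rng)
        if not preds:
            break  # nothing left to unmask
        # Spread the remaining masked positions over the remaining steps.
        remaining_steps = num_steps - step
        k = -(-len(preds) // remaining_steps)  # ceiling division
        # Commit the k most confident predictions in parallel; the rest
        # stay masked and are re-predicted at the next step.
        chosen = sorted(preds, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in chosen:
            tokens[i] = preds[i][0]
        history.append(list(tokens))
    return tokens, history


if __name__ == "__main__":
    sentence = "diffusion models decode tokens in parallel".split()
    final, history = parallel_unmask(sentence, num_steps=3)
    for snapshot in history:
        print(" ".join(snapshot))
```

In this sketch a six-token sequence is completed in three steps rather than six, which is the source of the latency advantage the abstract describes. Fewer steps mean more tokens committed per step, which in a real model trades generation quality for speed.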
