Conditional [MASK] Discrete Diffusion Language Model

Hyukhun Koh; Minha Jhang; Dohyung Kim; Sangmook Lee; Kyomin Jung

TL;DR

This work presents Diffusion-EAGS, a framework that integrates conditional masked language models (CMLMs) into discrete diffusion language models (DDLMs) via a conditional Markov Random Field to achieve high-quality, diverse, and controllable text generation. It introduces two mechanisms: Entropy-Adaptive Gibbs Sampling (EAGS) for stepwise, uncertainty-driven updates, and Entropy-based Noise Scheduling (ENS) for structured denoising during training. It also gives an energy-based interpretation that guarantees progressive energy reduction. In experiments on RocStories and Paradetox against auto-regressive models (ARMs), CMLMs, and DDLMs, the approach achieves better quality-diversity tradeoffs and robust keyword-based control. The findings suggest that integrating MLMs into diffusion frameworks, guided by entropy-aware strategies, can mitigate degeneration in conditional generation and offer practical benefits for controllable NLP applications. As limitations, the framework has not yet been explored with other pre-trained language models (PLMs) or extended to tasks beyond generation; future work could adapt it to encoder-decoder PLMs and broader NLP tasks.

Abstract

Although auto-regressive models excel in natural language processing, they often struggle to generate diverse text and provide limited controllability. Non-auto-regressive methods could be an alternative but often produce degenerate outputs and exhibit shortcomings in conditional generation. To address these challenges, we propose Diffusion-EAGS, a novel framework that integrates conditional masked language models into diffusion language models through the theoretical lens of a conditional Markov Random Field. In doing so, we propose entropy-adaptive Gibbs sampling and entropy-based noise scheduling to counterbalance each model's shortcomings. Experimental results show that Diffusion-EAGS outperforms baselines and achieves the best quality-diversity tradeoff, demonstrating its effectiveness in non-autoregressive text generation.
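To make the idea of entropy-adaptive Gibbs sampling more concrete, the following is a minimal toy sketch and not the authors' implementation. It starts from a fully masked sequence and, at each step, computes the entropy of the conditional distribution at every masked position, picks one position by entropy, and samples a token there. The function `toy_conditional` is a hypothetical stand-in for a masked language model, and the `pick_highest` flag is an assumption: the text above only says the updates are uncertainty-driven, so the direction of the selection rule is left configurable.

```python
import math
import random

MASK = "[MASK]"

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def toy_conditional(tokens, pos, vocab):
    """Hypothetical stand-in for a masked LM's p(x_pos | other tokens).

    It favours one vocabulary item per position and becomes more
    confident as more direct neighbours are already unmasked.
    """
    neighbours = [tokens[i] for i in (pos - 1, pos + 1) if 0 <= i < len(tokens)]
    known = sum(t != MASK for t in neighbours)
    peak = 0.4 + 0.25 * known              # 0.4, 0.65 or 0.9
    rest = (1.0 - peak) / (len(vocab) - 1)
    favourite = vocab[pos % len(vocab)]
    return [peak if w == favourite else rest for w in vocab]

def entropy_adaptive_gibbs(length, vocab, conditional, rng, pick_highest=True):
    """Fill a fully masked sequence one position at a time.

    pick_highest is an assumption of this sketch: True updates the most
    uncertain masked position first, False the least uncertain one.
    """
    tokens = [MASK] * length
    order = []
    while MASK in tokens:
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # Entropy of the conditional at each still-masked position.
        ents = {i: entropy(conditional(tokens, i, vocab)) for i in masked}
        chooser = max if pick_highest else min
        pos = chooser(masked, key=lambda i: ents[i])
        # Gibbs-style update: sample the chosen position given the rest.
        probs = conditional(tokens, pos, vocab)
        tokens[pos] = rng.choices(vocab, weights=probs, k=1)[0]
        order.append(pos)
    return tokens, order

if __name__ == "__main__":
    rng = random.Random(0)
    tokens, order = entropy_adaptive_gibbs(5, ["a", "b", "c", "d"], toy_conditional, rng)
    print(tokens, order)
```

In a real system the toy conditional would be replaced by a masked language model's predicted distribution given the conditioning input, and several positions might be updated per step; both choices are outside what this sketch covers.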
