Variational Masked Diffusion Models

Yichi Zhang; Alex Schwing; Zhizhen Zhao

TL;DR

This work addresses a limitation of standard masked diffusion: it cannot model dependencies among tokens that are predicted concurrently. Variational Masked Diffusion (VMD) introduces a global latent variable $z$ to capture the joint distribution of those tokens and derives a variational objective $L_{\mathrm{VMD}}$; a Block Diffusion extension then scales the approach to blocks of tokens. On synthetic data, Sudoku, and text, VMD improves dependency modeling and generation quality, outperforming standard masked diffusion baselines and approaching autoregressive performance at competitive efficiency. The work provides a principled integration of variational inference into masked diffusion and releases code for reproducibility.

Abstract

Masked diffusion models have recently emerged as a flexible framework for discrete generative modeling. However, a key limitation of standard masked diffusion is its inability to effectively capture dependencies among tokens that are predicted concurrently, leading to degraded generation quality when dependencies among tokens are important. To explicitly model dependencies among tokens, we propose Variational Masked Diffusion (VMD), a framework that introduces latent variables into the masked diffusion process. Through controlled experiments on synthetic datasets, we demonstrate that VMD successfully learns dependencies that conventional masked diffusion fails to capture. We further validate the effectiveness of our approach on Sudoku puzzles and text datasets, where learning of dependencies among tokens improves global consistency. Across these domains, VMD enhances both generation quality and dependency awareness, highlighting the value of integrating variational inference into masked diffusion. Our code is available at: https://riccizz.github.io/VMD.
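The limitation described above can be made concrete with a toy calculation. The sketch below is not the authors' model or code. It assumes a hypothetical two-token distribution in which only the sequences "00" and "11" occur, and it compares two ways of predicting both tokens at once: independently per token, as standard masked diffusion does for concurrently unmasked tokens, and independently per token conditioned on a shared latent variable z. All names (`true_joint`, `factorized_joint`, `latent_mixture_joint`) and the hand-picked prior and conditionals are illustrative assumptions.

```python
from itertools import product

# Hypothetical toy data: two binary tokens that always agree ("00" or "11").
true_joint = {(0, 0): 0.5, (1, 1): 0.5}


def marginal(joint, position):
    """Per-position marginal of a joint distribution over token pairs."""
    probs = {0: 0.0, 1: 0.0}
    for tokens, p in joint.items():
        probs[tokens[position]] += p
    return probs


def factorized_joint(joint):
    """Product of per-token marginals: the best a model can do when it
    predicts both tokens concurrently and independently."""
    m0, m1 = marginal(joint, 0), marginal(joint, 1)
    return {(a, b): m0[a] * m1[b] for a, b in product((0, 1), repeat=2)}


def latent_mixture_joint(prior, conditionals):
    """Joint after marginalizing a shared latent:
    p(x1, x2) = sum_z p(z) * p(x1 | z) * p(x2 | z)."""
    joint = {}
    for a, b in product((0, 1), repeat=2):
        joint[(a, b)] = sum(
            pz * conditionals[z][0][a] * conditionals[z][1][b]
            for z, pz in prior.items()
        )
    return joint


# Hand-picked (assumed) latent model: z selects which value both tokens take.
prior = {0: 0.5, 1: 0.5}
conditionals = {
    0: [{0: 1.0, 1: 0.0}, {0: 1.0, 1: 0.0}],
    1: [{0: 0.0, 1: 1.0}, {0: 0.0, 1: 1.0}],
}

independent = factorized_joint(true_joint)
with_latent = latent_mixture_joint(prior, conditionals)

# Probability mass placed on the invalid sequences "01" and "10".
invalid_independent = independent[(0, 1)] + independent[(1, 0)]
invalid_with_latent = with_latent[(0, 1)] + with_latent[(1, 0)]

print("independent prediction:", independent, "invalid mass:", invalid_independent)
print("latent-conditioned:", with_latent, "invalid mass:", invalid_with_latent)
```

In this toy setting the independent prediction puts half of its probability on sequences that never occur in the data, while the latent-conditioned mixture reproduces the data distribution exactly even though each token is still predicted independently given z. VMD learns its latent variable through a variational objective rather than fixing it by hand as done here.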
