Training-Free Guidance for Discrete Diffusion Models for Molecular Generation
Thomas J. Kerby, Kevin R. Moon
TL;DR
The paper addresses the challenge of conditioning discrete diffusion models for molecular graph generation without retraining. It proposes a training-free guidance framework for discrete diffusion by leveraging a learned reverse distribution $p_\theta(x_0|x_t)$ to compute guided updates, adapting the multinomial forward process used in DiGress. Key contributions include a concrete methodology for applying training-free guidance to discrete data and empirical demonstrations guiding node-type composition and heavy-atom molecular weight, achieving high target fidelity while maintaining molecule validity. This work enables plug-and-play conditioning of discrete foundation diffusion models and suggests broader applicability to other discrete-generation tasks, including discrete text generation.
Abstract
Training-free guidance methods for continuous data have seen an explosion of interest because they enable foundation diffusion models to be paired with interchangeable guidance models. Currently, equivalent guidance methods for discrete diffusion models are unknown. We present a framework for applying training-free guidance to discrete data and demonstrate its utility on molecular graph generation tasks using the discrete diffusion model architecture of DiGress. We pair this model with two guidance functions, one returning the proportion of heavy atoms that are a specific atom type and one returning the molecular weight of the heavy atoms, and demonstrate our method's ability to guide data generation.
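To make the two guidance targets concrete, the following is a minimal sketch of how they might be computed, both on a fully specified set of heavy atoms and in expectation under a per-node categorical prediction such as $p_\theta(x_0|x_t)$. It is not taken from the paper: the atom vocabulary `ATOM_TYPES`, the rounded masses in `ATOM_MASSES`, and all function names are assumptions chosen for illustration, and the paper's actual guidance functions may be defined differently.

```python
import numpy as np

# Hypothetical heavy-atom vocabulary (an assumption; the source does not list one).
ATOM_TYPES = ["C", "N", "O", "F"]
# Standard atomic weights in g/mol, rounded to three decimals.
ATOM_MASSES = np.array([12.011, 14.007, 15.999, 18.998])


def atom_type_proportion(node_types, target):
    """Fraction of heavy atoms whose type equals `target`.

    node_types: sequence of integer indices into ATOM_TYPES.
    """
    idx = np.asarray(node_types)
    return float(np.mean(idx == ATOM_TYPES.index(target)))


def heavy_atom_weight(node_types):
    """Sum of the atomic weights of the heavy atoms (hydrogens ignored)."""
    idx = np.asarray(node_types)
    return float(ATOM_MASSES[idx].sum())


def expected_properties(probs, target):
    """Expected proportion and expected weight under per-node categoricals.

    probs: array of shape (n_nodes, n_types) whose rows sum to 1, standing in
    for a predicted distribution over clean node types. By linearity of
    expectation, both properties reduce to simple averages or sums.
    """
    probs = np.asarray(probs, dtype=float)
    exp_prop = float(probs[:, ATOM_TYPES.index(target)].mean())
    exp_weight = float((probs @ ATOM_MASSES).sum())
    return exp_prop, exp_weight


# Heavy atoms of ethanol: C, C, O.
ethanol = [0, 0, 2]
one_hot = np.eye(len(ATOM_TYPES))[ethanol]
```

With a one-hot `probs`, `expected_properties` agrees with the exact functions, so the same target can be scored on either a sampled graph or a predicted distribution. A guidance scheme would then compare these values to a user-specified target; how that comparison enters the reverse process is left to the method itself.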
