Diffusion Models for Molecules: A Survey of Methods and Tasks

Liang Wang, Chao Song, Zhiyuan Liu, Yu Rong, Qiang Liu, Shu Wu, Liang Wang

TL;DR

This survey addresses the fragmentation in diffusion-model-based molecular generation by providing an up-to-date, systematic overview across three core diffusion formulations (DDPM, SMLD, SDE), three data modalities (2D graphs, 3D conformers, and joint 2D&3D representations), and a broad set of tasks (de novo generation, optimization, conformer generation, docking, and transition-state prediction). It introduces a novel taxonomy that organizes literature by method formulation, data modality, and task type, and discusses representative methods, datasets, and design choices. The authors highlight opportunities such as completing the 2D-3D joint modality, advancing continuous-time diffusion, and developing more expressive, equivariant architectures, while outlining future directions to better integrate molecular representations with generation. Overall, the paper provides a practical navigation framework to accelerate diffusion-based molecular design and to guide future methodological and application advances.

Abstract

Molecular generative tasks, including but not limited to molecule generation, are crucial for drug discovery and material design and have consistently attracted significant attention. In recent years, diffusion models have emerged as an impressive class of deep generative models, sparking extensive research on their application to molecular generative tasks. Despite the proliferation of related work, up-to-date and systematic surveys of this area are still lacking. In particular, the diversity of diffusion model formulations, molecular data modalities, and generative task types makes the research landscape hard to navigate, which hinders understanding and limits the area's growth. To address this, this paper conducts a comprehensive survey of diffusion model-based molecular generative methods. We systematically review the research from the perspectives of methodological formulations, data modalities, and task types, offering a novel taxonomy. This survey aims to facilitate understanding and further development in this area. The relevant papers are summarized at: https://github.com/AzureLeon1/awesome-molecular-diffusion-models.

Paper Structure

This paper contains 18 sections, 16 equations, 3 figures, and 1 table.

Figures (3)

  • Figure 1: Illustration of molecular generative tasks. De novo generation designs molecules from scratch. Molecular optimization refines existing molecules to enhance desired properties while maintaining structural similarity. Conformer generation produces 3D geometries of a molecule to represent its possible spatial arrangements.
  • Figure 2: Illustration of molecular diffusion models, showcasing the forward and reverse processes. The three primary formulations—DDPM, SMLD, and SDE—are presented. Molecules can be generated in 2D space, 3D space, or jointly in 2D and 3D spaces.
  • Figure 3: A taxonomy of diffusion models for molecules with representative works.