Periodic Materials Generation using Text-Guided Joint Diffusion Model

Kishalay Das, Subhojyoti Khastagir, Pawan Goyal, Seung-Cheol Lee, Satadeep Bhattacharjee, Niloy Ganguly

TL;DR

This work tackles the challenge of generating novel periodic crystal materials that satisfy user-provided textual criteria. It introduces TGDMat, a text-guided diffusion framework that jointly diffuses lattice parameters, atom types, and fractional coordinates within a periodic-E(3)-equivariant graph neural network backbone, while integrating textual descriptions at every denoising step via a pre-trained MatSciBERT encoder. The method jointly learns p(A, X, L | T) and demonstrates strong performance on Crystal Structure Prediction and Random Material Generation, outperforming state-of-the-art baselines with single-sample generation and reduced computational cost. The results show that leveraging global textual knowledge enhances both the quality and efficiency of material generation, enabling practical, user-guided design of stable crystal structures.

Abstract

Equivariant diffusion models have emerged as the prevailing approach for generating novel crystal materials due to their ability to leverage the physical symmetries of periodic material structures. However, current models do not effectively learn the joint distribution of atom types, fractional coordinates, and lattice structure of the crystal material in a cohesive end-to-end diffusion framework. Also, none of these models work under realistic setups, where users specify the desired characteristics that the generated structures must match. In this work, we introduce TGDMat, a novel text-guided diffusion model designed for 3D periodic material generation. Our approach integrates global structural knowledge through textual descriptions at each denoising step while jointly generating atom coordinates, types, and lattice structure using a periodic-E(3)-equivariant graph neural network (GNN). Extensive experiments using popular datasets on benchmark tasks reveal that TGDMat outperforms existing baseline methods by a good margin. Notably, for the structure prediction task, with just one generated sample, TGDMat outperforms all baseline models, highlighting the importance of text-guided diffusion. Further, in the generation task, TGDMat surpasses all baselines and their text-fusion variants, showcasing the effectiveness of the joint diffusion paradigm. Additionally, incorporating textual knowledge reduces overall training and sampling computational overhead while enhancing generative performance when utilizing real-world textual prompts from experts.
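To make the three jointly generated quantities concrete, the sketch below represents a crystal as atom types A, fractional coordinates X, and a lattice matrix L, and checks numerically that shifting all atoms by the same periodic translation leaves interatomic distances unchanged. This is the kind of symmetry a periodic-E(3)-equivariant model is meant to respect. It is a minimal illustration only, not the TGDMat implementation: the `Crystal` class, the helper names, and the toy cubic cell are all assumptions introduced here.

```python
import itertools
from dataclasses import dataclass

import numpy as np


@dataclass
class Crystal:
    """Hypothetical container for one unit cell M = (A, X, L)."""
    atom_types: np.ndarray   # A: shape (N,), atomic numbers
    frac_coords: np.ndarray  # X: shape (N, 3), fractional coordinates in [0, 1)
    lattice: np.ndarray      # L: shape (3, 3), rows are lattice vectors

    def cartesian(self) -> np.ndarray:
        # Cartesian positions are fractional coordinates times the lattice.
        return self.frac_coords @ self.lattice


def wrap(frac_coords: np.ndarray) -> np.ndarray:
    """Map fractional coordinates back into the unit cell [0, 1)."""
    return frac_coords % 1.0


def translate(crystal: Crystal, shift: np.ndarray) -> Crystal:
    """Shift every atom by the same fractional vector, with periodic wrap."""
    return Crystal(crystal.atom_types,
                   wrap(crystal.frac_coords + shift),
                   crystal.lattice)


def periodic_distance(crystal: Crystal, i: int, j: int) -> float:
    """Shortest distance from atom i to atom j over the 27 nearest cell images."""
    delta = wrap(crystal.frac_coords[j] - crystal.frac_coords[i])
    images = np.array(list(itertools.product((-1, 0, 1), repeat=3)))
    candidates = (delta + images) @ crystal.lattice
    return float(np.linalg.norm(candidates, axis=1).min())


# Toy two-atom cubic cell with made-up values (not taken from the paper).
toy = Crystal(
    atom_types=np.array([56, 46]),
    frac_coords=np.array([[0.0, 0.0, 0.0], [0.9, 0.4, 0.5]]),
    lattice=4.0 * np.eye(3),
)
shifted = translate(toy, np.array([0.3, 0.7, 0.2]))

d_before = periodic_distance(toy, 0, 1)
d_after = periodic_distance(shifted, 0, 1)
print(np.isclose(d_before, d_after))
```

Working in fractional coordinates keeps the periodic wrap a simple modulo operation, independent of the lattice shape. The 27-image search is adequate for cells that are not strongly skewed; a production code would typically rely on a dedicated crystallography library.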

Paper Structure

This paper contains 41 sections, 24 equations, 5 figures, 13 tables, 2 algorithms.

Figures (5)

  • Figure 1: Detailed textual description generated by Robocrystallographer, less-detailed prompts by domain experts, and crystal unit cell structure of $\mathbf{BaPd_2}$.
  • Figure 2: Model architecture of our proposed text-guided diffusion model TGDMat. At the $t^{th}$ step of reverse diffusion, given $\textbf{M}_t=(\textbf{A}_t,\textbf{X}_t,\textbf{L}_t)$, we use a periodic-E(3)-equivariant GNN model guided by the contextual representation of the textual prompts ($\textbf{C}_\textbf{p}$) to generate $\textbf{M}_{t-1}=(\textbf{A}_{t-1},\textbf{X}_{t-1},\textbf{L}_{t-1})$.
  • Figure 3: (a) Match rate vs. running time (GPU hours) for different variants of TGDMat(Long) with 50, 100, 200, 500, and 1K steps (green $\star$ markers, lightest to darkest), DiffCSP (blue $\blacklozenge$), and CDVAE (✚). (b) Materials sampled given the textual description of the center ground-truth material $\boxed{M}$. The sampled materials are structurally similar (rotated or translated) to each other as well as to the ground truth.
  • Figure 4: Detailed textual description generated by Robocrystallographer, short/less-detailed prompts by experts, and crystal unit cell structure of $\mathbf{BaPd_2}$ from the Materials Project dataset. Text generated by Robocrystallographer contains both local chemical compositional information related to atoms/bonds (like site coordination, geometry, polyhedral connectivity, and tilt angles) and global structural knowledge (like mineral type, space group information, symmetry, and dimensionality). The shorter prompt encodes minimal information about the material, like its chemical formula, constituent elements, crystal system, and a few chemical properties.
  • Figure 5: