DMFlow: Disordered Materials Generation by Flow Matching

Liming Wu, Rui Jiao, Qi Li, Mingze Li, Songyou Li, Shifeng Jin, Wenbing Huang

TL;DR

DMFlow generates disordered crystals using a unified representation that covers ordered, Substitutionally Disordered (SD), and Positionally Disordered (PD) structures, together with a Riemannian flow matching framework that enforces simplex constraints on the disorder weights via spherical reparameterization. A GNN-based velocity prediction network processes continuous disorder inputs and multi-position interactions to produce physically valid generation trajectories for lattice parameters, fractional coordinates, and disorder weights. A two-stage discretization then converts the continuous disorder weights into multi-hot atomic assignments. A benchmark of SD, PD, and mixed (SPD) structures, curated from the Crystallography Open Database (COD), supports Crystal Structure Prediction (CSP) and De Novo Generation (DNG) tasks. Empirically, DMFlow outperforms adapted baselines on SPD cases and maintains strong performance on SD cases, which supports the value of a unified, geometry-aware generative framework for disordered materials with tunable properties.

Abstract

The design of materials with tailored properties is crucial for technological progress. However, most deep generative models focus exclusively on perfectly ordered crystals, neglecting the important class of disordered materials. To address this gap, we introduce DMFlow, a generative framework specifically designed for disordered crystals. Our approach introduces a unified representation for ordered, Substitutionally Disordered (SD), and Positionally Disordered (PD) crystals, and employs a flow matching model to jointly generate all structural components. A key innovation is a Riemannian flow matching framework with spherical reparameterization, which ensures physically valid disorder weights on the probability simplex. The vector field is learned by a novel Graph Neural Network (GNN) that incorporates physical symmetries and a specialized message-passing scheme. Finally, a two-stage discretization procedure converts the continuous weights into multi-hot atomic assignments. To support research in this area, we release a benchmark containing SD, PD, and mixed structures curated from the Crystallography Open Database. Experiments on Crystal Structure Prediction (CSP) and De Novo Generation (DNG) tasks demonstrate that DMFlow significantly outperforms state-of-the-art baselines adapted from ordered crystal generation. We hope our work provides a foundation for the AI-driven discovery of disordered materials.
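The spherical reparameterization mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common square-root map between the probability simplex and the positive orthant of the unit sphere (s_i = sqrt(w_i), w_i = s_i^2) and a great-circle interpolation between two points. The excerpt does not state DMFlow's exact parameterization or prior, so the function names and example weights below are hypothetical.

```python
import math

# Assumption: square-root map between simplex and unit sphere; this excerpt
# does not confirm that DMFlow uses exactly this form.
def simplex_to_sphere(w):
    """Map simplex weights to a unit vector via s_i = sqrt(w_i)."""
    return [math.sqrt(x) for x in w]

def sphere_to_simplex(s):
    """Inverse map w_i = s_i^2: non-negative, sums to 1 when |s| = 1."""
    return [x * x for x in s]

def slerp(s0, s1, t):
    """Great-circle (geodesic) interpolation between two unit vectors."""
    dot = sum(a * b for a, b in zip(s0, s1))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    theta = math.acos(dot)
    if theta < 1e-8:  # endpoints coincide; nothing to interpolate
        return list(s0)
    c0 = math.sin((1.0 - t) * theta) / math.sin(theta)
    c1 = math.sin(t * theta) / math.sin(theta)
    return [c0 * a + c1 * b for a, b in zip(s0, s1)]

# Hypothetical disorder weights for one site over three candidate species.
w_start = [0.2, 0.3, 0.5]
w_end = [0.7, 0.3, 0.0]

s_start = simplex_to_sphere(w_start)
s_end = simplex_to_sphere(w_end)

# Five points along the geodesic, mapped back to the simplex.
path = [sphere_to_simplex(slerp(s_start, s_end, k / 4)) for k in range(5)]
```

Because every intermediate point lies on the unit sphere, squaring its components always yields non-negative weights that sum to one, so no clipping or renormalization step is needed along the trajectory.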

Paper Structure (42 sections, 20 equations, 9 figures, 4 tables)

Figures (9)

  • Figure 1: Ordered vs. Disordered crystals. Arrows show supercell reduction; BL/BR mark bottom-left/right.
  • Figure 2: Overall framework of DMFlow. (a) The flow matching process, which transports samples from a prior distribution to the target data distribution; balls with a white segment indicate empty occupancy, corresponding to PD. (b) Geometric representations of the distinct manifolds underlying disordered crystals, on which our conditional flow matching operates. (c) The architecture of our GNN backbone used for velocity prediction.
  • Figure 3: Ensemble voting.
  • Figure 4: CSP performance with data augmentation.
  • Figure 5: Visualization comparison of DMFlow and baseline models on the CSP task, covering both SD and PD cases. Here, RMSE = None indicates a failed structure prediction.
  • ...and 4 more figures