Generative models for crystalline materials

Houssam Metni, Laura Ruple, Lauren N. Walters, Luca Torresi, Jonas Teufel, Henrik Schopmans, Jona Östreicher, Yumeng Zhang, Marlen Neubert, Yuri Koide, Kevin Steiner, Paul Link, Lukas Bär, Mariana Petrova, Gerbrand Ceder, Pascal Friederich

TL;DR

This review surveys the emergence of end-to-end generative modeling for crystalline materials, detailing crystal representations, data resources, and a spectrum of generative approaches from VAEs and GANs to diffusion and GFlowNets. It contrasts traditional CSP-and-screening pipelines with end-to-end methods that aim to directly propose stable, synthesizable crystal structures while respecting crystallographic symmetry. The article also covers practical considerations, including conditioning, software availability, computational cost, evaluation metrics, and post-generation synthesis workflows, and discusses emerging topics like defects, disorder, and synthesis-aware design. Collectively, it highlights the progress and remaining challenges in translating ML-generated crystal structures into experimentally realizable materials, and outlines directions for faster, symmetry-informed, and synthesis-conscious generation models with broad impact on materials discovery. The work serves as a guide for experimentalists and ML researchers to navigate representations, benchmarks, and practical deployment in inverse materials design.

Abstract

Understanding structure-property relationships in materials is fundamental in condensed matter physics and materials science. Over the past few years, machine learning (ML) has emerged as a powerful tool for advancing this understanding and accelerating materials discovery. Early ML approaches primarily focused on constructing and screening large material spaces to identify promising candidates for various applications. More recently, research efforts have increasingly shifted toward generating crystal structures using end-to-end generative models. This review analyzes the current state of generative modeling for crystal structure prediction and de novo generation. It examines crystal representations, outlines the generative models used to design crystal structures, and evaluates their respective strengths and limitations. Furthermore, the review highlights experimental considerations for evaluating generated structures and provides recommendations for suitable existing software tools. Emerging topics, such as modeling disorder and defects, integration in advanced characterization, and incorporating synthetic feasibility constraints, are explored. Ultimately, this work aims to inform both experimental scientists looking to adapt suitable ML models to their specific circumstances and ML specialists seeking to understand the unique challenges related to inverse materials design and discovery.

Paper Structure

This paper contains 18 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Overview of the Wyckoff positions, asymmetric units, and symmetry operations in a unit cell.
  • Figure 2: Overview of the unit cell and machine-readable representations of crystal structures.
  • Figure 3: (a) Overview of machine learning models in crystal structure generation. Left to right: using machine learning models for simple property prediction, predictive models in conjunction with candidate proposal strategies for search-based generation, and end-to-end generation from a property-conditional distribution. (b) Literature spotlights for crystal structure generation.
  • Figure 4: Overview of different generative modeling approaches, divided into four conceptual categories. One-shot variational approaches directly generate crystal structures based on a latent space representation. Iterative methods repeatedly refine the generation outcome. Auto-regressive methods iteratively construct samples from individual components of the crystal, or from tokens in the case of large language models. Search-based approaches employ property prediction models in conjunction with candidate proposal strategies, based on heuristics and/or physical intuition.
  • Figure 5: Overview of targeted crystal structure generation. Left: Difference between constrained generation and property-conditioned generation. Middle: Architecture integrations of different conditioning methods. The hatched symbols represent components that are coupled to the specific property and must be retrained when switching properties. Right: Example properties that can be used as conditioning targets for crystal structure generation. Valid conditioning targets range from simple, single-value properties, such as the band gap, to complex properties, including natural language property descriptions.
  • ...and 1 more figure