CrystalGAN: Learning to Discover Crystallographic Structures with Generative Adversarial Networks

Asma Nouira, Nataliya Sokolovska, Jean-Claude Crivello

TL;DR

CrystalGAN addresses the problem of generating stable ternary crystallographic compounds from binary observations by introducing a two-step cross-domain GAN with a feature-transfer stage and crystallography-informed geometric constraints. The method learns cross-domain relations between AH and BH to synthesize higher-order AH-B data, and enforces geometric validity to produce chemically plausible structures. Experiments on hydride datasets show CrystalGAN outperforms baseline GANs and geometry-free variants, suggesting a practical pathway to accelerate materials discovery for hydrogen storage. The approach is presented as generalizable to other domains requiring generation of complex, domain-constrained scientific data.

Abstract

Our main motivation is to propose an efficient approach for generating novel, stable multi-element chemical compounds that can be used in real-world applications. This task can be formulated as a combinatorial problem, and constructing and evaluating new data takes human experts many hours. Unsupervised learning methods such as Generative Adversarial Networks (GANs) can be used to produce new data efficiently. Cross-domain GANs have been reported to achieve strong results in image processing applications. However, materials science requires synthesizing data of higher-order complexity than the observed samples, and state-of-the-art cross-domain GANs cannot be adapted directly. In this contribution, we propose a novel GAN called CrystalGAN, which generates new chemically stable crystallographic structures with increased domain complexity. We introduce an original architecture, provide the corresponding loss functions, and show that CrystalGAN generates reasonable data. We illustrate the efficiency of the proposed method on a real, original problem: the discovery of novel hydrides that can be further used in the development of hydrogen storage materials.

Paper Structure

This paper contains 16 sections, 31 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: The CrystalGAN architecture.
  • Figure 2: Encoding of $x_{A\mathrm{H}}$ and $y_{B\mathrm{H}}$ with placeholders.
  • Figure 3: An example of a POSCAR file describing the composition of Palladium and Hydrogen, and the data representation in the CrystalGAN.
  • Figure 4: A visualization of a stable structure.
  • Figure 5: The list of the nearest neighbours (on the left); the corresponding generated POSCAR file (on the right).
  • ...and 2 more figures