Data-Driven Approach to Encoding and Decoding 3-D Crystal Structures

Jordan Hoffmann, Louis Maestrati, Yoshihide Sawada, Jian Tang, Jean Michel Sellier, Yoshua Bengio

TL;DR

This work introduces a data-driven framework for encoding and decoding 3-D crystal structures: atoms are represented as a smooth density field over a voxel grid, and a VAE is trained jointly with a 3-D segmentation network. Two representations, single rotated unit cells and repeating lattices, enable end-to-end learning of 3-D atomic positions and species, and support sampling, interpolation, and conditional control of density magnitudes. The study reports high accuracy for unit-cell reconstructions, meaningful but weaker results for repeating lattices, and promising latent-space manipulations, and it outlines future directions toward physically stable generation and SE(3)-aware architectures. Overall, the approach offers a differentiable route to exploring and designing crystalline materials through 3-D density representations and probabilistic latent spaces, with potential applications in materials discovery.

Abstract

Generative models have achieved impressive results in many domains, including image and text generation. In the natural sciences, generative models have led to rapid progress in automated drug discovery. Many of the current methods focus on either 1-D or 2-D representations of typically small, drug-like molecules. However, many molecules require 3-D descriptors and exceed the chemical complexity of commonly used datasets. We present a method to encode and decode the positions of atoms in 3-D molecules from a dataset of nearly 50,000 stable crystal unit cells that contain from 1 to over 100 atoms. We construct a smooth and continuous 3-D density representation of each crystal based on the positions of its different atoms. Two different neural networks were trained on a dataset of over 120,000 three-dimensional samples of single and repeating crystal structures, made by rotating the single unit cells. The first, an Encoder-Decoder pair, constructs a compressed latent space representation of each molecule and then decodes this description into an accurate reconstruction of the input. The second network segments the resulting output into atoms and assigns each atom an atomic number. By generating compressed, continuous latent space representations of molecules, we are able to decode random samples, interpolate between two molecules, and alter known molecules.
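The density representation described in the abstract can be illustrated with a minimal sketch. The grid size (30 voxels per side) and box length (10 Å) come from the Figure 2 caption; the Gaussian kernel, its width `SIGMA`, the per-atom weight, and the function name `density_grid` are assumptions made for illustration and are not stated in the text above.

```python
import math

GRID = 30    # voxels per side (from the Figure 2 caption)
BOX = 10.0   # box edge length in angstroms (from the Figure 2 caption)
SIGMA = 0.5  # ASSUMPTION: Gaussian width in angstroms, not given in the text


def density_grid(atoms, grid=GRID, box=BOX, sigma=SIGMA):
    """Return a grid x grid x grid nested list of density values.

    atoms: list of (x, y, z, weight) tuples with coordinates in angstroms.
    The weight (for example the atomic number) is a hypothetical choice.
    """
    step = box / grid
    # Coordinates of the voxel centres along one axis.
    centres = [(i + 0.5) * step for i in range(grid)]
    rho = [[[0.0] * grid for _ in range(grid)] for _ in range(grid)]

    def gauss_1d(pos):
        # 1-D Gaussian evaluated at every voxel centre along an axis.
        return [math.exp(-((c - pos) ** 2) / (2.0 * sigma ** 2)) for c in centres]

    for x, y, z, weight in atoms:
        # An isotropic 3-D Gaussian factorises into three 1-D Gaussians.
        gx, gy, gz = gauss_1d(x), gauss_1d(y), gauss_1d(z)
        for i in range(grid):
            for j in range(grid):
                wij = weight * gx[i] * gy[j]
                row = rho[i][j]
                for k in range(grid):
                    row[k] += wij * gz[k]
    return rho
```

Summing one smooth kernel per atom keeps the field continuous in the atomic positions, which is the property the abstract emphasises; a production version would likely use an array library rather than nested lists.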

Paper Structure

This paper contains 26 sections, 7 equations, 15 figures.

Figures (15)

  • Figure 1: Examples of crystal unit cells. Each example shows the unit cell of a crystal. Different colors represent different atomic species. Red, green, and blue lines represent the three axes of the crystal; note that the axes vary in length and angle. Additionally, some unit cells have just one or two atoms (excluding equivalent positions due to translation) while others have nearly 100. Visualizations were made with Mercury [macrae2008mercury].
  • Figure 2: Network Architecture. We encode and decode a $30 \times 30 \times 30$ voxel grid representing 10 Å on each side. Each voxel contains the value of the density. The output of the decoder is passed into a 3-D U-Net. We train the two models in parallel. In the schematic, we show the crystal represented as a repeated unit cell rather than a single unit cell. The black arrows indicate deterministic transformations: the species matrix is constructed from the CIF file, and the density matrix is computed from the species matrix.
  • Figure 3: Single unit cell accuracy. (A) We show the voxel-wise reconstruction error (mean square error) during training. (B) For random molecules in the test set, we plot the number of true atoms against the number of atoms recovered after segmentation. (C) For the reconstructed atoms, we plot the predicted atomic number versus the atomic number of the nearest true atom. (D) We compute the distance from each true atom to the nearest predicted atom and vice-versa (orange and blue, respectively). (E) For three different crystals we plot the predicted versus reconstructed density at each voxel. We also show 2-D slices through the target and prediction, along with the 3-D reconstructions. For plotting, the density in each figure is normalized between 0 and 1, although it is not decoded as such.
  • Figure 4: Repeating unit cell accuracy. (A) For each position in 3-D space, we compute the difference between the truth and the reconstruction. We show different percentile bands of the reconstruction error between the target and predicted density, measured as mean square error (MSE). (B) For the species matrix output by the U-Net, we show the top-1 prediction error relative to the ground truth. (C) For our segmented matrices, we measure the distance from each segmented atom to the nearest ground-truth atom. We show the distance errors by percentile for both the nearest true atom to each predicted atom and vice-versa (orange and blue, respectively). (D) After segmenting the reconstructed density maps, we show the predicted and true number of atoms. (E) We plot the predicted species of each atom versus the true species of the closest corresponding atom, as long as the distance between them is less than 0.5 Å. We find 65.4% are correctly predicted.
  • Figure 5: Accuracy of Model. (A) For each position in 3-D space, we plot the predicted and target density for 4 different random crystals from the test set. The red dashed line is an identity line. (B) For each of the panels in (A), we show 4 different $z$-slices through the true and predicted density fields. (C) We show the full 3-D reconstruction of the prediction and the true density field.
  • ...and 10 more figures
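The atom-level metrics named in the captions of Figures 3 and 4 (distance from each atom to its nearest counterpart, and species agreement within 0.5 Å) can be sketched as follows. This is one plausible reading of the captions, not the authors' evaluation code; the function names and the `(x, y, z, species)` tuple layout are hypothetical.

```python
import math


def nearest_distances(src, dst):
    """For each atom in src, distance in angstroms to the closest atom in dst.

    Atoms are (x, y, z, species) tuples; only the coordinates are used here.
    Calling this with (true, pred) and then (pred, true) gives the two
    directions mentioned in the captions.
    """
    return [min(math.dist(a[:3], b[:3]) for b in dst) for a in src]


def species_accuracy(pred, true, cutoff=0.5):
    """Fraction of predicted atoms whose species matches the nearest true atom.

    Only predicted atoms lying within `cutoff` angstroms of a true atom are
    counted, following the 0.5 angstrom condition in the Figure 4 caption.
    Returns NaN when no predicted atom falls within the cutoff.
    """
    matched = 0
    correct = 0
    for p in pred:
        closest = min(true, key=lambda t: math.dist(p[:3], t[:3]))
        if math.dist(p[:3], closest[:3]) < cutoff:
            matched += 1
            if p[3] == closest[3]:
                correct += 1
    return correct / matched if matched else float("nan")


# Toy example with made-up coordinates and atomic numbers.
true_atoms = [(0.0, 0.0, 0.0, 8), (2.0, 0.0, 0.0, 14)]
pred_atoms = [(0.1, 0.0, 0.0, 8), (2.2, 0.0, 0.0, 13), (5.0, 5.0, 5.0, 1)]
accuracy = species_accuracy(pred_atoms, true_atoms)
```

In the toy example, the third predicted atom is farther than the cutoff from any true atom, so it is excluded from the species comparison, while it still contributes a large value to `nearest_distances(pred_atoms, true_atoms)`.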