WyckoffDiff -- A Generative Diffusion Model for Crystal Symmetry

Filip Ekström Kelvinius, Oskar B. Andersson, Abhijith S. Parackal, Dong Qian, Rickard Armiento, Fredrik Lindsten

TL;DR

WyckoffDiff addresses the challenge of generating crystalline materials that respect symmetry by introducing a symmetry-aware protostructure representation and a discrete diffusion framework driven by a Wyckoff-position graph neural network. The method first samples a space group and then generates occupancy vectors for constrained and unconstrained Wyckoff positions, ensuring valid protostructures while enabling fast discrete sampling. A new Fréchet Wrenformer Distance is proposed to quantify symmetry-aware similarity between generated and training protostructures, and WyckoffDiff demonstrates competitive performance against baselines and yields novel materials, including structures below the convex hull validated via DFT-based checks. The paper argues for a modular workflow that separates protostructure generation from full geometry realization, enabling targeted, symmetry-preserving exploration with practical applications in materials discovery.

Abstract

Crystalline materials often exhibit a high level of symmetry. However, most generative models do not account for symmetry, but rather model each atom without any constraints on its position or element. We propose a generative model, Wyckoff Diffusion (WyckoffDiff), which generates symmetry-based descriptions of crystals. This is enabled by considering a crystal structure representation that encodes all symmetry, and we design a novel neural network architecture which enables using this representation inside a discrete generative model framework. In addition to respecting symmetry by construction, the discrete nature of our model enables fast generation. We additionally present a new metric, Fréchet Wrenformer Distance, which captures the symmetry aspects of the materials generated, and we benchmark WyckoffDiff against recently proposed generative models for crystal generation. As a proof-of-concept study, we use WyckoffDiff to find new materials below the convex hull of thermodynamical stability.

Paper Structure

This paper contains 46 sections, 11 equations, 4 figures, 6 tables, 4 algorithms.

Figures (4)

  • Figure 1: Illustration of the (graph) representation of a material used in our generative model. A material of space group 62 has four Wyckoff positions (a, b, c, d). Two of them (a and b, dark blue) have the constraint that at most one atom can occupy the position, and we hence model each as a single variable indicating which atom type occupies the corresponding position ($\varnothing$ denoting no atom). For the other two positions (c and d, light blue), any number of atoms can occupy the position, and we hence model each as a set of variables, one per atom type, indicating how many atoms of that type occupy the position. To the left is the state of the material at some sampling time $t$, and to the right is the prediction of the "clean" material ${\mathbf{x}}_0$ made by the neural network. Each variable has a corresponding row in the figure; each row is a probability vector and hence sums to 1.
  • Figure 2: Distribution of formation energies predicted by Wren for (unfiltered) protostructures generated by WyckoffDiff-zeros and for novel protostructures, relative to the training set. Q10, Q50, and Q90 are the 10th, 50th, and 90th percentiles, respectively.
  • Figure 3: Selection of three examples of WyckoffDiff-generated crystal structures close to or below the convex hull of WBM and Materials Project (MP). The energy above hull $E_{hull}\ [eV]$ is displayed relative to the convex hull of WBM and MP combined. (a) has a formation energy of $E_{form} = -2.610$, resulting in a negative $E_{hull}$ that is distinctly below the hull. Comparison with the convex hull confirms that structure (a) is indeed below the hull, highlighted with the green star in the phase diagram. (b) has a formation energy of $E_{form} = -2.537$, resulting in a negative $E_{hull}$ that is only insignificantly far from the hull. (c) has a formation energy of $E_{form} = -1.422$, which makes $E_{hull}$ approximately zero. Comparing (b) and (c) with the convex hull shows that these structures are on the hull, indicated by the smaller stars.
  • Figure 4: Distribution of formation energies predicted by Wren for (unfiltered) generated protostructures and novel generated protostructures, relative to the training set for the model. Protostructures are generated by (a) WyckoffDiff-marginal, (b) WyckoffDiff-uniform, and (c) WyckoffDiff-zeros. Q10, Q50, and Q90 are the 10th, 50th, and 90th percentiles, respectively.
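
The representation described in the Figure 1 caption can be made concrete with a small sketch. This is an illustration under stated assumptions, not the authors' implementation. The element vocabulary `ELEMENTS`, the cap `MAX_COUNT` on atoms per type, and the helper names are hypothetical. Random logits stand in for the network's prediction of ${\mathbf{x}}_0$. Constrained positions (a, b) each get one categorical variable over atom types plus the empty symbol. Unconstrained positions (c, d) get one count variable per atom type.

```python
import numpy as np

# Hypothetical toy vocabulary; index 0 plays the role of the empty symbol (no atom).
ELEMENTS = ["<empty>", "O", "Ti", "Ca"]
# Assumed cap on how many atoms of one type may occupy an unconstrained position.
MAX_COUNT = 3


def random_probability_rows(rng, n_rows, n_classes):
    """Return an (n_rows, n_classes) array whose rows are probability vectors.

    Random logits stand in for a network output; a softmax normalizes each row.
    """
    logits = rng.normal(size=(n_rows, n_classes))
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


def sample_rows(rng, probs):
    """Draw one category index per row from its probability vector."""
    return np.array([rng.choice(probs.shape[1], p=row) for row in probs])


rng = np.random.default_rng(0)

# Constrained positions: at most one atom, so one categorical variable each
# over {empty, element types}.
constrained = ["a", "b"]
p_constrained = random_probability_rows(rng, len(constrained), len(ELEMENTS))
x_constrained = sample_rows(rng, p_constrained)  # shape (2,)

# Unconstrained positions: any number of atoms, so one count variable per
# (position, element type) pair, each taking values in {0, ..., MAX_COUNT}.
unconstrained = ["c", "d"]
n_types = len(ELEMENTS) - 1  # the empty symbol is not needed here; a count of 0 covers it
p_unconstrained = random_probability_rows(
    rng, len(unconstrained) * n_types, MAX_COUNT + 1
)
x_unconstrained = sample_rows(rng, p_unconstrained).reshape(
    len(unconstrained), n_types
)  # shape (2, 3)

for name, idx in zip(constrained, x_constrained):
    print(f"position {name}: {ELEMENTS[idx]}")
for name, counts in zip(unconstrained, x_unconstrained):
    print(f"position {name}: {dict(zip(ELEMENTS[1:], counts.tolist()))}")
```

Every variable has its own probability row, matching the row-per-variable layout in the figure. The count cap is a modeling convenience for this sketch only; the source caption does not state one.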