Crystal-GFN: sampling crystals with desirable properties and constraints
Mila AI4Science, Alex Hernandez-Garcia, Alexandre Duval, Alexandra Volokhova, Yoshua Bengio, Divya Sharma, Pierre Luc Carrier, Yasmine Benabed, Michał Koziarski, Victor Schmidt
TL;DR
To address the challenge of discovering stable inorganic crystals, the paper presents Crystal-GFN, a generative model that sequentially samples the space group, composition, and lattice parameters of a crystal under hard domain constraints. It trains a GFlowNet using a proxy formation energy model trained on MatBench as the reward, which enables diverse sampling of low-energy crystals. In experiments, Crystal-GFN generated 10k crystals with a median predicted formation energy of -3.1 eV/atom, 95% of them below -2 eV/atom, while covering a broad range of space groups, lattice systems, and elements. The approach shows how domain-informed representations and flexible reward functions can accelerate materials discovery while maintaining structural validity.
Abstract
Accelerating material discovery holds the potential to greatly help mitigate the climate crisis. Discovering new solid-state materials such as electrocatalysts, super-ionic conductors or photovoltaic materials can have a crucial impact, for instance, in improving the efficiency of renewable energy production and storage. In this paper, we introduce Crystal-GFN, a generative model of crystal structures that sequentially samples structural properties of crystalline materials, namely the space group, composition and lattice parameters. This domain-inspired approach enables the flexible incorporation of physical and structural hard constraints, as well as the use of any available predictive model of a desired physicochemical property as an objective function. To design stable materials, one must target the candidates with the lowest formation energy. Here, we use as objective the formation energy per atom of a crystal structure predicted by a new proxy machine learning model trained on MatBench. The results demonstrate that Crystal-GFN is able to sample highly diverse crystals with low (median -3.1 eV/atom) predicted formation energy.
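To make the idea of a structural hard constraint concrete, the sketch below shows one way a previously chosen space group can restrict the lattice parameters that may be sampled next. It is an illustration only, not the Crystal-GFN implementation: the function names, the length and angle ranges, and the use of uniform random sampling in place of a learned GFlowNet policy are all assumptions made for this example. The mapping from space group number to crystal system and the per-system cell constraints are standard crystallography; trigonal groups are described here in the hexagonal axis setting.

```python
import math
import random

# International space-group number ranges for each crystal system.
CRYSTAL_SYSTEMS = [
    (1, 2, "triclinic"),
    (3, 15, "monoclinic"),
    (16, 74, "orthorhombic"),
    (75, 142, "tetragonal"),
    (143, 167, "trigonal"),
    (168, 194, "hexagonal"),
    (195, 230, "cubic"),
]


def crystal_system(space_group):
    """Return the crystal system of a space group given its number (1-230)."""
    for low, high, name in CRYSTAL_SYSTEMS:
        if low <= space_group <= high:
            return name
    raise ValueError(f"space group must be in 1..230, got {space_group}")


def cell_is_valid(alpha, beta, gamma):
    """Check that three cell angles (degrees) give a positive cell volume."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return 1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg > 1e-6


def sample_lattice(space_group, rng, min_len=2.0, max_len=10.0):
    """Sample (a, b, c, alpha, beta, gamma) compatible with the space group.

    Hypothetical helper: lengths are drawn uniformly from [min_len, max_len]
    and free angles uniformly from [60, 120] degrees (assumed ranges).
    """
    def length():
        return rng.uniform(min_len, max_len)

    def angle():
        return rng.uniform(60.0, 120.0)

    system = crystal_system(space_group)
    a = length()
    if system == "cubic":          # a = b = c, all angles 90
        return (a, a, a, 90.0, 90.0, 90.0)
    if system in ("hexagonal", "trigonal"):  # a = b, gamma = 120
        return (a, a, length(), 90.0, 90.0, 120.0)
    if system == "tetragonal":     # a = b, all angles 90
        return (a, a, length(), 90.0, 90.0, 90.0)
    if system == "orthorhombic":   # free lengths, all angles 90
        return (a, length(), length(), 90.0, 90.0, 90.0)
    if system == "monoclinic":     # unique axis b: only beta is free
        return (a, length(), length(), 90.0, angle(), 90.0)
    # Triclinic: everything is free, so reject degenerate angle triples.
    b, c = length(), length()
    while True:
        alpha, beta, gamma = angle(), angle(), angle()
        if cell_is_valid(alpha, beta, gamma):
            return (a, b, c, alpha, beta, gamma)
```

For instance, `sample_lattice(225, random.Random(0))` always returns a cell with three equal lengths and three right angles, because space group 225 is cubic. In a sequential sampler, this kind of restriction would shrink the action space at each step so that invalid structures are never proposed, rather than being filtered out afterwards.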
