Offline Materials Optimization with CliqueFlowmer

Jakub Grudzien Kuba, Benjamin Kurt Miller, Sergey Levine, Pieter Abbeel

TL;DR

This work introduces CliqueFlowmer, a domain-specific model that incorporates recent advances in clique-based MBO into transformer and flow generation. It validates CliqueFlowmer's optimization abilities and shows that the materials it produces strongly outperform those provided by generative baselines.

Abstract

Recent advances in deep learning have inspired neural network-based approaches to computational materials discovery (CMD). A plethora of problems in this field involve finding materials that optimize a target property. Nevertheless, the increasingly popular generative modeling methods are ineffective at boldly exploring attractive regions of the materials space due to their maximum likelihood training. In this work, we offer an alternative CMD technique based on offline model-based optimization (MBO) that fuses direct optimization of a target material property into generation. To that end, we introduce a domain-specific model, dubbed CliqueFlowmer, that incorporates recent advances in clique-based MBO into transformer and flow generation. We validate CliqueFlowmer's optimization abilities and show that the materials it produces strongly outperform those provided by generative baselines. To enable the use of CliqueFlowmer in specialized materials optimization problems and to support interdisciplinary research, we open-source our code at https://github.com/znowu/CliqueFlowmer.

Paper Structure (41 sections, 45 equations, 14 figures, 6 tables)

Figures (14)

  • Figure 1: The unit cell of a hypothetical material. The cell has the shape of a parallelepiped determined by three axes, $\vec{{\textnormal{a}}}$, $\vec{{\textnormal{b}}}$, and $\vec{{\textnormal{c}}}$. The angles between the axes are $\text{ang}(\vec{{\textnormal{b}}}, \vec{{\textnormal{c}}})=\alpha$, $\text{ang}(\vec{{\textnormal{c}}}, \vec{{\textnormal{a}}})=\beta$, $\text{ang}(\vec{{\textnormal{a}}}, \vec{{\textnormal{b}}})=\gamma$. The cell contains five atoms, whose type sequence is ${\mathbf{a}}=[\text{N, Cl, C, O, S}]$.
  • Figure 2: Computational materials discovery through MBO with CliqueFlowmer. Known materials are encoded into a fixed-dimensional latent space by a transformer encoder and an attention-based pooling layer. The latent variable admits a clique decomposition with respect to the target property, in which each clique contributes additively to the target property. The representations are optimized with evolution strategies-based gradient optimization. Atom types are then decoded with an autoregressive transformer and beam search. The flow model reconstructs the material's geometry from the atom sequence and the latent representation, on which it is conditioned via cross-attention.
  • Figure 3: Examples of materials optimized by CliqueFlowmer for band gap minimization. Each panel shows a material's unit cell, and its caption describes the material's composition. The optimized materials often include dysprosium (Dy) and silicon (Si) as components. The corresponding band gap values are $0.06$, $0.01$, $0.14$, $0.02$, and $0.03$.
  • Figure 4: The distribution of the property value (band gap) among discovered materials. We compare CliqueFlowmer (blue) to materials from the MP-20 dataset (green, left) and to those generated by MatterGen (purple, right). For better interpretability, all evaluations exclude materials with $\Delta_{\text{band}}=0$ (our model excels at discovering them without loss of S.U.N. metrics; see Table \ref{tab:eform_sun_comparison} for results and Appendix \ref{appendix:add-fig}). We visualize the property on a log scale. Materials optimized by CliqueFlowmer have values concentrated near zero, while the others are more evenly spread out.
  • Figure 5: Latent interpolation between two materials. We linearly interpolate $z^{(t)}=(1-t)z^{(0)}+t z^{(1)}$ between As$_3$Rh and MgInBr$_3$ and decode each $z^{(t)}$. The unit cells evolve smoothly in the cell shape, atom positions, and atom count.
  • ...and 9 more figures
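
The captions of Figures 2 and 5 describe three latent-space operations: an additive clique decomposition of the predicted property, evolution strategies-based gradient optimization, and linear interpolation $z^{(t)}=(1-t)z^{(0)}+t z^{(1)}$. The following is a minimal toy sketch of those ideas, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the clique layout (three non-overlapping cliques), the quadratic per-clique terms standing in for learned networks, and the hyperparameters. The encoder, the atom-type decoder, and the flow model are omitted.

```python
import random

# Hypothetical layout: a 6-dimensional latent split into 3 cliques of size 2.
# The cliques here do not overlap; that is a simplifying assumption.
CLIQUES = [(0, 1), (2, 3), (4, 5)]
# Made-up minimizer of each toy clique term (stands in for a learned model).
TARGETS = [(1.0, -1.0), (0.5, 0.5), (-2.0, 0.0)]


def clique_term(z_c, target):
    """Toy per-clique contribution: squared distance to a made-up target."""
    return sum((x - t) ** 2 for x, t in zip(z_c, target))


def predicted_property(z):
    """Additive model: the prediction is the sum of per-clique contributions."""
    return sum(
        clique_term([z[i] for i in idx], tgt)
        for idx, tgt in zip(CLIQUES, TARGETS)
    )


def es_gradient(f, z, sigma=0.1, n_pairs=64, rng=random):
    """Antithetic evolution-strategies estimate of the gradient of f at z."""
    grad = [0.0] * len(z)
    for _ in range(n_pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in z]
        f_plus = f([zi + sigma * e for zi, e in zip(z, eps)])
        f_minus = f([zi - sigma * e for zi, e in zip(z, eps)])
        # Finite difference along the random direction eps, averaged over pairs.
        scale = (f_plus - f_minus) / (2.0 * sigma * n_pairs)
        for i, e in enumerate(eps):
            grad[i] += scale * e
    return grad


def minimize(f, z0, steps=200, lr=0.05, seed=0):
    """Plain gradient descent on f, driven by the ES gradient estimate."""
    rng = random.Random(seed)
    z = list(z0)
    for _ in range(steps):
        g = es_gradient(f, z, rng=rng)
        z = [zi - lr * gi for zi, gi in zip(z, g)]
    return z


def interpolate(z0, z1, t):
    """Linear latent interpolation z(t) = (1 - t) * z0 + t * z1."""
    return [(1 - t) * a + t * b for a, b in zip(z0, z1)]
```

Calling `minimize(predicted_property, [0.0] * 6)` drives the toy property toward its minimum, mirroring the band gap minimization setting of Figure 3. In the full pipeline described in Figure 2, each optimized latent would then be decoded into atom types and geometry. The ES estimator only needs function evaluations, so the same loop would work with any black-box property predictor in place of the quadratic stand-in.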