VAE for Modified 1-Hot Generative Materials Modeling, A Step Towards Inverse Material Design
Khalid El-Awady
TL;DR
The paper tackles inverse materials design by enforcing material viability through a modified $1$-hot representation that preserves decomposition relationships between materials. It develops a variational autoencoder (VAE) with latent dimension $n=10$, trained on length-$89$ vectors from the Materials Project dataset by minimizing the negative ELBO; outputs are thresholded at $T=0.04$ and post-processed to produce discrete formulas. Results show that the latent space largely preserves decomposition properties (high cosine similarity between observed and reconstructed component vectors) and that the generated materials match the data's element prevalence with a modest KL divergence (≈$0.08$). This approach supports sequential inverse design: RL policies can operate in a latent space where compositional changes map to linear latent manipulations, potentially improving adherence to viability constraints during material discovery.
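To make the thresholding and element-prevalence comparison concrete, here is a minimal Python sketch. It is not the paper's implementation: the function names, the renormalization after thresholding, and the definition of prevalence as the normalized count of materials containing each element are all assumptions made for illustration. Only the vector length ($89$) and the threshold ($T=0.04$) come from the text.

```python
import numpy as np

N_ELEMENTS = 89   # vector length stated in the text
THRESHOLD = 0.04  # threshold T stated in the text


def threshold_output(decoded, t=THRESHOLD):
    """Zero out entries below t, then renormalize to sum to 1.

    The renormalization step is an assumption, not taken from the paper.
    """
    decoded = np.asarray(decoded, dtype=float)
    kept = np.where(decoded >= t, decoded, 0.0)
    total = kept.sum()
    return kept / total if total > 0 else kept


def element_prevalence(vectors):
    """Normalized count of materials in which each element is present.

    "Present" is assumed to mean a strictly positive entry.
    """
    present = (np.asarray(vectors, dtype=float) > 0).sum(axis=0).astype(float)
    return present / present.sum()


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions, smoothed by eps."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))


# Hypothetical decoder output: two dominant elements plus a small residual.
raw = np.zeros(N_ELEMENTS)
raw[[0, 7, 25]] = [0.60, 0.37, 0.03]
clean = threshold_output(raw)  # the 0.03 entry falls below T and is dropped
```

In this sketch, comparing `element_prevalence` of a generated batch against that of the training data with `kl_divergence` would yield a single scalar of the kind quoted above, though the paper's exact estimator may differ.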
Abstract
We investigate the construction of generative models capable of encoding physical constraints that can be hard to express explicitly. For the problem of inverse material design, where one seeks to design a material with a prescribed set of properties, a significant challenge is ensuring the synthetic viability of a proposed new material. We encode an implicit dataset relationship, namely that certain materials can be decomposed into other ones in the dataset, and present a VAE model capable of preserving this property in the latent space and generating new samples that share it. This is particularly useful in sequential inverse material design, an emergent research area that seeks to design a material with specific properties by sequentially adding (or removing) elements using policies trained through deep reinforcement learning.
