Inverse Materials Design by Large Language Model-Assisted Generative Framework
Yun Hao, Che Fan, Beilin Ye, Wenhao Lu, Zhen Lu, Peilin Zhao, Zhifeng Gao, Qingyao Wu, Yanhui Liu, Tongqi Wen
TL;DR
This paper presents AlloyGAN, a closed-loop inverse materials-design framework that combines Large Language Model (LLM)-assisted text mining with a conditional generative adversarial network (CGAN) to generate alloy compositions that meet target properties. The CGAN is trained on a dataset built from literature-mined records and thermodynamic descriptors, and the compositions it generates are iteratively screened and validated experimentally; for metallic glasses, predicted thermodynamic properties fall within $8\%$ of measured values. Downstream tasks show high accuracy in glass-forming ability (GFA) classification (F1 $\approx 0.90$) and strong regression performance ($R^2$ up to $0.80$ for key properties), underscoring the framework's versatility. The approach highlights the value of integrating LLM-derived knowledge with generative modeling and experimental feedback to accelerate autonomous materials discovery, particularly for multi-component metallic glasses.
Abstract
Deep generative models hold great promise for inverse materials design, yet their efficiency and accuracy remain constrained by data scarcity and model architecture. Here, we introduce AlloyGAN, a closed-loop framework that integrates Large Language Model (LLM)-assisted text mining with Conditional Generative Adversarial Networks (CGANs) to enhance data diversity and improve inverse design. Taking alloy discovery as a case study, AlloyGAN systematically refines material candidates through iterative screening and experimental validation. For metallic glasses, the framework predicts thermodynamic properties with discrepancies of less than 8% from experiments, demonstrating its robustness. By bridging generative AI with domain knowledge and validation workflows, AlloyGAN offers a scalable approach to accelerate the discovery of materials with tailored properties, paving the way for broader applications in materials science.
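As a reading aid for the "less than 8%" figure, the following minimal sketch shows one common way such a discrepancy could be computed. It assumes the discrepancy is a relative error normalized by the measured value, which the abstract does not state explicitly. The function names and the numeric values are hypothetical and are not taken from the paper.

```python
def relative_discrepancy(predicted: float, measured: float) -> float:
    """Return |predicted - measured| / |measured| as a fraction.

    Assumption: the discrepancy is normalized by the measured value.
    """
    if measured == 0:
        raise ValueError("measured value must be nonzero")
    return abs(predicted - measured) / abs(measured)


def within_tolerance(predicted: float, measured: float, tol: float = 0.08) -> bool:
    """Check whether a prediction lies strictly within `tol` of the measurement."""
    return relative_discrepancy(predicted, measured) < tol


# Hypothetical values for illustration only, not data from the paper.
example_predicted = 700.0
example_measured = 730.0
print(relative_discrepancy(example_predicted, example_measured))
print(within_tolerance(example_predicted, example_measured))
```

Under this assumed definition, a candidate passes the check when its predicted property differs from the experimental value by less than 8% of that experimental value.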
