CrystalFormer-RL: Reinforcement Fine-Tuning for Materials Design

Zhendong Cao; Lei Wang

TL;DR

CrystalFormer-RL addresses the challenge of designing crystalline materials with multiple, potentially conflicting properties by combining a crystal generative model with discriminative reward signals. The authors implement reinforcement fine-tuning, inspired by RLHF, using a machine learning interatomic potential (MLIP) and property predictors as surrogate rewards to steer CrystalFormer toward stable structures and targeted figures of merit (FoM). They demonstrate substantial gains in stability (a higher fraction of generated materials with $E_{\mathrm{hull}}<0.1$ eV/atom) and in property-guided discovery (high-FoM materials with both large $E_g$ and large $\varepsilon_{\mathrm{elec}}$), while preserving computational efficiency. The work illustrates a flexible, plug-and-play framework for materials design that leverages existing discriminative models to guide generative search and material retrieval.

Abstract

Reinforcement fine-tuning has played an instrumental role in enhancing the instruction-following and reasoning abilities of large language models. In this work, we employ reinforcement fine-tuning for materials design, in which discriminative machine learning models provide rewards to CrystalFormer, an autoregressive transformer-based generative model for materials. By optimizing reward signals, such as the energy above the convex hull and material property figures of merit, reinforcement fine-tuning infuses knowledge from discriminative models into generative models. The resulting model, CrystalFormer-RL, shows enhanced stability in generated crystals and successfully discovers crystals with desirable yet conflicting material properties, such as a substantial dielectric constant and band gap simultaneously. Notably, we observe that reinforcement fine-tuning not only enables property-guided material design but also unlocks property-based material retrieval behavior in the pretrained generative model. The present framework opens an exciting gateway to the synergies of the machine learning ecosystem for materials design.
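To make the idea of "infusing" a discriminative reward into a generative model concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a KL-regularized objective of the kind commonly used in RLHF, $\mathbb{E}_{p}[r(x)] - \tau\,\mathrm{KL}(p \,\|\, p_{\mathrm{base}})$, whose maximizer over a discrete set has the closed form $p^{*}(x) \propto p_{\mathrm{base}}(x)\, e^{r(x)/\tau}$. The four candidates, their $E_{\mathrm{hull}}$ values, the reward $r = -E_{\mathrm{hull}}$, and the function names `stability_reward` and `tilted_distribution` are all hypothetical choices made for illustration.

```python
import math


def stability_reward(e_hull):
    """Hypothetical reward: a lower energy above the hull scores higher."""
    return -e_hull


def tilted_distribution(base_probs, rewards, tau):
    """Closed-form maximizer of E_p[r] - tau * KL(p || base) over a finite set.

    Each base probability is reweighted by exp(r / tau), then renormalized.
    """
    weights = [p * math.exp(r / tau) for p, r in zip(base_probs, rewards)]
    normalizer = sum(weights)
    return [w / normalizer for w in weights]


def stable_mass(probs, e_hulls, threshold=0.1):
    """Total probability assigned to candidates with E_hull below the threshold."""
    return sum(p for p, e in zip(probs, e_hulls) if e < threshold)


# Made-up toy data: four candidate crystals, uniform under the base model,
# with illustrative E_hull values in eV/atom.
base = [0.25, 0.25, 0.25, 0.25]
e_hull = [0.00, 0.05, 0.20, 0.50]

rewards = [stability_reward(e) for e in e_hull]
tuned = tilted_distribution(base, rewards, tau=0.1)

before = stable_mass(base, e_hull)
after = stable_mass(tuned, e_hull)
print(f"stable fraction before: {before:.3f}, after: {after:.3f}")
```

In this toy setting the reweighted distribution shifts probability toward the low-$E_{\mathrm{hull}}$ candidates, and the temperature `tau` controls how far it moves away from the base model. A real generative model over crystals cannot enumerate its outputs, so in practice the same objective would have to be optimized by sampling and gradient updates rather than by this closed form.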
