Accelerating inverse materials design using generative diffusion models with reinforcement learning

Junwu Chen, Jeff Guo, Edvin Fako, Philippe Schwaller

TL;DR

This work introduces MatInvent, an RL-driven framework that fine-tunes pre-trained diffusion models to perform goal-directed crystal generation with multiple, potentially conflicting, property objectives. By modeling denoising steps as a multi-step MDP and applying reward-weighted KL regularization together with experience replay and a diversity filter, MatInvent delivers fast convergence (around 60 iterations) and substantial reductions in property evaluation needs compared with conditional generation approaches. The approach achieves both single-property and multi-property optimization across electronic, magnetic, mechanical, thermal, and dielectric domains, producing SUN structures with target-like properties under tight computational budgets and enabling Pareto optimization for challenging dielectric/magnet design tasks. The method demonstrates broad compatibility with diffusion architectures and suggests promising directions for uncertainty-aware predictors, curriculum-based multi-objective strategies, and integration with automated laboratories for closed-loop material discovery.
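The bookkeeping around one RL iteration (reward assignment, diversity filter, selection of high-reward samples, experience replay) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the names `target_reward`, `DiversityFilter`, `ReplayBuffer`, and `rl_iteration`, the linear reward shape, the top-k selection rule, and the repeat-count penalty are all assumptions. The diffusion-model update itself is left out.

```python
import heapq
from collections import Counter
from dataclasses import dataclass, field


@dataclass(order=True)
class Sample:
    reward: float                          # samples are ordered by reward only
    composition: str = field(compare=False)


def target_reward(value, target, tolerance):
    """Hypothetical reward: 1 at the target, falling linearly to 0 at +/- tolerance."""
    return max(0.0, 1.0 - abs(value - target) / tolerance)


class DiversityFilter:
    """Assumed rule: scale down the reward of a composition seen too often."""

    def __init__(self, max_repeats, penalty=0.0):
        self.max_repeats = max_repeats
        self.penalty = penalty
        self.counts = Counter()

    def __call__(self, composition, reward):
        self.counts[composition] += 1
        if self.counts[composition] > self.max_repeats:
            return reward * self.penalty
        return reward


class ReplayBuffer:
    """Keeps the `capacity` highest-reward samples seen so far."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []                    # min-heap keyed on reward

    def add(self, sample):
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, sample)
        elif sample.reward > self._heap[0].reward:
            heapq.heapreplace(self._heap, sample)

    def top(self):
        return sorted(self._heap, reverse=True)


def rl_iteration(batch, reward_fn, div_filter, buffer, top_k):
    """Score a batch of (composition, property value) pairs and return the
    samples that would be passed to the fine-tuning step: the top-k of this
    batch plus the replayed samples from earlier iterations."""
    scored = [Sample(div_filter(c, reward_fn(v)), c) for c, v in batch]
    selected = sorted(scored, reverse=True)[:top_k]
    replayed = buffer.top()                # read before adding the new samples
    for s in selected:
        buffer.add(s)
    return selected + replayed
```

In use, `batch` would hold only the structures that survive geometry optimization and SUN filtering, and the returned list would feed the policy-optimization step with reward-weighted KL regularization.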

Abstract

Diffusion models promise to accelerate material design by directly generating novel structures with desired properties, but existing approaches typically require expensive and substantial labeled data ($>$10,000) and lack adaptability. Here we present MatInvent, a general and efficient reinforcement learning workflow that optimizes diffusion models for goal-directed crystal generation. For single-objective designs, MatInvent rapidly converges to target values within 60 iterations ($\sim$1,000 property evaluations) across electronic, magnetic, mechanical, thermal, and physicochemical properties. Furthermore, MatInvent achieves robust optimization in design tasks with multiple conflicting properties, successfully proposing low-supply-chain-risk magnets and high-$\kappa$ dielectrics. Compared to state-of-the-art methods, MatInvent exhibits superior generation performance under specified property constraints while dramatically reducing the demand for property computation by up to 378-fold. Compatible with diverse diffusion model architectures and property constraints, MatInvent could offer broad applicability in materials discovery.

Paper Structure

This paper contains 59 sections, 48 equations, 19 figures, 1 table.

Figures (19)

  • Figure 1: MatInvent workflow for goal-directed material generation. (a) Schematic overview of the MatInvent methodology. In each reinforcement learning (RL) iteration, the diffusion model acts as the RL agent to generate a batch of 3D crystal structures, which are subsequently geometrically optimized using machine learning potentials. Only valid, Stable, Unique, and Novel (SUN) structures are retained after filtering, proceeding to target property evaluation and reward assignment. High-reward samples are then used to fine-tune the diffusion model by policy optimization with reward-weighted Kullback–Leibler (KL) regularization, aided by experience replay and a diversity filter to enhance sample efficiency and diversity. (b) The impact of geometry optimization (opt) and SUN filtering before property evaluation on the SUN ratio of generated structures during the RL process targeting a density of 18.0 g/cm$^3$. (c) The effect of experience replay on the optimization efficiency of the RL process targeting a density of 18.0 g/cm$^3$. (d) The role of the diversity filter in the composition diversity of generated structures during the RL process with a target density of 18.0 g/cm$^3$.
  • Figure 2: MatInvent performance on single property optimization. The optimization curves (left) for reinforcement learning (RL) and visualizations of some generated crystal structures (right) on different inverse design tasks with a single target property: (a) band gap equal to 3.0 eV; (b) magnetic density higher than 0.2 $\text{\AA}^{-3}$; (c) specific heat capacity exceeding 1.5 J/g/K; (d) minimal co-incident area (MCIA) below 80 $\text{\AA}^{2}$ on the Si(100) substrate; (e) bulk modulus of 300 GPa; (f) total dielectric constants exceeding 80; (g) synthesizability score higher than 0.9; and (h) Herfindahl–Hirschman index (HHI) score below 1250. Ten repeated experiments were performed for tasks c–h and three for tasks a and b. The curves show the mean over the repeated experiments, and the shading represents the standard deviation.
  • Figure 3: Comparison between conditional generation and reinforcement learning. (a) Number of DFT-labeled data used for model fine-tuning in the MatInvent workflow and conditional generation of MatterGen across two inverse design tasks. (b) SUN ratios of generated structures from MatterGen conditional generation and RL-finetuned diffusion model following the MatInvent workflow. Probability density distributions of property values of SUN structures generated by RL-finetuned diffusion models and MatterGen's conditional generation, respectively, for inverse design targets of (c) magnetic density higher than 0.2 $\text{\AA}^{-3}$ and (d) band gap of 3.0 eV. Number of SUN structures satisfying property requirements discovered by MatterGen conditional generation and RL-finetuned diffusion models within 250 DFT property calculations, for targets of (e) magnetic density higher than 0.2 $\text{\AA}^{-3}$ and (f) band gap of $3 \pm 0.1$ eV.
  • Figure 4: Designing permanent magnets with low supply chain risk. (a) Property distribution of SUN structures generated during the initial (0–20 loops) and final (100–120 loops) stages of the RL process. (b) Mean values of target properties of SUN structures generated in each RL iteration. (c) Amount of DFT-labeled data used for model fine-tuning (left) and SUN ratios of generated structures (right) for MatterGen conditional generation and MatInvent workflow. (d) Number of SUN structures satisfying property requirements found by MatterGen conditional generation and RL-finetuned diffusion models within 200 DFT property calculations, for targets with magnetic density above 0.2 $\text{\AA}^{-3}$ and HHI score below 1500. (e) Visualizations of some SUN structures generated by RL-finetuned diffusion models, along with their chemical formula, space group, energy above hull ($E_{hull}$), magnetic density, HHI score, and synthesizability score.
  • Figure 5: Designing novel high-$\kappa$ dielectrics. (a) Property distribution of SUN structures generated during the initial (0–20 loops) and final (220–240 loops) stages of the RL process. (b) Evolution of Pareto fronts across RL iterations for two conflicting material properties: dielectric constant and band gap. (c) DFT-calculated property distribution of SUN structures generated by the RL-finetuned diffusion model, which were ranked and selected based on ML predictions. (d) Number of SUN structures satisfying property requirements found by RL-finetuned diffusion models within 200 DFT property calculations, for objectives of band gap ($E_g$) exceeding 3.0 eV, total dielectric constant ($\varepsilon_{\text{total}}$) surpassing 30, and figure of merit (FoM) higher than 210. (e) Distribution of DFT-computed figure of merit for generated structures by the pre-trained and RL-finetuned diffusion models. (f) Visualizations of some SUN structures generated by RL-finetuned diffusion models, along with their chemical formula, space group, energy above hull ($E_{hull}$), synthesizability score, $E_g$, $\varepsilon_{\text{total}}$, and FoM.
  • ...and 14 more figures
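
For readers unfamiliar with the Pareto fronts mentioned in the Figure 5(b) caption, the sketch below shows one common way to extract a front when two properties are both maximized, such as band gap and dielectric constant. The function name `pareto_front` and the example numbers are made up for illustration; the source does not state how the authors compute their fronts.

```python
def pareto_front(points):
    """Return the non-dominated (x, y) points when both x and y are maximized.

    A point is dominated if another point is at least as good in both
    coordinates and differs from it in at least one.
    """
    front = []
    best_y = float("-inf")
    # Walk from the largest x downward; a point survives only if its y
    # strictly exceeds every y already seen at larger-or-equal x.
    for x, y in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            front.append((x, y))
            best_y = y
    return front


# Made-up (band gap, dielectric constant) pairs, not data from the paper.
candidates = [(1.0, 100.0), (3.0, 40.0), (2.0, 40.0), (5.0, 10.0), (4.0, 5.0)]
front = pareto_front(candidates)
```

Here `(2.0, 40.0)` is dropped because `(3.0, 40.0)` matches its second coordinate with a larger first one, and `(4.0, 5.0)` is dropped because `(5.0, 10.0)` beats it in both.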