dZiner: Rational Inverse Design of Materials with AI Agents

Mehrad Ansari, Jeffrey Watchorn, Carla E. Brown, Joseph S. Brown

TL;DR

The paper presents dZiner, an LLM-powered chemist agent that fuses literature-derived design guidelines with domain-expert surrogates to perform rational inverse design across diverse material targets. By supporting closed-loop and human-in-the-loop workflows, dZiner can propose, evaluate, and refine candidates with feasibility and uncertainty considerations, using task-specific surrogates for CMC, docking, and MOF CO2 uptake. Across surfactants, WDR5 ligands, and MOF linkers, domain-guided design yields substantial performance gains and higher synthesizability compared with literature-free baselines, demonstrated with open-source code. This approach offers a flexible, scalable pathway to accelerate targeted material design and discovery with human-AI collaboration.

Abstract

Recent breakthroughs in machine learning and artificial intelligence, fueled by scientific data, are revolutionizing the discovery of new materials. Despite the wealth of existing scientific literature, the availability of both structured experimental data and chemical domain knowledge that can be easily integrated into data-driven workflows is limited. The motivation to integrate this information, as well as additional context from first-principles calculations and physics-informed deep learning surrogate models, is to enable efficient exploration of the relevant chemical space and to predict structure-property relationships of new materials a priori. Ultimately, such a framework could replicate the expertise of human subject-matter experts. In this work, we present dZiner, a chemist AI agent, powered by large language models (LLMs), that discovers new compounds with desired properties via inverse design (property-to-structure). Specifically, the agent leverages domain-specific insights from foundational scientific literature to propose new materials with enhanced chemical properties, iteratively evaluating them using relevant surrogate models in a rational design process, while accounting for design constraints. The model supports both closed-loop and human-in-the-loop feedback cycles, enabling human-AI collaboration in molecular design with real-time property inference and assessment of uncertainty and chemical feasibility. We demonstrate the flexibility of this agent by applying it to a range of materials design targets, including surfactants, ligand and drug candidates, and metal-organic frameworks. Our approach holds promise to both accelerate the discovery of new materials and enable the targeted design of materials with desired functionalities. The methodology is available as open-source software at https://github.com/mehradans92/dZiner.

Paper Structure (16 sections, 3 equations, 5 figures, 2 tables, 1 algorithm)

Figures (5)

  • Figure 1: dZiner workflow overview. The workflow starts with the material's initial structure, supplied as a textual representation. The AI agent dynamically retrieves domain knowledge (design guidelines) for Property X from scientific literature, the Internet, or other resources. Based on these guidelines, and any additional design constraints provided in natural language, the agent proposes a new candidate and assesses its chemical feasibility in real time. Next, it estimates Property X for the new candidate, incorporating epistemic uncertainty, using a cost-efficient surrogate model. Optionally, as part of a human-in-the-loop process, the human chemist can review the agent's new candidates and chain-of-thoughts, providing feedback and suggesting further modifications or constraints, creating an opportunity for human-AI collaboration to guide the exploration process. The agent continues exploring the chemical space, guided by chemistry-informed rules, until it meets the convergence criteria.
  • Figure 2: dZiner's chain-of-thoughts in the closed-loop inverse design of surfactants with lower CMC. The agent is powered by Claude 3.5 Sonnet. The design guidelines are retrieved from references czajka2015surfactants, gaudin2016new, mozrzymas2011prediction, huibers1997prediction, li2004estimation, xuefeng2006correlation, moriarty2023analyzing, and boukelkal2024qspr, and the model is asked, in natural-language text, to keep the molecular weight below 600 g/mol. CMC is reduced by two orders of magnitude via iterative agent-suggested chemical modifications.
  • Figure 3: dZiner's chain-of-thoughts in the closed-loop inverse design of a drug candidate against WDR5 protein target. The design guidelines are extracted by the agent from references chen2018design and teuscher2023structure, and the model is asked, in natural-language text, to keep the molecular weight below 600 g/mol. Docking score is reduced by just over two orders of magnitude via iterative agent-suggested chemical modifications (Dock Score $\propto$ log(kcal/mol)).
  • Figure 4: Human-in-the-loop inverse design of a drug candidate against WDR5 protein target. The agent is powered by Claude 3.5 Sonnet. dZiner was found to be highly cooperative and interpretable, and it readily enacted the requested changes in this human-AI collaboration.
  • Figure 5: dZiner's chain-of-thoughts in the closed-loop inverse design of organic linkers for MOFs with high CO$_2$ adsorption capacity. The agent is powered by Claude 3.5 Sonnet. Design guidelines were retrieved from references usman2022advanced, trickett2017chemistry, matthew2017rational, and parveen2024designing. CO$_2$ adsorption capacity is improved by 85% via iterative agent-suggested chemical modifications, while following additional design constraints.
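
The closed-loop workflow described in the Figure 1 caption (propose a candidate, check feasibility, estimate Property X with uncertainty, repeat until convergence) can be illustrated with a minimal Python sketch. This is not dZiner's actual code or API: `design_loop`, `Candidate`, and the three toy callables are hypothetical names introduced here, the LLM-driven proposal step is replaced by a trivial string edit, the surrogate is a made-up formula, and the optional human review step is omitted. The sketch assumes the goal is to minimize Property X.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    structure: str      # textual representation, e.g. a SMILES-like string
    prediction: float   # surrogate estimate of Property X
    uncertainty: float  # uncertainty reported by the surrogate


def design_loop(
    initial: str,
    propose: Callable[[str, List[Candidate]], str],
    is_feasible: Callable[[str], bool],
    surrogate: Callable[[str], Tuple[float, float]],
    target: float,
    max_iters: int = 10,
) -> List[Candidate]:
    """Hypothetical closed loop that tries to push Property X down to `target`."""
    mean, std = surrogate(initial)
    history = [Candidate(initial, mean, std)]
    current = initial
    for _ in range(max_iters):
        proposal = propose(current, history)
        if not is_feasible(proposal):
            continue  # discard candidates that violate feasibility or constraints
        mean, std = surrogate(proposal)
        history.append(Candidate(proposal, mean, std))
        # Continue from the best candidate evaluated so far.
        best = min(history, key=lambda c: c.prediction)
        current = best.structure
        if best.prediction <= target:
            break  # convergence criterion met
    return history


# Toy stand-ins (assumptions for illustration only, not real chemistry).
def toy_propose(current: str, history: List[Candidate]) -> str:
    return "C" + current  # "modify" the structure by extending the chain


def toy_feasible(structure: str) -> bool:
    return len(structure) <= 12  # stand-in for a molecular-weight limit


def toy_surrogate(structure: str) -> Tuple[float, float]:
    return 10.0 / len(structure), 0.1  # made-up property and fixed uncertainty


history = design_loop("CCO", toy_propose, toy_feasible, toy_surrogate, target=2.0)
for cand in history:
    print(cand.structure, round(cand.prediction, 2), cand.uncertainty)
```

Passing the proposal, feasibility, and surrogate steps as callables keeps the loop independent of any particular task, which mirrors how the captions describe swapping in different surrogates for CMC, docking score, and CO$_2$ adsorption.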