AceWGS: An LLM-Aided Framework to Accelerate Catalyst Design for Water-Gas Shift Reactions

Joyjit Chattoraj, Brahim Hamadicharef, Teo Shi Chang, Yingzhi Zeng, Chee Kok Poh, Luwei Chen, Teck Leong Tan

TL;DR

The paper tackles the design of low-temperature Water-Gas Shift catalysts, noting that models trained solely on numerical data miss crucial synthesis-text information. It introduces AceWGS, an LLM-RAG framework that integrates querying, database extraction, article comprehension, and inverse modelling to fuse textual and numerical literature data. The architecture leverages open-source tools (Python, Tkinter, LangChain, Ollama) and a Switch module to coordinate tasks, including a theory-guided AI model with particle swarm optimization for inverse design. A case study demonstrates a Pt/Au on α-MoC catalyst achieving CO conversion around 95% at 200 $^{\circ}$C, illustrating accelerated cross-disciplinary catalyst design for the WGS reaction $CO + H_2O \rightarrow CO_2 + H_2$.

Abstract

While the Water-Gas Shift (WGS) reaction plays a crucial role in hydrogen production for fuel cells, finding suitable catalysts to achieve high yields for low-temperature WGS reactions remains a persistent challenge. Artificial Intelligence (AI) has shown promise in accelerating catalyst design by exploring vast candidate spaces; however, two key gaps limit its effectiveness. First, AI models primarily train on numerical data, which fail to capture essential text-based information, such as catalyst synthesis methods. Second, the cross-disciplinary nature of catalyst design requires seamless collaboration between AI, theory, experiments, and numerical simulations, often leading to communication barriers. To address these gaps, we present AceWGS, a Large Language Model (LLM)-aided framework to streamline WGS catalyst design. AceWGS interacts with researchers through natural language, answering queries based on four features: (i) answering general queries, (ii) extracting information about the database comprising WGS-related journal articles, (iii) comprehending the context described in these articles, and (iv) identifying catalyst candidates using our proposed AI inverse model. We present a practical case study demonstrating how AceWGS can accelerate the catalyst design process. AceWGS, built with open-source tools, offers an adjustable framework that researchers can readily adapt for a range of AI-accelerated catalyst design applications, supporting seamless integration across cross-disciplinary studies.
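
As a rough illustration of how a single entry point could route a natural-language query to one of the four features, the sketch below uses plain keyword matching. This is an assumption made for illustration only: the source does not describe how the Switch module decides, and the keyword lists, `ROUTES`, `HANDLERS`, `switch`, and `dispatch` are all hypothetical names, with stub handlers standing in for the real LLM, database, RAG, and inverse-model components.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical keyword lists, checked in order; anything unmatched falls
# through to the general-query feature. Not the paper's routing logic.
ROUTES: List[Tuple[str, Tuple[str, ...]]] = [
    ("inverse", ("inverse", "candidate", "optimi")),
    ("database", ("how many", "database", "dataset")),
    ("article", ("article", "paper", "according to")),
]

# Stub handlers standing in for the four features (i)-(iv).
HANDLERS: Dict[str, Callable[[str], str]] = {
    "general": lambda q: f"[general LLM] {q}",
    "database": lambda q: f"[database extraction] {q}",
    "article": lambda q: f"[article comprehension] {q}",
    "inverse": lambda q: f"[inverse model] {q}",
}


def switch(query: str) -> str:
    """Return the name of the feature that should handle the query."""
    lowered = query.lower()
    for name, keywords in ROUTES:
        if any(keyword in lowered for keyword in keywords):
            return name
    return "general"


def dispatch(query: str) -> str:
    """Route the query and return the chosen handler's response."""
    return HANDLERS[switch(query)](query)


if __name__ == "__main__":
    print(dispatch("How many articles are in the database?"))
```

A keyword router is only the simplest possible stand-in; a classifier or an LLM prompt could fill the same role without changing the `dispatch` interface.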

Paper Structure

This paper contains 10 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: AceWGS framework utilizing a graphical user interface (GUI) and large language models (LLMs) with retrieval-augmented generation (RAG) to accelerate catalyst design for Water-Gas Shift (WGS) reactions.
  • Figure 2: (a) Feature-1 contains an LLM to answer general queries. (b) A typical answer generated by Feature-1 to a catalyst-related question. (c) A typical answer generated by Feature-1 to an AI-related question.
  • Figure 3: (a) Feature-2 consists of two tools: (i) an LLM agent that takes a data frame and a customized prompt as inputs, and (ii) an execution tool that runs the Python command suggested by the LLM agent. (b) Typical responses generated by Feature-2 in answer to two questions designed to retrieve information from the local database.
  • Figure 4: (a) Feature-3 is an LLM-RAG that takes a vector retriever tool and a customized prompt as inputs. (b) The workflow of the vector retriever tool. (c) Typical responses generated by Feature-3 in answer to four questions set to comprehend a research article.
  • Figure 5: (a) Feature-4 contains three tools: (i) the Parameter Settings tool, a GUI where researchers can set the required catalyst design parameters, e.g., base metals, supports, promoters, preparation methods, and reaction conditions; (ii) the Inverse Model, which searches for the best catalytic candidates based on the set design parameters; and (iii) the Prompt-guided LLM, which takes the outputs of the Inverse Model and explains them in natural language. (b) The Inverse Model framework, where the model initially takes a set of parameters, predicts the CO conversion percentage using our theory-guided AI model, and then performs particle swarm optimization to search for catalytic candidates. (c) A typical query on inverse modelling results in a sequence of responses starting from setting parameters, stating the status of inverse modelling, and finally displaying the solution.
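
The inverse-modelling loop described for Figure 5(b), in which a predictor scores candidates and particle swarm optimization searches for the best ones, can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: `toy_surrogate` is a hypothetical placeholder with an arbitrary peak and no chemical meaning (it is not the theory-guided AI model), the two design variables are normalized to [0, 1], and the swarm hyperparameters are common textbook defaults rather than values from the paper.

```python
import math
import random


def toy_surrogate(x):
    """Hypothetical stand-in for a CO-conversion predictor (percent).

    A smooth bump with an arbitrary peak at (0.3, 0.7); it has no
    physical meaning and only gives the optimizer something to climb.
    """
    peak = (0.3, 0.7)
    dist2 = sum((xi - pi) ** 2 for xi, pi in zip(x, peak))
    return 100.0 * math.exp(-dist2 / 0.1)


def pso_maximize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c_cog=1.5, c_soc=1.5, seed=0):
    """Basic global-best particle swarm optimization (maximization)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]             # per-particle best positions
    best_val = [objective(p) for p in pos]     # per-particle best scores
    g = max(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_cog * r1 * (best_pos[i][d] - pos[i][d])
                             + c_soc * r2 * (g_pos[d] - pos[i][d]))
                # Clamp each coordinate to its allowed range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val > g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


best_x, best_y = pso_maximize(toy_surrogate, bounds=[(0.0, 1.0), (0.0, 1.0)])
print("best design:", best_x, "predicted conversion:", best_y)
```

In a real setting, categorical choices such as base metal, support, or preparation method would need an encoding before a continuous optimizer like this could search over them; how AceWGS handles that is not stated here.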