MOFA: Discovering Materials for Carbon Capture with a GenAI- and Simulation-Based Workflow

Xiaoli Yan; Nathaniel Hudson; Hyun Park; Daniel Grzenda; J. Gregory Pauloski; Marcus Schwarting; Haochen Pan; Hassan Harb; Samuel Foreman; Chris Knight; Tom Gibbs; Kyle Chard; Santanu Chaudhuri; Emad Tajkhorshid; Ian Foster; Mohamad Moosavi; Logan Ward; E. A. Huerta


TL;DR

MOFA presents an HPC-coupled Generative AI and simulation workflow for rapid MOF discovery targeted at carbon capture. It unifies GPU-accelerated GenAI with CPU/GPU-optimized atomistic screenings in an online learning loop to continuously improve linker generation and MOF quality. Across a 450-node run, MOFA generated over 100 MOFs per hour and identified multiple high-potential candidates, with retraining boosting both stability and adsorption performance. The modular design and open-source implementation enable adaptation to other materials domains and future exploration of adaptive learning and systems research in heterogeneous HPC environments.

Abstract

We present MOFA, an open-source generative AI (GenAI) plus simulation workflow for high-throughput generation of metal-organic frameworks (MOFs) on large-scale high-performance computing (HPC) systems. MOFA addresses key challenges in integrating GPU-intensive GenAI tasks, including distributed training and inference, with CPU- and GPU-optimized tasks for screening and filtering AI-generated MOFs using molecular dynamics, density functional theory, and Monte Carlo simulations. These heterogeneous tasks are unified within an online learning framework that optimizes the utilization of available CPU and GPU resources across HPC systems. Performance metrics from a 450-node (14,400 AMD Zen 3 CPU cores + 1,800 NVIDIA A100 GPUs) supercomputer run demonstrate that MOFA achieves high-throughput generation of novel MOF structures, with CO$_2$ adsorption capacities ranking among the top 10 in the hypothetical MOF (hMOF) dataset. Furthermore, the production of high-quality MOFs scales linearly with the number of nodes utilized. The modular architecture of MOFA will facilitate its integration into other scientific applications that dynamically combine GenAI with large-scale simulations.
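The aggregate resource counts quoted for the 450-node run can be checked with simple arithmetic. The sketch below assumes that each node matches the worker described in the Figure 2 caption (one 32-core CPU and four GPUs); the constant and function names are hypothetical and are not part of MOFA.

```python
# Per-node figures taken from the Figure 2 caption; treating one worker as
# one node is an assumption made for this illustration.
CORES_PER_NODE = 32
GPUS_PER_NODE = 4


def aggregate_resources(nodes: int) -> tuple[int, int]:
    """Return (total CPU cores, total GPUs) for a given node count."""
    return nodes * CORES_PER_NODE, nodes * GPUS_PER_NODE


cores, gpus = aggregate_resources(450)
print(cores, gpus)  # 14400 1800
```

Under that assumption, 450 nodes give 14,400 CPU cores and 1,800 GPUs, which matches the totals quoted in the abstract.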


Paper Structure (21 sections, 10 figures, 1 table)


Introduction
Related Work
MOFs & Their Discovery
Heterogeneous Computing Workflows
MOFA Design
Abstract Formulation
Sequential MOF Generation
Workflow Policies
Executing MOFA
Policy Expression
Resource Allocation and Communication
Evaluation
Utilization of Heterogeneous Resources
Effect of Scale on Task Throughput
Ability to Find Stable MOFs
...and 6 more sections

Figures (10)

Figure 1: MOFA implements an online learning loop that refines a generative AI model, MOFLinker, using the MOFs it has generated. The initial steps in the workflow validate linker molecules produced by the generative model, and those that pass validation are used to assemble MOFs. New MOFs are placed in a LIFO queue, from which they are retrieved to be evaluated for stability; the most stable are then evaluated further to refine their structures and estimate properties of interest, including gas capacity. The structures and their computed properties are collected in a database and used to retrain the GenAI model. All steps run concurrently. Note: The width of the arrows for "Structures" corresponds to the number of structures passed between each pair of tasks in the workflow.
Figure 2: Task and resource allocation in the MOFA workflow. The top section shows the Colmena Thinker, containing seven agents (rounded-corner boxes), each corresponding to one of the seven tasks. The bottom section depicts five types of MOFA workers, each with a 32-core CPU and four GPUs, with distinct resource allocation schemata for different MOFA tasks.
Figure 3: Active time of compute nodes on Polaris, as measured by the average time each workflow worker spent processing work over one hour.
Figure 4: MOFA's utilization of Polaris compute nodes as a fraction of peak, which varies with the code running.
Figure 5: Sustained throughput in tasks per hour for the four main workflow stages as a function of system scale. The dashed lines indicate ideal scaling computed from the rates at the smallest node count.
...and 5 more figures
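The loop described in the Figure 1 caption can be sketched as a toy, sequential Python function. This is only an illustration under stated assumptions: it is not MOFA's implementation, the real steps run concurrently, and every name here (`run_loop`, `is_valid`, `stability`, `is_stable`, `retrain_every`) is hypothetical. The validation, stability, and retraining steps are placeholders supplied by the caller.

```python
from collections import deque


def run_loop(linker_batches, is_valid, stability, is_stable, retrain_every=2):
    """Toy sequential version of the validate/assemble/screen/retrain loop."""
    lifo = deque()       # newest MOFs are retrieved first
    database = []        # (mof, score) records that would feed retraining
    retrain_count = 0    # how many times the placeholder retraining fired
    since_retrain = 0    # stable MOFs collected since the last retraining

    for batch in linker_batches:
        # Validate generated linkers and "assemble" each valid one into a MOF.
        for linker in batch:
            if is_valid(linker):
                lifo.append(f"MOF({linker})")

        # Retrieve MOFs in LIFO order and keep only those judged stable.
        while lifo:
            mof = lifo.pop()
            score = stability(mof)
            if is_stable(score):
                database.append((mof, score))
                since_retrain += 1

        # Placeholder trigger: "retrain" once enough new records accumulate.
        if since_retrain >= retrain_every:
            retrain_count += 1
            since_retrain = 0

    return database, retrain_count


# Stand-in predicates: string length plays the role of a stability score.
db, retrains = run_loop(
    linker_batches=[["a", "", "bb"], ["ccc", "d"]],
    is_valid=lambda linker: len(linker) > 0,
    stability=len,
    is_stable=lambda score: score <= 7,
)
print([mof for mof, _ in db], retrains)
```

The `deque` is used as a stack, so the most recently assembled MOF is screened first, which mirrors the LIFO retrieval named in the caption. A real workflow would replace the stand-in predicates with simulation tasks and run the stages in parallel.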
