Accelerating Manufacturing Scale-Up from Material Discovery Using Agentic Web Navigation and Retrieval-Augmented AI for Process Engineering Schematics Design

Sakhinana Sagar Srinivas, Akash Das, Shivam Gupta, Venkataramana Runkana

TL;DR

The paper tackles the bottleneck of translating material discoveries into scalable industrial production by automating the generation of Process Flow Diagrams and Process and Instrumentation Diagrams with an autonomous agentic framework. It combines agentic web navigation for multimodal data gathering with Graph Retrieval-Augmented Generation to structure knowledge into ontological graphs that support regulation-compliant diagram autogeneration and open-domain QA. Empirical results on a large chemical dataset demonstrate the framework's ability to produce accurate PFDs/PIDs with limited expert input and to perform robust multi-hop reasoning for complex queries. Overall, the work offers a practical pathway to accelerate industrial deployment of novel materials through context-aware, automated process design tooling.

Abstract

Process Flow Diagrams (PFDs) and Process and Instrumentation Diagrams (PIDs) are critical tools for industrial process design, control, and safety. However, the generation of precise and regulation-compliant diagrams remains a significant challenge, particularly in scaling breakthroughs from material discovery to industrial production in an era of automation and digitalization. This paper introduces an autonomous agentic framework to address these challenges through a two-stage approach involving knowledge acquisition and generation. The framework integrates specialized sub-agents for retrieving and synthesizing multimodal data from publicly available online sources and constructs ontological knowledge graphs using a Graph Retrieval-Augmented Generation (Graph RAG) paradigm. These capabilities enable the automation of diagram generation and open-domain question answering (ODQA) tasks with high contextual accuracy. Extensive empirical experiments demonstrate the framework's ability to deliver regulation-compliant diagrams with minimal expert intervention, highlighting its practical utility for industrial applications.

Paper Structure

This paper contains 12 sections, 17 equations, 15 figures, 2 algorithms.

Figures (15)

  • Figure 1: The figure shows the PFD for the production of lithium hydroxide (LiOH) from lithium minerals [halim2022potential]. Lithium hydroxide is primarily used in lithium-ion battery production and high-purity chemical processes critical to semiconductor and electronics manufacturing, with a global market valued at $10 billion in 2024 and projected to grow significantly.
  • Figure 2: The figure shows the PID for the operation, maintenance, and troubleshooting of the atmospheric distillation unit [sotelo2017design], which separates crude oil into gasoline, naphtha, diesel, and gas oil, all essential for fuel production. The global refining market was valued at approximately $2 trillion in 2024, with projections to grow further.
  • Figure 3: We present a two-step approach to extract, aggregate, and structure knowledge for PFD and PID generation. In the first step, the autonomous agentic web navigation framework retrieves and synthesizes information from multiple online sources, enabling knowledge generation through chemical-specific web data aggregation. The second step involves two sub-steps: (a) extracting entity-relationship triples from the generated knowledge and populating an ontological knowledge graph, and (b) utilizing a Graph RAG framework to enable structured knowledge graph traversal and retrieval for complex ODQA tasks.
  • Figure 4: The LLM-based multi-agent framework for knowledge generation retrieves and synthesizes information from diverse online sources on PFDs and PIDs for well-known chemicals in large-scale industrial synthesis. A meta-agent orchestrates specialized sub-agents that retrieve and analyze data sources such as images, scholarly articles, patents, and Wikipedia. Each sub-agent, dedicated to a specific retrieval task, uses tools like SerpAPI for accessing search results and LLMs (e.g., OpenAI GPT-4o) for synthesizing information. An iterative feedback mechanism involving human experts and advanced Gold language models ensures the accuracy and quality of the generated knowledge. This structured, feedback-driven approach optimizes knowledge generation in specialized domains like chemical process design and control. Ultimately, the framework constructs an ontological knowledge graph within the Graph RAG framework, enhancing context-aware retrieval for both complex ODQA tasks related to chemical process diagrams and the autogeneration of PFDs and PIDs for novel chemicals.
  • Figure 5: The figure presents the evaluation results of the agentic web navigation framework in generating PFD and PID knowledge for chemical processes, benchmarked using NVIDIA Nemotron-4-340B-Reward model and scored on a scale from 0 to 4, where 0 represents the lowest and 4 the highest performance.
  • ...and 10 more figures
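
The second step described in the Figure 3 caption (populating a knowledge graph from entity-relationship triples, then traversing it to answer multi-hop questions) can be illustrated with a minimal sketch. This is not the authors' implementation: the triples, relation names, and the helper functions `build_graph` and `multi_hop_path` are hypothetical placeholders, and a plain breadth-first search stands in for whatever retrieval strategy the Graph RAG framework actually uses.

```python
from collections import defaultdict, deque

# Hypothetical (head, relation, tail) triples, for illustration only.
# They are not taken from the paper's dataset.
TRIPLES = [
    ("feed_mineral", "is_fed_to", "unit_A"),
    ("unit_A", "outputs_to", "unit_B"),
    ("unit_B", "outputs_to", "unit_C"),
    ("unit_C", "produces", "final_product"),
]


def build_graph(triples):
    """Store triples as an adjacency list: head -> [(relation, tail), ...]."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def multi_hop_path(graph, start, goal, max_hops=4):
    """Breadth-first search for a chain of triples linking start to goal.

    Returns the list of triples along the shortest chain, or None if the
    goal is not reachable within max_hops edges.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue  # do not expand beyond the hop budget
        for relation, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None


if __name__ == "__main__":
    kg = build_graph(TRIPLES)
    for triple in multi_hop_path(kg, "feed_mineral", "final_product") or []:
        print(triple)
```

In a retrieval-augmented setting, the chain of triples returned by such a traversal would typically be serialized into the prompt as supporting context for the language model; how the paper performs that step is not specified in this section.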