Flowsheet synthesis through hierarchical reinforcement learning and graph neural networks

Laura Stops, Roel Leenhouts, Qinghe Gao, Artur M. Schweidtmann

TL;DR

This work tackles automated flowsheet synthesis by framing it as a graph-based reinforcement learning problem. It introduces a hierarchical, hybrid actor-critic RL agent that processes flowsheet graphs with graph neural networks to produce unit operations and continuous design variables, enabling discrete, continuous, and combined decision-making. The method is demonstrated on a methyl acetate production case, showing rapid learning in discrete, continuous, and hybrid spaces and producing economically viable flowsheets with interpretable structure. The approach lays a foundation for scaling to larger action-state spaces and integration with process simulators, with future work focusing on reward shaping and solver integration to handle more complex flowsheets.

Abstract

Process synthesis is undergoing a disruptive transformation, accelerated by digitization and artificial intelligence. We propose a reinforcement learning algorithm for chemical process design based on a state-of-the-art actor-critic approach. The algorithm represents chemical processes as graphs and uses graph convolutional neural networks to learn from these process graphs. In particular, the graph neural networks are implemented within the agent architecture to process the states and make decisions. Moreover, we implement a hierarchical and hybrid decision-making process to generate flowsheets: unit operations are placed iteratively as discrete decisions, and the corresponding design variables are selected as continuous decisions. We demonstrate the potential of our method to design economically viable flowsheets in an illustrative case study comprising equilibrium reactions, azeotropic separation, and recycles. The results show quick learning in discrete, continuous, and hybrid action spaces. Owing to the flexible architecture of the proposed reinforcement learning agent, the method is well suited to handle large action-state spaces and to be interfaced with process simulators in future research.
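
The hierarchical, hybrid decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: uniform random sampling stands in for the learned actor networks, and the design-variable names, the normalized range [0, 1], and the choice that selling a stream needs no design variable are all assumptions made for illustration. The function name `sample_hierarchical_action` is hypothetical.

```python
import random

# Unit operations the agent may place. The design-variable names below are
# hypothetical labels chosen for this sketch, not taken from the source.
DESIGN_VARIABLE = {
    "heat_exchanger": "outlet_temperature",
    "reactor": "reactor_length",
    "column": "distillate_to_feed_ratio",
    "recycle": "recycle_ratio",
    "product": None,  # assumption: selling a stream needs no design variable
}


def sample_hierarchical_action(open_streams, rng):
    """Sample one action in three levels: location, unit operation, design variable.

    Uniform sampling is a stand-in for the learned policy at every level.
    """
    if not open_streams:
        raise ValueError("no open streams left; the flowsheet is complete")
    # Level 1 (discrete): choose the open stream where the flowsheet is extended.
    location = rng.choice(open_streams)
    # Level 2 (discrete): choose which unit operation to attach there.
    unit = rng.choice(list(DESIGN_VARIABLE))
    # Level 3 (continuous): which variable is sampled depends on the level-2 choice.
    name = DESIGN_VARIABLE[unit]
    value = rng.uniform(0.0, 1.0) if name is not None else None  # normalized, assumed
    return {"location": location, "unit": unit, "variable": name, "value": value}


rng = random.Random(0)
action = sample_hierarchical_action(["column_top", "column_bottom"], rng)
print(action)
```

The point of the structure is that the third decision is conditioned on the second, so a single action combines discrete and continuous components.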

Paper Structure

This paper contains 22 sections, 2 equations, 13 figures, 4 tables, 1 algorithm.

Figures (13)

  • Figure 1: Agent-environment interaction in an actor-critic policy optimization approach for flowsheet synthesis. The agent approximates the policy and makes decisions. Meanwhile, the critic estimates the value of the environment's state using the flowsheet graph, which is used to evaluate the agent's decisions. Here, actor and critic both deploy graph convolutional neural networks.
  • Figure 2: Example of a flowsheet displayed as a graph. Unit operations, feeds, and products are represented as nodes, whereas streams are represented as edges.
  • Figure 3: Hierarchical decision levels of the agent, starting from an intermediate flowsheet. In the first level, the agent selects a location where the flowsheet will be extended. Possible locations are open streams, represented by "undefined" nodes. In the presented flowsheet, both streams leaving the column can be chosen. In the second level, the agent selects a unit operation. The options are to add a heat exchanger, a reactor, a column, or a recycle, or to sell the stream as a product. In the third level, a continuous design variable is selected for the chosen unit operation. This third decision depends on which unit operation was selected in the second level.
  • Figure 4: Flowsheet fingerprint generation derived from Schweidtmann et al. (2020). The flowsheet graph is processed through an MPNN, using GCNs to perform message passing and update node embeddings. In the readout step, a pooling function is applied, resulting in a vector format, the flowsheet fingerprint.
  • Figure 5: Update of the node embeddings during the message passing phase in a graph convolutional layer. The considered node is marked in blue and its neighbors in yellow. First, the information stored in the neighboring nodes and the respective edges is processed and combined through a message function M. Then, a message is generated to update the information embedded in the considered node through the update function U. The approach and its illustration follow a method proposed by Schweidtmann et al. (2020).
  • ...and 8 more figures
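
The message passing and readout steps described in the captions of Figures 4 and 5 can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the architecture used in the paper: there are no learned weights, edge features are ignored, streams are treated as undirected, the message function M is taken to be the mean of the neighbor embeddings, the update function U averages a node's own embedding with the message, and the readout is sum pooling. The function names are hypothetical.

```python
def message_passing_step(features, edges):
    """One message-passing round over a graph given as node features and edges."""
    neighbours = [[] for _ in features]
    for src, dst in edges:
        # Assumption: streams are treated as undirected for aggregation.
        neighbours[src].append(dst)
        neighbours[dst].append(src)
    dim = len(features[0])
    updated = []
    for i, own in enumerate(features):
        if neighbours[i]:
            # Message function M (assumed): mean of the neighbor embeddings.
            message = [
                sum(features[j][k] for j in neighbours[i]) / len(neighbours[i])
                for k in range(dim)
            ]
        else:
            message = [0.0] * dim
        # Update function U (assumed): average of own embedding and message.
        updated.append([0.5 * (own[k] + message[k]) for k in range(dim)])
    return updated


def fingerprint(features, edges, rounds=2):
    """Run several message-passing rounds, then sum-pool nodes into one vector."""
    for _ in range(rounds):
        features = message_passing_step(features, edges)
    dim = len(features[0])
    return [sum(node[k] for node in features) for k in range(dim)]


# Toy flowsheet: feed -> reactor -> column -> product, one-hot node types.
nodes = [
    [1.0, 0.0, 0.0, 0.0],  # feed
    [0.0, 1.0, 0.0, 0.0],  # reactor
    [0.0, 0.0, 1.0, 0.0],  # column
    [0.0, 0.0, 0.0, 1.0],  # product
]
streams = [(0, 1), (1, 2), (2, 3)]
fp = fingerprint(nodes, streams)
print(fp)
```

The fingerprint has a fixed length regardless of how many unit operations the flowsheet contains, which is what allows a downstream network to consume flowsheets of varying size.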