Survey and Tutorial of Reinforcement Learning Methods in Process Systems Engineering

Maximilian Bloor, Max Mowbray, Ehecatl Antonio Del Rio Chanona, Calvin Tsay

TL;DR

The paper addresses the challenge of applying reinforcement learning (RL) to Process Systems Engineering (PSE) by providing a tutorial-style synthesis of methods (model-free, model-based, constrained, offline, distributional, and goal-conditioned) and surveying PSE applications across batch control, regulation, scheduling, and supply chains. It formalizes RL within the MDP/POMDP framework, discusses the limits of exact dynamic programming (DP), and then details a spectrum of RL techniques, from bandits to actor-critic methods, with emphasis on practicality for high-dimensional, safety-critical industrial settings. Key contributions include clarifying how to integrate RL with control theory (e.g., MPC, constrained MDPs (CMDPs), barrier functions), highlighting sample efficiency and safety as central concerns, and illustrating successful or promising applications in batch processes, regulatory control, and supply chain management. The paper thus guides researchers and practitioners toward data-efficient, safe, and domain-informed RL deployments in PSE, and outlines future directions such as offline RL, model-based planning, distributional risk-aware objectives, and goal-conditioned policies to handle multi-setpoint operations.

Abstract

Sequential decision making under uncertainty is central to many Process Systems Engineering (PSE) challenges, where traditional methods often face limitations related to controlling and optimizing complex and stochastic systems. Reinforcement Learning (RL) offers a data-driven approach to derive control policies for such challenges. This paper presents a survey and tutorial on RL methods, tailored for the PSE community. We deliver a tutorial on RL, covering fundamental concepts and key algorithmic families including value-based, policy-based, and actor-critic methods. Subsequently, we survey existing applications of these RL techniques across various PSE domains, such as fed-batch and continuous process control, process optimization, and supply chains. We conclude with a PSE-focused discussion of specialized techniques and emerging directions. By synthesizing the current state of RL algorithm development and its implications for PSE, this work identifies successes, challenges, and trends, and outlines avenues for future research at the interface of these fields.

Paper Structure

This paper contains 52 sections, 46 equations, 13 figures, 1 table.

Figures (13)

  • Figure 1: Illustrative grid world environments.
  • Figure 2: Probabilistic transition in the process control gridworld
  • Figure 3: Illustration of applications of RL in Process Systems Engineering
  • Figure 4: Reinforcement Learning Topology
  • Figure 5: Directed probability graph for multi-armed bandits (left) and MDPs (right). The bandit probability graph shows that the cost depends only on the chosen arm, whereas the MDP graph shows that the next state and the cost incurred depend on the control chosen, the current state, and the next state.
  • ...and 8 more figures