Optimizing Operation Recipes with Reinforcement Learning for Safe and Interpretable Control of Chemical Processes

Dean Brandner; Sergio Lucia

TL;DR

The paper tackles safe, data-efficient optimization of chemical-process operation under hard constraints by introducing recipe-based reinforcement learning (RL), which optimizes both operation recipes and their underlying linear PID controllers within an expert-structured framework. It defines a specialized RL environment in which the policy selects the next recipe parameter $\Theta_c$ and updates the recipe state $\boldsymbol{s}_{\text{R}}=[\boldsymbol{x},\boldsymbol{\Theta},c]$. Parameters are updated as $\boldsymbol{\Theta}^+ = \boldsymbol{\Theta} + \boldsymbol{i}_c \boldsymbol{\Theta}_c$, phase transitions are modeled by $\hat{f}_{\text{p,d}}$, and rewards are accumulated in $r_{\text{R}}$. Key contributions include formalizing the recipe-augmented RL framework, demonstrating faster convergence and improved constraint handling compared to direct RL and baseline recipes, and achieving near-optimal performance relative to nonlinear model predictive control (NMPC) on a challenging semi-batch polymerization reactor. The results suggest that embedding expert knowledge into the RL loop yields a practically viable, interpretable approach for industrial batch operations. Future work includes human-in-the-loop feedback, real-world deployment, and scalability to larger systems.
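The parameter update in the TL;DR can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that do not come from the summary: $\boldsymbol{i}_c$ is read as the $c$-th unit vector, the chosen parameter is treated as a scalar, $\boldsymbol{\Theta}$ starts at zero, and the counter $c$ advances by one per decision. The function name `update_recipe_state` and all numeric values are hypothetical, and the process state $\boldsymbol{x}$ is passed through unchanged because the phase-transition model is not reproduced here.

```python
def update_recipe_state(x, theta, c, theta_c):
    """Apply theta_plus = theta + i_c * theta_c to a recipe state [x, theta, c].

    Assumption: i_c is the c-th unit vector, so only entry c of theta changes.
    """
    i_c = [1.0 if k == c else 0.0 for k in range(len(theta))]
    theta_plus = [t + i * theta_c for t, i in zip(theta, i_c)]
    # Assumption: the counter c moves on to the next recipe parameter.
    # x is returned unchanged; the phase transition is not modeled here.
    return x, theta_plus, c + 1


# Hypothetical example: three recipe parameters chosen one after another.
x = [350.0, 0.0]            # placeholder process state
state = (x, [0.0, 0.0, 0.0], 0)
for action in [0.5, 2.0, -1.0]:
    state = update_recipe_state(*state, action)

print(state[1])  # recipe parameters after three decisions
print(state[2])  # counter after three decisions
```

Under these assumptions, each policy decision fills in exactly one entry of the recipe, which is one way to read the summary's claim that the structure keeps the learned result interpretable.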

Abstract

Optimal operation of chemical processes is vital for energy, resource, and cost savings in chemical engineering. The problem of optimal operation can be tackled with reinforcement learning, but traditional reinforcement learning methods face challenges due to hard constraints related to quality and safety that must be strictly satisfied, and the large amount of required training data. Chemical processes often cannot provide sufficient experimental data, and while detailed dynamic models can be an alternative, their complexity makes it computationally intractable to generate the needed data. Optimal control methods, such as model predictive control, also struggle with the complexity of the underlying dynamic models. Consequently, many chemical processes rely on manually defined operation recipes combined with simple linear controllers, leading to suboptimal performance and limited flexibility. In this work, we propose a novel approach that leverages expert knowledge embedded in operation recipes. By using reinforcement learning to optimize the parameters of these recipes and their underlying linear controllers, we achieve an optimized operation recipe. This method requires significantly less data, handles constraints more effectively, and is more interpretable than traditional reinforcement learning methods due to the structured nature of the recipes. We demonstrate the potential of our approach through simulation results of an industrial batch polymerization reactor, showing that it can approach the performance of optimal controllers while addressing the limitations of existing methods.
