Multi-Mode Process Control Using Multi-Task Inverse Reinforcement Learning

Runze Lin, Junghui Chen, Biao Huang, Lei Xie, Hongye Su

TL;DR

The paper addresses multi-mode process control, in which a single controller must operate across modes with differing dynamics and must be learned from offline closed-loop data. It proposes a context-conditioned multi-task inverse reinforcement learning (IRL) framework in which a latent context variable $z$ encodes mode information, and mode-specific rewards $r(x,u,z)$ and policies $\pi(u|x,z)$ are learned via maximum-entropy (MaxEnt) IRL and adversarial IRL (AIRL) with mutual-information regularization $I(z;\tau)$. The contributions are a formalization of context-conditioned multi-task IRL for process control, a practical three-network training procedure, and validation on a fed-batch bioreactor and a continuous stirred tank reactor (CSTR), which shows that mode-specific behavior can be recovered from offline data. The approach enables safe offline learning of mode-aware controller priors and supports rapid transfer to unseen modes with reduced environmental interaction, which eases practical deployment in industrial settings.
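
As a rough sketch of how these pieces typically fit together, the standard context-conditioned forms from the MaxEnt IRL and AIRL literature are written out below. They are not taken from the paper's own equations. The reward estimator $f_\theta$, the policy parameters $\omega$, the inference network $q_\psi(z|\tau)$, and the reading of the three networks as reward/discriminator, policy, and context encoder are all assumptions introduced here for illustration.

$$
\begin{aligned}
p_\theta(\tau \mid z) &\propto \exp\Big(\sum_t r_\theta(x_t, u_t, z)\Big) \quad \text{(up to dynamics terms)} \\
D_\theta(x, u, z) &= \frac{\exp f_\theta(x, u, z)}{\exp f_\theta(x, u, z) + \pi_\omega(u \mid x, z)} \\
I(z;\tau) &\ge H(z) + \mathbb{E}_{z \sim p(z),\, \tau \sim p(\tau \mid z)}\big[\log q_\psi(z \mid \tau)\big]
\end{aligned}
$$

The first line is the MaxEnt trajectory model conditioned on the mode, the second is the AIRL discriminator with $z$ added as an input, and the third is the usual variational lower bound on the mutual information. Under these assumptions, maximizing the bound keeps $z$ informative about which mode generated a trajectory, so the learned reward and policy are discouraged from collapsing to a single mode-averaged behavior.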

Abstract

In the era of Industry 4.0 and smart manufacturing, process systems engineering must adapt to digital transformation. While reinforcement learning offers a model-free approach to process control, its applications are limited by the dependence on accurate digital twins and well-designed reward functions. To address these limitations, this paper introduces a novel framework that integrates inverse reinforcement learning (IRL) with multi-task learning for data-driven, multi-mode control design. Using historical closed-loop data as expert demonstrations, IRL extracts optimal reward functions and control policies. A latent-context variable is incorporated to distinguish modes, enabling the training of mode-specific controllers. Case studies on a continuous stirred tank reactor and a fed-batch bioreactor validate the effectiveness of this framework in handling multi-mode data and training adaptable controllers.

Paper Structure

This paper contains 21 sections, 32 equations, 9 figures, 1 algorithm.

Figures (9)

  • Figure 1: Multi-task inverse reinforcement learning framework for designing multi-mode process control systems.
  • Figure 2: Flowchart of the proposed multi-task inverse reinforcement learning scheme.
  • Figure 3: Typical batch optimization profiles of the TRPO expert demonstrations (left: Mode 1 $k = 0.5$; right: Mode 2 $k = 0.7$).
  • Figure 4: Batch optimization profiles of the successfully trained multi-task IRL agent based on the TRPO expert demonstrations (left: Mode 1 $k = 0.5$; right: Mode 2 $k = 0.7$).
  • Figure 5: Sketch of the CSTR control system.
  • ...and 4 more figures