A Reduced Order Iterative Linear Quadratic Regulator (ILQR) Technique for the Optimal Control of Nonlinear Partial Differential Equations

Aayushman Sharma, Suman Chakravorty

TL;DR

The paper tackles optimal control of nonlinear PDEs by integrating POD-based model order reduction with ILQR in an iterative reduced-order loop (RO-ILQR). It formulates a reduced, time-varying linear model around the current trajectory and solves a time-varying reduced-order LQR problem to update the trajectory and the basis, repeating until convergence. The authors provide a convergence analysis showing that the method approaches a limit set determined by the truncation error, and they demonstrate significant computational savings on Burgers' and phase-field PDEs with performance competitive with full ILQR. Empirical results indicate that RO-ILQR reduces the dimensionality from n_x+n_u to n_α+n_u, with the open-loop cost within about 14% of the true optimum and notable speedups, while outperforming a deep RL baseline that struggles to converge on these problems.
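The time-varying reduced-order LQR step mentioned above can be illustrated with a standard finite-horizon Riccati backward pass. This is a minimal sketch, not the authors' implementation: it assumes a purely quadratic cost with no linear terms (which ILQR would normally carry), and the names `tv_lqr_backward`, `A_seq`, `B_seq`, `Q`, `R`, and `Q_T` are hypothetical. The matrices `A_seq[t]` and `B_seq[t]` stand for a reduced LTV model of size n_α by n_α and n_α by n_u.

```python
import numpy as np

def tv_lqr_backward(A_seq, B_seq, Q, R, Q_T):
    """Backward Riccati pass for x_{t+1} = A_t x_t + B_t u_t.

    Assumed cost (illustrative): sum_t (x_t' Q x_t + u_t' R u_t) + x_T' Q_T x_T.
    Returns the feedback gains K_t (u_t = -K_t x_t) and the cost-to-go matrix P_0.
    """
    T = len(A_seq)
    P = Q_T
    gains = [None] * T
    for t in reversed(range(T)):
        A, B = A_seq[t], B_seq[t]
        S = R + B.T @ P @ B                  # control Hessian at time t
        K = np.linalg.solve(S, B.T @ P @ A)  # optimal gain at time t
        P = Q + A.T @ P @ (A - B @ K)        # Riccati update
        P = 0.5 * (P + P.T)                  # keep P numerically symmetric
        gains[t] = K
    return gains, P
```

Because the pass only involves matrices of the reduced dimension, its per-step cost scales with n_α rather than n_x, which is where the savings described in the summary would come from under these assumptions.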

Abstract

In this paper, we introduce a reduced order model-based reinforcement learning (MBRL) approach, utilizing the Iterative Linear Quadratic Regulator (ILQR) algorithm for the optimal control of nonlinear partial differential equations (PDEs). The approach proposes a novel modification of the ILQR technique: it uses the Method of Snapshots to identify a reduced order Linear Time Varying (LTV) approximation of the nonlinear PDE dynamics around a current estimate of the optimal trajectory, utilizes the identified LTV model to solve a time-varying reduced order LQR problem to obtain an improved estimate of the optimal trajectory along with a new reduced basis, and iterates until convergence. The convergence behavior of the reduced order approach is analyzed, and the algorithm is shown to converge to a limit set that depends on the truncation error in the reduction. The proposed approach is tested on the viscous Burgers' equation and two phase-field models for microstructure evolution in materials, and the results show a significant reduction in the computational burden over the standard ILQR approach, without significantly sacrificing performance.
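To make the Method of Snapshots concrete, here is a minimal sketch of how a reduced basis could be extracted from trajectory snapshots. It is an illustration under stated assumptions rather than the paper's code: the function name `pod_basis_snapshots`, the snapshot matrix `X` (columns are states along the current trajectory), and the reduced dimension `n_alpha` are hypothetical, and the retained eigenvalues are assumed to be strictly positive.

```python
import numpy as np

def pod_basis_snapshots(X, n_alpha):
    """POD basis via the Method of Snapshots.

    X       : (n_x, n_s) snapshot matrix, one state snapshot per column.
    n_alpha : number of modes to keep (assumed <= rank of X).
    Returns Phi (n_x, n_alpha) with orthonormal columns and the kept eigenvalues.
    """
    # Eigendecompose the small n_s x n_s correlation matrix instead of the
    # large n_x x n_x one; this is the point of the method when n_s << n_x.
    C = X.T @ X
    eigvals, eigvecs = np.linalg.eigh(C)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_alpha]   # indices of the largest ones
    lam = eigvals[order]
    V = eigvecs[:, order]
    # Lift the small eigenvectors to full-state modes and normalize them.
    Phi = X @ V / np.sqrt(lam)
    return Phi, lam
```

A state `x` is then represented by reduced coordinates `alpha = Phi.T @ x` and reconstructed as `Phi @ alpha`. The discarded eigenvalues quantify the truncation error, which is the quantity the convergence result is stated in terms of.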

Paper Structure

This paper contains 35 sections, 5 theorems, 48 equations, 12 figures, 2 tables, 1 algorithm.

Key Result

Lemma 1

Under assumptions AA1 and AA3, given the same control sequence $\delta U = \{\delta u_t\}_{t=0}^{T-1}$, the costs of the full-order and reduced-order LQR problems (FO-LQR and RO-LQR) are close to each other, with a bound governed by the constant $\bar{C}_1 = 7(T+1)\bar{C}$ (Fig. 3).

Figures (12)

  • Figure 1: The initial trajectory (a)-(d) differs significantly from the final trajectory optimized by ILQR (i)-(l). Thus, the set of basis eigenfunctions for the initial trajectory (e)-(h) will only be valid locally, and differ significantly from the reduced order subspace that the optimal trajectory lies on (m)-(p). Hence, it is key to update the reduced-order basis with each successive iteration of the algorithm.
  • Figure 2: Relative performance of the full-order vs. reduced-order LTV systems identified w.r.t. the ground truth, benchmarked on the Allen-Cahn equation. The trajectory is perturbed with Gaussian noise with a standard deviation of 10% and 30% of the maximum control input.
  • Figure 3: The cost functions $\delta J(\cdot)$ and $\delta\hat{J}(\cdot)$ are close to each other given the same control input.
  • Figure 4: The solutions of the full and reduced order perturbed LQR problems are close to each other, under the specified assumptions.
  • Figure 5: The set $S_\infty$ is compact, and enclosed between sub-level sets $\underline{S}$ and $\bar{S}$.
  • ...and 7 more figures

Theorems & Definitions (13)

  • Remark 1
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • Remark 2
  • ...and 3 more