Dual MPC for Active Learning of Nonparametric Uncertainties

Tren Baltussen, Maurice Heemels, Alexander Katriniok

TL;DR

The paper tackles safe active learning for systems with nonparametric uncertainties by integrating a Gaussian process model within a dual model predictive control framework. It introduces an explicit dual effect through an active-learning MPC that conditions the GP covariance on the predicted control sequence, and ensures safety via a multi-horizon contingency MPC. The main contributions are the explicit dual MPC with active learning, a recursive feasibility proof via contingency horizons, and a numerical study on a nonlinear system showing improved learning and control performance while maintaining safety. The results indicate that the proposed approach can safely excite the system to learn unknown dynamics and achieve performance comparable to or better than baseline methods, with practical computation times. This advances safe data-efficient control for uncertain, safety-critical systems by marrying active learning with robust, constrained MPC.

Abstract

This manuscript presents a dual model predictive controller (MPC) that balances the two objectives of dual control, namely, system identification and control. In particular, we propose a Gaussian process (GP)-based MPC that uses the posterior GP covariance for active learning. The dual MPC can steer the system towards states with high covariance, or to the setpoint, thereby balancing system identification and control performance (exploration vs. exploitation). We establish robust constraint satisfaction of the novel dual MPC through a contingency plan. We demonstrate the dual MPC in a numerical study of a nonlinear system with nonparametric uncertainties.
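To make the role of the posterior GP covariance concrete, here is a minimal sketch of the property that active learning of this kind relies on: the posterior variance of a GP depends only on where data are collected, not on the measured values, so the effect of a planned input sequence on model uncertainty can be evaluated before any measurement is taken. The squared-exponential kernel, the hyperparameters, the function names, and the data points below are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between row-wise inputs A (n, d) and B (m, d)."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def posterior_variance(Z_train, Z_query, noise_var=1e-2):
    """GP posterior variance at Z_query; note that no training outputs are needed."""
    K = rbf_kernel(Z_train, Z_train) + noise_var * np.eye(len(Z_train))
    K_s = rbf_kernel(Z_train, Z_query)
    prior_var = np.diag(rbf_kernel(Z_query, Z_query))
    L = np.linalg.cholesky(K)
    V = np.linalg.solve(L, K_s)
    return prior_var - np.sum(V**2, axis=0)


def variance_after_plan(Z_train, Z_planned, Z_query, noise_var=1e-2):
    """Variance at Z_query if the planned points were added to the data set."""
    return posterior_variance(np.vstack([Z_train, Z_planned]), Z_query, noise_var)


# Hypothetical one-dimensional example: data exist near z = 0, the region of
# interest is z = 2.
Z_train = np.array([[0.0], [0.5]])
Z_query = np.array([[2.0]])

var_now = posterior_variance(Z_train, Z_query)
var_exploit = variance_after_plan(Z_train, np.array([[0.6]]), Z_query)  # stay near known data
var_explore = variance_after_plan(Z_train, np.array([[1.9]]), Z_query)  # visit the uncertain region

print(var_now, var_exploit, var_explore)
```

In this sketch the exploring plan reduces the predicted variance at the query point far more than the plan that stays near existing data. A dual controller can weigh this kind of predicted uncertainty reduction against tracking cost when choosing between exploration and exploitation.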

Paper Structure

This paper contains 23 sections, 3 theorems, 14 equations, 2 figures.

Key Result

Theorem 1

The MPC policy $u^{N\text{-}\mathrm{fb}}_k(\xi_{k \mid_{k}^{k+N}}) = \kappa(x_k, \hat{x}^*_{0 \mid k}) + \hat{u}^*_{0 \mid k}(\xi_{k \mid_{k}^{k+N}})$ is an $N$-measurement feedback policy that has the dual effect, as defined in the paper's definition of dual control.

Figures (2)

  • Figure 1: Schematic of the mass-spring-damper system.
  • Figure 2: Results of the simulation study. The black and red dashed lines indicate the MPC setpoint and the state constraints, respectively. The setpoint changes after 5 seconds, and the MPC has no preview of this change.

Theorems & Definitions (10)

  • Definition 1
  • Definition 2
  • Remark 1
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Remark 2
  • Theorem 3
  • proof