A PAC-Bayesian Framework for Optimal Control with Stability Guarantees

Mahrokh Ghoddousi Boroujeni, Clara Lucía Galimberti, Andreas Krause, Giancarlo Ferrari-Trecate

TL;DR

The paper addresses the risk of overfitting in stochastic nonlinear optimal control (SNOC) when generalization to out-of-sample disturbances is critical. It introduces a PAC-Bayes bound for SNOC with randomized predictors and derives a Gibbs-optimal posterior $\mathcal{Q}^*$, including a practical tuning of $\lambda$ and a transformed, bounded loss $\tilde{L}$ to enable meaningful guarantees. Stability is ensured by adopting an unconstrained stabilizing controller parametrization based on NeurSLS and RENs, with posterior sampling carried out via Stein Variational Gradient Descent (SVGD). Experiments on a simple LTI system and cooperative planar robots demonstrate improved generalization, effective incorporation of prior knowledge through informative priors, and scalability to high-parameter controllers.
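As a rough illustration of the Gibbs-type posterior mentioned above, the following sketch reweights a prior over a finite grid of controller parameters by $\exp(-\lambda \cdot \text{empirical cost})$. The grid, the cost values, and the function name `gibbs_posterior` are hypothetical and chosen only for illustration; the paper works with continuous parameter spaces and a transformed, bounded loss.

```python
import math

def gibbs_posterior(prior, empirical_cost, lam):
    """Return q_i proportional to prior_i * exp(-lam * cost_i) on a finite grid."""
    # Work in log space and subtract the maximum for numerical stability.
    log_w = [
        (math.log(p) if p > 0 else -math.inf) - lam * c
        for p, c in zip(prior, empirical_cost)
    ]
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    z = sum(w)
    return [v / z for v in w]

# Hypothetical grid of five candidate gains with made-up empirical costs.
prior = [0.2, 0.2, 0.2, 0.2, 0.2]   # uniform prior over the grid
costs = [4.0, 2.5, 1.0, 1.5, 3.0]   # illustrative empirical costs
posterior = gibbs_posterior(prior, costs, lam=2.0)
```

With $\lambda = 0$ the posterior equals the prior, and as $\lambda$ grows the mass concentrates on the grid point with the lowest empirical cost, which is the trade-off that the tuning of $\lambda$ controls.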

Abstract

Stochastic Nonlinear Optimal Control (SNOC) involves minimizing a cost function that averages out the random uncertainties affecting the dynamics of nonlinear systems. For tractability reasons, this problem is typically addressed by minimizing an empirical cost, which represents the average cost across a finite dataset of sampled disturbances. However, this approach raises the challenge of quantifying the control performance against out-of-sample uncertainties. Particularly, in scenarios where the training dataset is small, SNOC policies are prone to overfitting, resulting in significant discrepancies between the empirical cost and the true cost, i.e., the average SNOC cost incurred during control deployment. Therefore, establishing generalization bounds on the true cost is crucial for ensuring reliability in real-world applications. In this paper, we introduce a novel approach that leverages PAC-Bayes theory to provide rigorous generalization bounds for SNOC. Based on these bounds, we propose a new method for designing optimal controllers, offering a principled way to incorporate prior knowledge into the synthesis process, which aids in improving the control policy and mitigating overfitting. Furthermore, by leveraging recent parametrizations of stabilizing controllers for nonlinear systems, our framework inherently ensures closed-loop stability. The effectiveness of our proposed method in incorporating prior knowledge and combating overfitting is shown by designing neural network controllers for tasks in cooperative robotics.
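The gap between empirical and true cost can be illustrated with a minimal sketch. It assumes a hypothetical scalar linear system $x_{t+1} = a x_t + u_t + w_t$ with static feedback $u_t = -k x_t$ and a quadratic cost; none of these choices come from the paper. The gain is selected on a small training set of disturbance sequences and then evaluated on a large held-out set that stands in for the true cost.

```python
import random

def rollout_cost(k, w_seq, a=0.9, x0=1.0, r=0.1):
    """Quadratic cost of one closed-loop rollout under u = -k * x."""
    x, cost = x0, 0.0
    for w in w_seq:
        u = -k * x
        cost += x * x + r * u * u
        x = a * x + u + w
    return cost

def empirical_cost(k, dataset):
    """Average rollout cost over a finite set of disturbance sequences."""
    return sum(rollout_cost(k, w) for w in dataset) / len(dataset)

def sample_dataset(rng, s, horizon=10, sigma=0.5):
    """Draw s i.i.d. Gaussian disturbance sequences (an assumed noise model)."""
    return [[rng.gauss(0.0, sigma) for _ in range(horizon)] for _ in range(s)]

rng = random.Random(0)
gains = [i / 20 for i in range(31)]      # candidate gains 0.0, 0.05, ..., 1.5
train = sample_dataset(rng, s=4)         # small training set
test = sample_dataset(rng, s=5000)       # large set approximating the true cost
k_emp = min(gains, key=lambda k: empirical_cost(k, train))
gap = empirical_cost(k_emp, test) - empirical_cost(k_emp, train)
```

Rerunning with a larger `s` for the training set is a simple way to see how the gap between the two costs depends on the dataset size.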

Paper Structure

This paper contains 15 sections, 4 theorems, 17 equations, 3 figures, 2 tables.

Key Result

Theorem 1

Consider the noise distribution $\mathcal{D}_{T:0}$, a dataset $\mathbb{S} \sim \mathcal{D}_{T:0}^s$, a prior distribution $\mathcal{P}$ independent of $\mathbb{S}$, and the finite-horizon (FH) cost $L$. Under the paper's assumption, for every $\lambda>0$, confidence level $\delta \in (0,1)$, and posterior distribution $\mathcal{Q}$, [...] holds with probability at least $1-\delta$ over simultaneously sampling $\mathbb{S} \sim \mathcal{D}_{T:0}^s$ [...]
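For orientation, single-draw PAC-Bayes bounds of this kind typically take the shape below. This is a sketch of the generic form from the PAC-Bayes literature, not the paper's exact statement: it assumes i.i.d. samples and a cost taking values in $[0, C]$, and the symbols $C$ and $\hat{\mathcal{L}}(\theta, \mathbb{S})$ (the empirical cost over the $s$ samples) are notation introduced here. With probability at least $1-\delta$ over $\mathbb{S} \sim \mathcal{D}_{T:0}^s$ and $\theta \sim \mathcal{Q}$:

```latex
\mathcal{L}(\theta)
  \;\le\;
  \hat{\mathcal{L}}(\theta, \mathbb{S})
  + \frac{\lambda C^2}{8 s}
  + \frac{1}{\lambda}
    \left[
      \ln \frac{d\mathcal{Q}}{d\mathcal{P}}(\theta)
      + \ln \frac{1}{\delta}
    \right]
```

The second term grows with $\lambda$ while the third shrinks, which is why the choice of $\lambda$ matters for how tight the bound is.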

Figures (3)

  • Figure 1: Discretized PDF for the prior distributions (left) and the optimal posterior distributions with $s=8$ (middle) and $s=512$ (right). The top and bottom rows correspond to $\mathcal{P}_\mathcal{U}$ and $\mathcal{P}_\mathcal{N}$, respectively. Horizontal and vertical axes in each plot represent $\beta$ and $k$, while color indicates the PDF. The empirical and benchmark controllers are marked.
  • Figure 2: Comparison of the true cost, $\mathcal{L}$, and the upper bound for $\mathcal{Q}^*$ for various configurations as a function of $s$. Colors denote $\delta$ and prior distribution choices. The true cost in each setup is approximated for $10$ vectors $\theta$ sampled from $\mathcal{Q}^*$, shown as vertically aligned circles.
  • Figure 3: Closed-loop test trajectories in the interval $[0,400]$ of (a) the pre-stabilized system before training; (b)-(c) the trained empirical controller; and (d)-(f) one sampled controller from $\mathcal{Q}^*$ using SVGD. Training initial conditions are marked with $\circ$. Snapshots are taken at instants $\tau$ indicated in each plot. Colored balls represent the agents, with their radius indicating their size for collision avoidance.
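Figure 3 refers to controllers sampled from $\mathcal{Q}^*$ with SVGD. The sketch below shows the basic SVGD particle update on a hypothetical one-dimensional target: a standard normal prior multiplied by $\exp(-\lambda(\theta-1)^2)$ with $\lambda = 2$. The target, the fixed RBF bandwidth, and the step size are assumptions made for illustration and are not the paper's setup. This toy target is Gaussian with mean $0.8$, which gives a simple check on the particle mean.

```python
import math

def grad_log_target(theta, lam=2.0):
    # Toy Gibbs-type target (hypothetical): standard normal prior times
    # exp(-lam * cost), with cost(theta) = (theta - 1)^2.
    return -theta - 2.0 * lam * (theta - 1.0)

def svgd_step(particles, step=0.05, bandwidth=0.5):
    """One SVGD update with an RBF kernel of fixed bandwidth."""
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-((xj - xi) ** 2) / (2.0 * bandwidth ** 2))
            # First term: kernel-weighted score at xj (pulls toward high density).
            # Second term: kernel gradient w.r.t. xj (pushes particles apart).
            phi += k * grad_log_target(xj) + k * (xi - xj) / bandwidth ** 2
        updated.append(xi + step * phi / n)
    return updated

# Deterministic initial particles spread over [-2, 2].
particles = [-2.0 + 4.0 * i / 19 for i in range(20)]
for _ in range(1000):
    particles = svgd_step(particles)
mean = sum(particles) / len(particles)
```

The repulsive kernel-gradient term is what keeps the particles from collapsing onto the mode, so the final set approximates a spread of plausible parameters rather than a single point estimate.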

Theorems & Definitions (5)

  • Theorem 1: Adapted from Theorem 2.7 in userfriendly
  • Corollary 1: Lemma 1.1.3 in Catoni
  • Proposition 1
  • Proposition 2
  • Proof
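The lemma cited for Corollary 1 is commonly stated as the Donsker-Varadhan variational formula. Assuming that is the result used here, it explains why a Gibbs distribution is the optimal posterior: for a prior $\mathcal{P}$ and a measurable function $h$ with $\mathbb{E}_{\theta \sim \mathcal{P}}[e^{h(\theta)}] < \infty$ (the symbols $h$ and $\hat{\mathcal{L}}$ are notation introduced for this sketch),

```latex
\ln \mathbb{E}_{\theta \sim \mathcal{P}}\!\left[ e^{h(\theta)} \right]
  = \sup_{\mathcal{Q}}
    \left\{
      \mathbb{E}_{\theta \sim \mathcal{Q}}[h(\theta)]
      - \mathrm{KL}(\mathcal{Q} \,\|\, \mathcal{P})
    \right\},
\qquad
\text{attained at }
\frac{d\mathcal{Q}^*}{d\mathcal{P}}(\theta)
  = \frac{e^{h(\theta)}}{\mathbb{E}_{\theta' \sim \mathcal{P}}\!\left[ e^{h(\theta')} \right]} .

% Setting h(\theta) = -\lambda \hat{\mathcal{L}}(\theta, \mathbb{S}) and dividing by -\lambda:
\min_{\mathcal{Q}}
  \left\{
    \mathbb{E}_{\theta \sim \mathcal{Q}}\!\left[ \hat{\mathcal{L}}(\theta, \mathbb{S}) \right]
    + \frac{1}{\lambda} \mathrm{KL}(\mathcal{Q} \,\|\, \mathcal{P})
  \right\}
  = -\frac{1}{\lambda}
    \ln \mathbb{E}_{\theta \sim \mathcal{P}}\!\left[ e^{-\lambda \hat{\mathcal{L}}(\theta, \mathbb{S})} \right]
```

The minimizer is the Gibbs distribution $d\mathcal{Q}^*/d\mathcal{P} \propto e^{-\lambda \hat{\mathcal{L}}}$, so the data-dependent part of a PAC-Bayes bound of this form is smallest at the Gibbs posterior.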