On Convex Data-Driven Inverse Optimal Control for Nonlinear, Non-stationary and Stochastic Systems

Emiland Garrabe; Hozefa Jesawada; Carmen Del Vecchio; Giovanni Russo

TL;DR

We present a result that enables cost reconstruction by solving an optimization problem that remains convex even when the agent cost is non-convex and the underlying dynamics are nonlinear, non-stationary and stochastic.

Abstract

This paper considers a finite-horizon inverse control problem whose goal is to reconstruct, from observations, the possibly non-convex and non-stationary cost driving the actions of an agent. In this context, we present a result that enables cost reconstruction by solving an optimization problem that is convex even when the agent cost is not and when the underlying dynamics are nonlinear, non-stationary and stochastic. To obtain this result, we also study a finite-horizon forward control problem that has randomized policies as decision variables. We turn our findings into algorithmic procedures and validate them in silico and on hardware; all experiments confirm the effectiveness of our approach.

Paper Structure (17 sections, 4 theorems, 52 equations, 5 figures, 1 table, 2 algorithms)

Introduction
Contributions
Mathematical Background
Problems Set-up
The Forward Control Problem
The Inverse Control Problem
Main Results
Tackling the Forward Control Problem
Turning Theorem \ref{thm:prob_main} into an Algorithm
Tackling the Inverse Control Problem
Turning Corollary \ref{clry: estimator} into an Algorithm
Special Case
Application Example
Conclusions
Appendix
...and 2 more sections

Key Result

Lemma 1

Let $\mathbf{V}$ and $\mathbf{Z}$ be two random variables and let $f(\mathbf{v},\mathbf{z})$ and $g(\mathbf{v},\mathbf{z})$ be two joint pfs. Then:

Figures (5)

Figure 1: Target pendulum angular position and corresponding control input. Results obtained when: (i) pfs are discrete, estimated via the histogram filter (left panels); (ii) pfs are estimated via Gaussian Processes (right panels). Panels obtained from $20$ simulations. Bold lines represent the mean and the shaded region is the confidence interval corresponding to the standard deviation.
Figure 2: Angular position and control input of the target pendulum when the pf is estimated via the histogram filter (left and middle panels) and Gaussian Processes (right panels). Figures obtained from $20$ simulations, using ${c}^{\star}(\cdot)$ as an input to Algorithm \ref{alg:main}. Bold lines represent the mean; the shaded region is the confidence interval corresponding to the standard deviation.
Figure 3: Top-left: original cost function. The other panels show the cost reconstructed via: Algorithm \ref{alg:estimator} (top-right), MaxEnt (bottom-left) and IHMCE (bottom-right).
Figure 4: Top-left: robot trajectories starting from different initial positions ($\star$) when the policy in \ref{eqn:gaussian_policy} - \ref{eqn:gaussian_policy_recursion} is used (with $N=1$). Top-right: the $\mathbf{o}_i$'s together with the weights obtained via Algorithm \ref{alg:estimator}. Bottom: reconstructed cost (left) and robot trajectories when Algorithm \ref{alg:main} is used with this cost. The robot starts from initial positions that are different from those in the top panel.
Figure 5: Top-left: cost for the FOC problem. Top-right: robot trajectories when the policy from Algorithm \ref{alg:main} is used (same initial positions and destination as in Scenario $1$). Bottom panels: cost reconstructed via Algorithm \ref{alg:estimator} (left) and robot trajectories when Algorithm \ref{alg:main} is used with the estimated cost. Robots start from initial positions that are different from those in the top panel.

Theorems & Definitions (16)

Lemma 1
Remark 1
Remark 2
Remark 3
Remark 4
Remark 5
Theorem 1
Remark 6
Remark 7
Remark 8
...and 6 more
