Estimating unknown dynamics and cost as a bilinear system with Koopman-based Inverse Optimal Control

Victor Nan Fernandez-Ayala, Shankar A. Deka, Dimos V. Dimarogonas

TL;DR

This work tackles learning unknown dynamics and cost functions by casting nonlinear control systems as separable bilinear Koopman models learned with a modified EDMDc, achieving exact dynamical equivalence with the original system. By deriving Pontryagin's Maximum Principle for the bilinear model, the IOC problem reduces to a Bi-LQR framework that is more tractable than traditional nonlinear IOC methods. The paper establishes conditions for dynamical and cost identifiability, develops a Bilinear EDMDc and inverse PMP procedure to estimate the lifted dynamics and quadratic cost $Q$, and demonstrates effectiveness through theory, simulations, and a robotic experiment. The approach offers a robust, data-driven pathway for estimating unknown dynamics and costs in robotics and human motion prediction, enabling short-horizon prediction and informed control design without requiring a fully known nonlinear model.

Abstract

In this work, we address the challenge of approximating unknown system dynamics and costs by representing them as a bilinear system using Koopman-based Inverse Optimal Control (IOC). Using optimal trajectories, we construct a bilinear control system in transformed state variables through a modified Extended Dynamic Mode Decomposition with control (EDMDc) that maintains exact dynamical equivalence with the original nonlinear system. We derive Pontryagin's Maximum Principle (PMP) optimality conditions for this system, which closely resemble those of the inverse Linear Quadratic Regulator (LQR) problem due to the consistent control input and state independence from the control. This similarity allows us to apply modified inverse LQR theory, offering a more tractable and robust alternative to nonlinear Inverse Optimal Control methods, especially when dealing with unknown dynamics. Our approach also benefits from the extensive analytical properties of bilinear control systems, providing a solid foundation for further analysis and application. We demonstrate the effectiveness of the proposed method through theoretical analysis, simulation studies and a robotic experiment, highlighting its potential for broader applications in the approximation and design of control systems.
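As a rough illustration of the regression behind a bilinear EDMDc-style fit, the sketch below assumes a discrete-time lifted model of the form $z_{k+1} = A z_k + \sum_{i=1}^{m} u_{k,i} B_i z_k$ and recovers $A$ and $B_i$ by ordinary least squares from snapshot data. The model form, the function name `fit_bilinear_edmdc`, and the synthetic data are assumptions made for this example; the paper's modified EDMDc may differ in its dictionary, structure, and constraints.

```python
import numpy as np


def fit_bilinear_edmdc(Z, U, Z_next):
    """Least-squares fit of z+ ~ A z + sum_i u_i B_i z (assumed model form).

    Z:      (N, K) lifted states z_k
    U:      (m, K) control inputs u_k
    Z_next: (N, K) lifted successor states z_{k+1}
    Returns A of shape (N, N) and a list of m matrices B_i of shape (N, N).
    """
    N, K = Z.shape
    m = U.shape[0]
    # Bilinear regressors u_i * z stacked below z: shape ((m + 1) * N, K).
    UZ = (U[:, None, :] * Z[None, :, :]).reshape(m * N, K)
    Phi = np.vstack([Z, UZ])
    # Solve G @ Phi ~ Z_next for G = [A, B_1, ..., B_m].
    G = np.linalg.lstsq(Phi.T, Z_next.T, rcond=None)[0].T
    A = G[:, :N]
    B = [G[:, (i + 1) * N:(i + 2) * N] for i in range(m)]
    return A, B


# Synthetic check: data generated by a known bilinear system (hypothetical values).
rng = np.random.default_rng(0)
N, m, K = 4, 2, 200
A_true = 0.5 * rng.standard_normal((N, N))
B_true = [0.1 * rng.standard_normal((N, N)) for _ in range(m)]
Z = rng.standard_normal((N, K))
U = rng.standard_normal((m, K))
Z_next = A_true @ Z + sum(U[i] * (B_true[i] @ Z) for i in range(m))

A_hat, B_hat = fit_bilinear_edmdc(Z, U, Z_next)
print("A recovered:", np.allclose(A_hat, A_true))
print("B recovered:", all(np.allclose(bh, bt) for bh, bt in zip(B_hat, B_true)))
```

In practice the columns of `Z` would come from applying a chosen dictionary of observables to measured states, and the fit is exact only when the data are noise-free and the lifted dynamics really are bilinear in the chosen coordinates.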

Paper Structure

This paper contains 25 sections, 10 theorems, 67 equations, 8 figures, 1 algorithm.

Key Result

Lemma 1

If $M(T-2)m \geq N(N + 1)/2$ and $\mathscr{A}(z)\mathscr{D}$ has full column rank, then the $Q \in S^n_+$ that corresponds to the given optimal trajectories $z_{0: T-1}^{(1: M)}$ is unique.
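The two hypotheses of Lemma 1 can be checked numerically once the data are in hand. The sketch below is a minimal illustration: the helper names are hypothetical, and because $\mathscr{A}(z)\mathscr{D}$ is not defined in this summary, the rank check takes an arbitrary placeholder matrix standing in for it. The right-hand side $N(N+1)/2$ is the number of independent entries of a symmetric $N \times N$ matrix.

```python
import numpy as np


def enough_data(M, T, m, N):
    """Counting condition of Lemma 1: M (T - 2) m >= N (N + 1) / 2."""
    return M * (T - 2) * m >= N * (N + 1) // 2


def has_full_column_rank(mat):
    """True if the columns of `mat` are linearly independent."""
    mat = np.asarray(mat, dtype=float)
    return np.linalg.matrix_rank(mat) == mat.shape[1]


# Hypothetical sizes: 3 trajectories of length 10, 2 inputs, lifted dimension 6.
print(enough_data(M=3, T=10, m=2, N=6))   # 48 >= 21
# Too little data: 1 trajectory of length 4, 1 input, lifted dimension 3.
print(enough_data(M=1, T=4, m=1, N=3))    # 2 >= 6 fails

# Placeholder matrices standing in for A(z) D from the lemma.
print(has_full_column_rank(np.eye(3)[:, :2]))
print(has_full_column_rank(np.ones((3, 2))))
```

Both checks must pass for the lemma to guarantee a unique $Q$; the counting condition alone is not sufficient.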

Figures (8)

  • Figure 1: Comparison between inverse LQR and nonlinear IOC approaches. Unlike prior linearization approaches, our framework uses a common parameterization for both dynamics and cost, enabling tractable IOC even with unknown nonlinear dynamics.
  • Figure 2: Evolution of the system for Example \ref{ex:controllable}.
  • Figure 3: Evolution of the simulated robot teleoperated by a human.
  • Figure 4: (a) The TurtleBot3 robot used in the experiments. (b) Recorded trajectories of the TurtleBot3 robot during teleoperation.
  • Figure 5: Trajectory 26: Actual vs. predicted trajectory comparison.
  • ...and 3 more figures

Theorems & Definitions (28)

  • Remark 1
  • Remark 2
  • Example 1
  • Remark 3
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Definition 1
  • Lemma 3
  • ...and 18 more