Bi-Level-Based Inverse Stochastic Optimal Control

Philipp Karg, Manuel Hess, Balint Varga, Sören Hohmann

TL;DR

This paper proposes a new algorithm to solve the Inverse Stochastic Optimal Control (ISOC) problem of the linear-quadratic sensorimotor (LQS) control model and proves global convergence for the new algorithm.

Abstract

In this paper, we propose a new algorithm to solve the Inverse Stochastic Optimal Control (ISOC) problem of the linear-quadratic sensorimotor (LQS) control model. The LQS model represents the current state of the art in describing goal-directed human movements. The ISOC problem aims at determining the cost function and noise scaling matrices of the LQS model from measurement data, since both parameter types influence the statistical moments predicted by the model and are unknown in practice. We prove global convergence for our new algorithm and validate the theoretical assumptions of our method on a numerical example. Comprehensive simulations analyze the influence of the algorithm's tuning parameters on convergence behavior and computation time. The new algorithm computes ISOC solutions nearly 33 times faster than the only previously existing ISOC algorithm.
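To illustrate why the noise scaling matters for the predicted statistical moments, the following is a minimal sketch under strong simplifying assumptions, not the LQS model or the algorithm of the paper. It assumes full-state linear feedback $\bm{u}_t = -\bm{L}\bm{x}_t$ (the gain $\bm{L}$ stands in for the effect of the cost function), additive noise with covariance `Sigma_xi`, and a single scalar control-dependent noise source scaled by a matrix `C`. All names (`propagate_moments`, `A`, `B`, `L`, `C`, `Sigma_xi`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def propagate_moments(A, B, L, C, Sigma_xi, mean0, cov0, N):
    """Propagate mean and covariance of the assumed toy system
        x_{t+1} = A x_t + B u_t + xi_t + eps_t * B C u_t,   u_t = -L x_t,
    where xi_t ~ N(0, Sigma_xi) and eps_t ~ N(0, 1) are independent of x_t.
    This is an illustrative simplification, not the full LQS model.
    """
    A_cl = A - B @ L          # closed-loop dynamics of the mean
    M = B @ C @ L             # state-dependent factor of the control-dependent noise
    mean = mean0.copy()
    S = cov0 + np.outer(mean0, mean0)   # second moment E[x x^T]
    means, covs = [mean], [cov0]
    for _ in range(N):
        mean = A_cl @ mean
        # Cross terms vanish because xi_t and eps_t are zero-mean and independent of x_t.
        S = A_cl @ S @ A_cl.T + M @ S @ M.T + Sigma_xi
        means.append(mean)
        covs.append(S - np.outer(mean, mean))
    return np.array(means), np.array(covs)

# Hypothetical example values: a discretized double integrator.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
L = np.array([[1.0, 1.0]])
mean0 = np.array([1.0, 0.0])
cov0 = np.zeros((2, 2))
Sigma_xi = np.zeros((2, 2))

means_a, covs_a = propagate_moments(A, B, L, np.array([[0.0]]), Sigma_xi, mean0, cov0, 20)
means_b, covs_b = propagate_moments(A, B, L, np.array([[0.5]]), Sigma_xi, mean0, cov0, 20)
```

In this toy setting, changing `C` leaves the mean trajectory unchanged but alters the covariance, while changing `L` would alter both. This suggests why mean and covariance measurements together carry information about both parameter types.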

Paper Structure

This paper contains 12 sections, 7 theorems, 17 equations, 2 figures, 3 tables, 2 algorithms.

Key Result

Lemma 1

Let $\bm{R}$ and $\bm{\Sigma}^{\bm{\beta}}{\bm{\Sigma}^{\bm{\beta}}}^\intercal$ be positive definite. Furthermore, let the history of control and output values for the admissible control strategies $\bm{u}_t = \bm{\pi}_t(\bm{u}_0,\dots,\bm{u}_{t-1},\bm{y}_0,\dots,\bm{y}_{t-1})$ (cf. Problem problem: where $\bm{K}_t$ ($\forall t \in \{0,\dots,N-1\}$) are constant filter matrices of appropriate dime

Figures (2)

  • Figure 1: Validation of Assumptions \ref{assumption:J_ISOC_nonconvex} and \ref{assumption:J_ISOC_twcontdiff}.
  • Figure 2: Mean and covariance of position $p_x$ and velocity $\dot{p}_x$ of the human hand in the example system. Values obtained with ground-truth (GT) parameters, with parameters identified by the TRLwARoA algorithm, and with parameters identified by the method in Karg.2023a are shown.

Theorems & Definitions (23)

  • Lemma 1
  • proof
  • Theorem 1
  • proof
  • Corollary 1
  • proof
  • Definition 1
  • Remark 1
  • Definition 2
  • Definition 3
  • ...and 13 more