A Model-Free Optimal Control Method With Fixed Terminal States and Delay

Mi Zhou; Erik Verriest; Chaouki Abdallah

TL;DR

This paper proposes a new model-free algorithm that combines basis functions, gradient estimation, and the Lagrange method for optimality-condition-based control of state-dependent switched systems and time-delay systems.

Abstract

Model-free algorithms entered control systems research with the emergence of reinforcement learning. However, reinforcement learning-based methods face two practical challenges. First, learning by interacting with the environment is highly complex. Second, constraints on the states (boundary conditions) require additional care because the state trajectory is defined implicitly by the inputs and the system dynamics. To address these problems, this paper proposes a new model-free algorithm based on basis functions, gradient estimation, and the Lagrange method. The favorable performance of the proposed algorithm is shown on several examples with state-dependent switching and time delays.
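The three ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy plant $\dot{x} = -x + u$, the quadratic cost, the boundary values, the central finite-difference gradient estimator, and every name and parameter value below (`rollout`, `estimate_gradient`, `solve`, `rho`, `alpha`, and so on) are assumptions chosen only for illustration. The control is expanded in a Legendre basis, the gradient of an augmented Lagrangian is estimated purely from simulated rollouts, and a multiplier update enforces the fixed terminal state.

```python
import numpy as np
from numpy.polynomial import legendre

T, N = 1.0, 50            # horizon and number of Euler steps (assumed)
DT = T / N
T_GRID = np.linspace(0.0, T, N, endpoint=False)
X0, X_TARGET = 1.0, 0.0   # assumed initial and fixed terminal state


def control(theta):
    """u(t) = sum_k theta_k * P_k(2t/T - 1), a Legendre expansion."""
    return legendre.legval(2.0 * T_GRID / T - 1.0, theta)


def rollout(theta):
    """Black-box simulator: returns (cost J, terminal state x(T)).

    The toy plant x' = -x + u is a stand-in; the optimizer below only
    queries this function and never inspects the dynamics.
    """
    x, cost = X0, 0.0
    for u in control(theta):
        cost += (x * x + u * u) * DT   # running cost x^2 + u^2 (assumed)
        x += (-x + u) * DT             # forward Euler step
    return cost, x


def aug_lagrangian(theta, mu, rho):
    """J + mu * g + (rho / 2) * g^2 with g = x(T) - X_TARGET."""
    cost, x_final = rollout(theta)
    g = x_final - X_TARGET
    return cost + mu * g + 0.5 * rho * g * g


def estimate_gradient(theta, mu, rho, eps=1e-4):
    """Central finite differences built from rollouts only."""
    grad = np.zeros_like(theta)
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = eps
        grad[k] = (aug_lagrangian(theta + step, mu, rho)
                   - aug_lagrangian(theta - step, mu, rho)) / (2.0 * eps)
    return grad


def solve(m=4, rho=10.0, alpha=0.1, outer=20, inner=100):
    """Primal gradient descent on theta, dual ascent on the multiplier mu."""
    theta, mu = np.zeros(m), 0.0
    for _ in range(outer):
        for _ in range(inner):
            theta -= alpha * estimate_gradient(theta, mu, rho)
        _, x_final = rollout(theta)
        mu += rho * (x_final - X_TARGET)   # multiplier update
    return theta, mu


theta_opt, mu_opt = solve()
final_cost, final_state = rollout(theta_opt)
```

In this sketch the terminal constraint is handled entirely through the multiplier `mu` and the penalty weight `rho`, so the simulator can be replaced by any black-box rollout that returns a cost and a terminal state.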

Paper Structure (13 sections, 3 theorems, 23 equations, 8 figures, 3 tables)

This paper contains 13 sections, 3 theorems, 23 equations, 8 figures, 3 tables.

Introduction
Problem formulated
Proposed algorithm
Gradient estimation
Augmented Lagrangian method and Dual decomposition
Framework of proposed algorithm
Convergence analysis
MATLAB GUI toolbox
Illustrated examples
Example 1: a first order system
Example 2: State-dependent switched systems
Example 3: Time-delay systems
Conclusions

Key Result

Lemma 1

Under the stated assumptions, the iterates $(\theta_n, \mu_n)$ converge almost surely to a fixed point (possibly only a local one), which is a feasible solution.

Figures (8)

Figure 1: Framework of the proposed algorithm.
Figure 2: MATLAB GUI for the solver.
Figure 3: Example 1: (a) state $x(t)$ under different basis functions (magenta: Chebyshev; green: Legendre; blue: Fourier); (b) control input $u(t)$ under different basis functions (magenta: Chebyshev; green: Legendre; blue: Fourier); (c) cost $J$ with respect to number of basis functions used.
Figure 4: Example 1: Heatmap of the parameters after convergence using different bases: (a) Chebyshev ($m=4$, $\alpha=0.01$); (b) Legendre ($m=6$, $\alpha=0.01$); (c) Fourier ($m=4$, $\alpha=0.01$).
Figure 5: Example 2: (a) state $x_1(t)$ under different basis functions (magenta: Chebyshev; green: Legendre; blue: Fourier); (b) state $x_2(t)$; (c) control input $u(t)$ under different basis functions (magenta: Chebyshev; green: Legendre; blue: Fourier); (d) cost $J$ with respect to number of basis functions used.
...and 3 more figures

Theorems & Definitions (5)

Lemma 1
Definition 1: Epi-convergence
Lemma 2
Theorem 1
proof
