Reinforcement Learning of Structured Control for Linear Systems with Unknown State Matrix

Sayak Mukherjee, Thanh Long Vu

TL;DR

Addresses the problem of learning stabilizing, structured state-feedback gains for continuous-time LTI systems with an unknown state matrix $A$, under a prescribed sparsity/communication structure $\mathcal{K}$. Derives a modified algebraic Riccati equation (ARE) with gain $K = R^{-1} B^T P - L$, enforces the structure via $L = F(\phi(P))$ while ensuring stability, and then develops a model-free structured RL (SRL) algorithm that uses trajectory data and comes with convergence guarantees and a sub-optimality bound. Proves stability (Theorem 1), structure feasibility (Theorem 2), convergence of a Kleinman-like iteration (Theorem 3), and the sub-optimality bound (Theorem 4). Validates the framework on a 6-agent network, showing near-optimal performance with improved damping under structured constraints. The approach is general and scalable to distributed control in large-scale cyber-physical systems.
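
As a worked check of how the gain parameterization $K = R^{-1} B^T P - L$ can lead to a modified ARE, the sketch below substitutes it into the closed-loop Lyapunov equation. It assumes the standard LQR setting $\dot{x} = Ax + Bu$, $u = -Kx$ with cost weights $Q$ and $R$; the symbol $Q$, the cost form, and the symmetry of $P$ are assumptions not stated in this summary, and the paper's own equation may be written differently.

```latex
% Assumed setting (not stated in the summary): \dot{x} = Ax + Bu, u = -Kx,
% cost \int_0^\infty (x^T Q x + u^T R u)\,dt, with P = P^T and R = R^T \succ 0.
\begin{align*}
0 &= (A - BK)^T P + P(A - BK) + Q + K^T R K,
     \qquad K = R^{-1} B^T P - L \\
-K^T B^T P - P B K &= -2\, P B R^{-1} B^T P + L^T B^T P + P B L \\
K^T R K &= P B R^{-1} B^T P - P B L - L^T B^T P + L^T R L \\
\Rightarrow\quad 0 &= A^T P + P A + Q - P B R^{-1} B^T P + L^T R L
\end{align*}
```

With $L = 0$ this reduces to the standard ARE; under these assumptions, the extra term $L^T R L$ captures the departure from the unconstrained LQR gain.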

Abstract

This paper addresses the design of stabilizing feedback control gains for continuous-time linear systems with an unknown state matrix, in which the control is subject to a general structural constraint. We combine ideas from reinforcement learning (RL) with sufficient stability and performance guarantees to design these structured gains from trajectory measurements of states and controls. We first formulate a model-based framework using dynamic programming (DP) to embed the structural constraint into the Linear Quadratic Regulator (LQR) gain computation in the continuous-time setting. Subsequently, we transform this LQR formulation into a policy iteration RL algorithm that removes the requirement of a known state matrix while maintaining the feedback gain structure. Theoretical guarantees are provided for stability and convergence of the structured RL (SRL) algorithm. The introduced RL framework is general and can be applied to any control structure. A special control structure enabled by this RL framework is distributed learning control, which is necessary for many large-scale cyber-physical systems. We validate our theoretical results with numerical simulations on a multi-agent networked linear time-invariant (LTI) dynamic system.
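
To make the model-based step concrete, here is a minimal Python sketch of a Kleinman-style policy iteration in which the gain is masked to a sparsity pattern after each update. It is not the paper's algorithm: it uses a known $A$ (so it is not model-free), the elementwise masking stands in for $L = F(\phi(P))$ as an assumption, and the function names, the 2-state system, and the diagonal (decentralized) structure are all hypothetical. Convergence of this simplified loop is not guaranteed in general.

```python
import numpy as np

def solve_lyapunov(A_cl, M):
    """Solve A_cl^T P + P A_cl + M = 0 for P via Kronecker vectorization."""
    n = A_cl.shape[0]
    I = np.eye(n)
    # Row-major vec: vec(A_cl^T P) = kron(A_cl^T, I) vec(P),
    #                vec(P A_cl)   = kron(I, A_cl^T) vec(P).
    lhs = np.kron(A_cl.T, I) + np.kron(I, A_cl.T)
    P = np.linalg.solve(lhs, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def structured_policy_iteration(A, B, Q, R, mask, K0, iters=20):
    """Hypothetical structured Kleinman-like loop; K0 must be stabilizing."""
    K = K0.copy()
    for _ in range(iters):
        # Policy evaluation: Lyapunov equation for the current gain K.
        P = solve_lyapunov(A - B @ K, Q + K.T @ R @ K)
        # Policy improvement: unconstrained gain R^{-1} B^T P ...
        K_full = np.linalg.solve(R, B.T @ P)
        # ... minus L, the part that violates the structure (assumed masking).
        L = K_full * (1 - mask)
        K = K_full - L
    return P, K  # P is the evaluation of the gain before the last update

# Hypothetical 2-state example with a stable A, so K0 = 0 is stabilizing.
A = np.array([[-1.0, 0.5], [0.3, -2.0]])
B = np.eye(2)
Q = np.eye(2)
R = np.eye(2)
mask = np.eye(2)  # 1 = entry allowed; here each input sees only its own state
K0 = np.zeros((2, 2))

P, K = structured_policy_iteration(A, B, Q, R, mask, K0)
closed_loop_eigs = np.linalg.eigvals(A - B @ K)
print(K)
print(closed_loop_eigs)
```

In this example the masked gain stays diagonal by construction, and the printed closed-loop eigenvalues can be inspected to confirm that the structured gain is stabilizing for this particular system.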

Paper Structure

This paper contains 5 sections, 42 equations, 7 figures, 1 algorithm.

Figures (7)

  • Figure 1: An example of structured feedback for agent $1$
  • Figure 2: Scenario A: State trajectories during exploration and control implementation
  • Figure 3: Scenario A: P convergence
  • Figure 4: Scenario A: K convergence
  • Figure 5: Scenario B: State trajectories during exploration and control implementation
  • ...and 2 more figures