Model-free Resilient Controller Design based on Incentive Feedback Stackelberg Game and Q-learning

Jiajun Shen; Fengjun Li; Morteza Hashemi; Huazhen Fang

TL;DR

The paper designs a resilient leading controller that adapts to a compromised following controller so that the compromised follower acts cooperatively with the leader, aligning its strategies with the leader's objective to achieve a team-optimal solution.

Abstract

The rapid growth of Cyber-Physical Systems (CPSs) in intelligent environments, especially in industrial settings shaped by Industry 4.0, brings unprecedented security challenges. This paper studies the security of Industrial CPSs (ICPSs), focusing on intelligent attackers that can directly compromise the controller and thereby pose a direct risk to physical security. Within the framework of hierarchical control and the incentive feedback Stackelberg game, we design a resilient leading controller (leader) that adapts to a compromised following controller (follower) so that the compromised follower acts cooperatively with the leader, aligning its strategies with the leader's objective to achieve a team-optimal solution. First, we provide sufficient conditions for the existence of an incentive Stackelberg solution when the system dynamics are known. Then, we propose a Q-learning-based Approximate Dynamic Programming (ADP) approach, with corresponding algorithms, that computes the incentive Stackelberg solution online without prior knowledge of the system dynamics. Finally, we prove that the approach converges to the optimum.

Paper Structure (11 sections, 7 theorems, 60 equations, 2 algorithms)

Introduction
ICPS Security and Resilient Control
Incentive Feedback Stackelberg Game
Problem Formulation
Incentive Feedback Stackelberg Game with Known System Dynamics
Q-learning-based Approximate Dynamic Programming with Unknown Dynamics
Q-function for joint optimization problem
Online derivation of team-optimal solution
Convergence to the team-optimal solution
Follower's optimization problem
Conclusion

Key Result

Lemma 3.1

Given that the assumption on the strategy information structure is satisfied, the joint optimization problem admits a unique team-optimal solution $\{\pi_1^t, \pi_2^t\}$ with minimum cost $J_1^t = x_0^T P x_0$, where
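
The definition of $P$ is cut off in this excerpt, so the following is only an illustrative sketch and not the paper's construction. It assumes a scalar discrete-time system $x_{k+1} = a x_k + b_1 u_1 + b_2 u_2$ with stage cost $q x^2 + r_1 u_1^2 + r_2 u_2^2$, in which both controllers jointly minimize one cost, and it uses the standard discrete-time Riccati recursion for that assumed model. The model, all numeric values, and the function names are hypothetical. The sketch checks numerically that the accumulated closed-loop cost matches the quadratic form $p x_0^2$, the scalar analogue of $x_0^T P x_0$.

```python
# Illustrative only: scalar two-input LQ team problem (assumed model, not from the paper).
# Dynamics: x_next = a*x + b1*u1 + b2*u2
# Cost:     sum over k of q*x^2 + r1*u1^2 + r2*u2^2

def team_riccati_scalar(a, b1, b2, q, r1, r2, iters=200):
    """Fixed-point iteration of the scalar Riccati equation for the stacked input."""
    s = b1 ** 2 / r1 + b2 ** 2 / r2      # combined input effectiveness
    p = q                                # initial guess
    for _ in range(iters):
        p = q + a ** 2 * p / (1.0 + p * s)
    return p, s

def team_gains(a, b1, b2, r1, r2, p, s):
    """Linear feedback gains u_i = -k_i * x that jointly minimize the cost."""
    common = p * a / (1.0 + p * s)
    return (b1 / r1) * common, (b2 / r2) * common

def simulate_cost(a, b1, b2, q, r1, r2, k1, k2, x0, steps=200):
    """Accumulate the quadratic cost along the closed-loop trajectory."""
    x, total = x0, 0.0
    for _ in range(steps):
        u1, u2 = -k1 * x, -k2 * x
        total += q * x * x + r1 * u1 * u1 + r2 * u2 * u2
        x = a * x + b1 * u1 + b2 * u2
    return total

# Hypothetical parameters chosen only for illustration.
a, b1, b2, q, r1, r2, x0 = 1.2, 1.0, 0.5, 1.0, 1.0, 2.0, 2.0
p, s = team_riccati_scalar(a, b1, b2, q, r1, r2)
k1, k2 = team_gains(a, b1, b2, r1, r2, p, s)
cost = simulate_cost(a, b1, b2, q, r1, r2, k1, k2, x0)
print(abs(cost - p * x0 ** 2) < 1e-8)
```

In this assumed setting the open-loop system is unstable ($a > 1$), yet the joint feedback stabilizes it and the simulated cost agrees with $p x_0^2$ to numerical precision.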

Theorems & Definitions (16)

Definition 2.3
Remark 2.4
Lemma 3.1
proof
Lemma 3.2
proof
Theorem 3.3
proof
Lemma 4.1
proof
...and 6 more
