Model-free Resilient Controller Design based on Incentive Feedback Stackelberg Game and Q-learning

Jiajun Shen, Fengjun Li, Morteza Hashemi, Huazhen Fang

TL;DR

A resilient leading controller that is adaptive to a compromised following controller such that the compromised follower acts cooperatively with the leader, aligning its strategies with the leader’s objective to achieve a team-optimal solution.

Abstract

The rapid development of Cyber-Physical Systems (CPSs) in intelligent environments, especially in the industrial domain shaped by Industry 4.0, brings unprecedented security challenges. This paper examines the security of Industrial CPSs (ICPSs), focusing on intelligent attackers that can directly compromise the controller and thereby pose a direct risk to physical security. Within the framework of hierarchical control and an incentive feedback Stackelberg game, we design a resilient leading controller (leader) that adapts to a compromised following controller (follower) so that the compromised follower acts cooperatively with the leader, aligning its strategies with the leader's objective to achieve a team-optimal solution. First, we provide sufficient conditions for the existence of an incentive Stackelberg solution when the system dynamics are known. Then, we propose a Q-learning-based Approximate Dynamic Programming (ADP) approach, with corresponding algorithms, that solves for the incentive Stackelberg solution online without requiring prior knowledge of the system dynamics. Finally, we prove that our approach converges to the optimum.
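To make the notion of a team-optimal solution concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a scalar discrete-time linear system $x_{k+1} = a x_k + b_1 u_{1,k} + b_2 u_{2,k}$ with quadratic stage cost $q x_k^2 + r_1 u_{1,k}^2 + r_2 u_{2,k}^2$; the discrete-time setting, the scalar form, and all names (`team_optimal_scalar`, `simulate_cost`, `a`, `b1`, `b2`, `q`, `r1`, `r2`) are illustrative assumptions not taken from the source. Under these assumptions, a standard Riccati iteration with known dynamics yields a value $P$ such that the jointly minimized cost from $x_0$ is $P x_0^2$, the scalar analogue of $x_0^T P x_0$. The model-free Q-learning approach described in the abstract targets the case where the dynamics are not known.

```python
def team_optimal_scalar(a, b1, b2, q, r1, r2, iters=500):
    """Riccati value iteration for a hypothetical scalar two-controller LQ problem.

    Assumed model (not from the source): x' = a*x + b1*u1 + b2*u2,
    stage cost q*x^2 + r1*u1^2 + r2*u2^2, both inputs chosen jointly.
    Returns (P, k1, k2) with feedback laws u1 = -k1*x and u2 = -k2*x.
    """
    s = b1 * b1 / r1 + b2 * b2 / r2  # scalar value of B R^{-1} B^T
    P = 0.0
    for _ in range(iters):
        # Scalar form of P = Q + A^T (P^{-1} + B R^{-1} B^T)^{-1} A
        P = q + a * a * P / (1.0 + P * s)
    common = P * a / (1.0 + P * s)
    k1 = (b1 / r1) * common
    k2 = (b2 / r2) * common
    return P, k1, k2


def simulate_cost(a, b1, b2, q, r1, r2, k1, k2, x0, steps=200):
    """Accumulate the quadratic cost of the closed loop starting from x0."""
    x, cost = x0, 0.0
    for _ in range(steps):
        u1, u2 = -k1 * x, -k2 * x
        cost += q * x * x + r1 * u1 * u1 + r2 * u2 * u2
        x = a * x + b1 * u1 + b2 * u2
    return cost


if __name__ == "__main__":
    params = (1.2, 1.0, 0.5, 1.0, 1.0, 2.0)  # a, b1, b2, q, r1, r2 (made-up values)
    P, k1, k2 = team_optimal_scalar(*params)
    x0 = 2.0
    print("P * x0^2       :", P * x0 * x0)
    print("simulated cost :", simulate_cost(*params, k1, k2, x0))
```

Running the script prints the predicted cost $P x_0^2$ next to the cost accumulated by simulating the closed loop; the two should agree up to truncation of the horizon, which is a quick sanity check on the gains.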

Paper Structure (11 sections, 7 theorems, 60 equations, 2 algorithms)

Key Result

Lemma 3.1

Given that the assumption on the strategy information structure is satisfied, the joint optimization problem admits a unique team-optimal solution $\{\pi_1^t, \pi_2^t\}$ with minimum cost $J_1^t = x_0^T P x_0$, where

Theorems & Definitions (16)

  • Definition 2.3
  • Remark 2.4
  • Lemma 3.1
  • proof
  • Lemma 3.2
  • proof
  • Theorem 3.3
  • proof
  • Lemma 4.1
  • proof
  • ...and 6 more