Best Response Strategies for Asymmetric Sensing in Linear-Quadratic Differential Games

Shubham Aggarwal, Tamer Başar, Dipankar Maity

TL;DR

This paper addresses a two-player, infinite-horizon, zero-sum linear-quadratic (LQ) differential game with asymmetric sensing: Player P1 can sense the state only intermittently, while Player P2 senses it continuously. The authors derive the optimal estimator–controller pair for P1 under a stationary randomized sensing policy, showing that the control law is $u_1^*(t) = -R_1^{-1} B_1^\top P \hat{x}^*(t)$, with a separated estimator that updates at sensing times. They convert the continuous-time, age-of-information-based sensing problem into a countable-state MDP via time discretization and prove that the optimal sensing policy is threshold-based; to meet the sensing budget, they construct a randomized policy by mixing the policies obtained for two values of the Lagrange multiplier $\lambda$. Numerical simulations in a scalar setting support the theory: the cost decreases as the sensing discretization becomes finer and as the sensing budget increases, and the computed thresholds determine when sensing events occur. The framework provides strategies for intermittent sensing in competitive networked control scenarios and informs future work on equilibrium sensing strategies under asymmetric information.
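As a rough illustration of the control law quoted in the TL;DR, the sketch below specializes it to the scalar setting used in the simulations. The function names `p1_control` and `update_estimate` are hypothetical, the Riccati solution `p` is assumed to be supplied by the caller, and resetting the estimate to the sampled state at a sensing time is an assumption about how the estimator "updates"; the prediction between sensing times is left to the caller.

```python
def p1_control(p, r1, b1, x_hat):
    """Scalar form of u1 = -R1^{-1} B1^T P x_hat (p, r1, b1 are given scalars)."""
    return -(b1 * p / r1) * x_hat


def update_estimate(x_hat_pred, x_sample, sensed):
    """Assumed estimator update: adopt the sampled state when P1 senses,
    otherwise keep the caller-supplied prediction."""
    return x_sample if sensed else x_hat_pred


# Illustrative values only; they are not taken from the paper.
x_hat = update_estimate(x_hat_pred=0.3, x_sample=2.0, sensed=True)
u1 = p1_control(p=2.0, r1=4.0, b1=1.0, x_hat=x_hat)
```

The control depends on the state only through the estimate, which is what the separation between estimator and controller amounts to in this sketch.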

Abstract

In this paper, we revisit the two-player continuous-time infinite-horizon linear quadratic differential game problem, where one of the players can sample the state of the system only intermittently due to a sensing constraint while the other player can do so continuously. Under these asymmetric sensing limitations between the players, we analyze the optimal sensing and control strategies for the player at a disadvantage while the other player continues to play its security strategy. We derive an optimal sensor policy within the class of stationary randomized policies. Finally, using simulations, we show that the expected cost accrued by the first player approaches its security level as its sensing limitation is relaxed.

Paper Structure (10 sections, 3 theorems, 35 equations, 3 figures)

Key Result

Proposition 4.1

The $\beta$-optimal policy is of the form $\delta_{k} = \mathbb{I}[\Delta_k \geq \eta_{\beta, \lambda}]$ for some $\eta_{\beta, \lambda} \ge 0$ (aggarwal2023weighted).
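The threshold rule in Proposition 4.1, together with the randomized mixture mentioned in the TL;DR, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' construction: the age $\Delta_k$ is assumed to reset to zero after a sensing event and to grow by one per step otherwise, the budget is assumed to be a long-run fraction of steps at which sensing occurs, and the two thresholds are taken as given rather than derived from Lagrange multipliers. All function names are hypothetical.

```python
def threshold_policy(age, eta):
    """Return 1 (sense) when the age has reached the threshold, else 0."""
    return 1 if age >= eta else 0


def sensing_rate(eta, steps=10_000):
    """Empirical fraction of steps at which the threshold policy senses."""
    age, senses = 0, 0
    for _ in range(steps):
        delta = threshold_policy(age, eta)
        senses += delta
        # Assumed age dynamics: reset after sensing, otherwise grow by one step.
        age = 0 if delta else age + 1
    return senses / steps


def mixing_probability(eta_a, eta_b, budget):
    """Probability of using threshold eta_a so the mixed rate equals the budget.

    Under the assumed age dynamics, an integer threshold eta >= 0 senses
    once every eta + 1 steps, so its long-run rate is 1 / (eta + 1).
    """
    rate_a = 1.0 / (eta_a + 1)
    rate_b = 1.0 / (eta_b + 1)
    if not min(rate_a, rate_b) <= budget <= max(rate_a, rate_b):
        raise ValueError("budget must lie between the two sensing rates")
    if rate_a == rate_b:
        return 1.0
    return (budget - rate_b) / (rate_a - rate_b)
```

For example, `mixing_probability(1, 2, 0.4)` gives the weight on the threshold-1 policy for which the averaged sensing rate of the two deterministic policies matches a budget of 0.4.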

Figures (3)

  • Figure 1: Variation of the closed-loop cost $\bar{J}^h(\mu^S_r)$ vs $h$.
  • Figure 2: Variation of the closed-loop cost $\bar{J}^h(\mu^S_r)$ vs $b$.
  • Figure 3: Time evolution of the closed-loop state $x(t)$, estimation error $e(t)$ of player 1 and the control inputs $u_1(t),u_2(t)$ for both the players.

Theorems & Definitions (5)

  • Remark 2.1
  • Proposition 4.1
  • Theorem 1
  • proof
  • Proposition 4.2