When the Correct Model Fails: The Optimality of Stackelberg Equilibria with Follower Intention Updates

Cayetana Salinas-Rodriguez, Jonathan Rogers, Sarah H. Q. Li

TL;DR

This work examines dynamic Stackelberg games in which the leader does not know the follower's best response (BR) and can update its belief about that BR during the horizon. It develops both open-loop (OL) and feedback (FB) formulations under linear time-invariant (LTI) dynamics and analyzes how a BR belief update (two beliefs $b^1$ and $b^2$, with the update at time $\tau$) affects equilibrium optimality. It shows that holding the true BR belief does not always minimize the leader's total cost, because the OL equilibrium is time-inconsistent and the FB equilibrium may fail to be Markov perfect. The contributions include a sufficient condition for optimality of the open-loop Stackelberg equilibrium (OLSE) under BR updates, a discussion of the Markov perfect feedback Stackelberg equilibrium (MPFSE) for FB updates, and numerical linear quadratic (LQ) simulations with Bayesian BR estimation. The simulations reveal nontrivial advantages to incorrect BR beliefs in certain regimes, including collision-avoidance scenarios. The results have practical implications for designing adaptive, interactive autonomous systems, where intention estimation and belief updates must be balanced against cost trade-offs and time-consistency considerations.
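To make the idea of Bayesian BR estimation concrete, the following is a minimal sketch of a discrete Bayes update over two candidate follower BRs. It is an illustration under stated assumptions, not the paper's estimator: each candidate BR is assumed to be a linear feedback law $v = k x$ observed with Gaussian noise, and the names `update_belief`, `gains`, and `noise_std`, as well as all numeric values, are hypothetical.

```python
import math


def gaussian_pdf(z, mean, std):
    """Density of a scalar Gaussian N(mean, std^2) evaluated at z."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def update_belief(prior, gains, state, observed_action, noise_std=0.5):
    """One Bayes step over candidate follower BRs of the assumed form v = gain * state."""
    unnormalized = [
        p * gaussian_pdf(observed_action, k * state, noise_std)
        for p, k in zip(prior, gains)
    ]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]


# Two hypothetical BR candidates, standing in for the beliefs b^1 and b^2.
gains = [-0.2, -0.9]
belief = [0.5, 0.5]   # uniform prior over the two candidates
state = 2.0
for _ in range(5):
    action = gains[1] * state   # noise-free action generated by candidate 2
    belief = update_belief(belief, gains, state, action)
    state = state + action      # toy dynamics x_{t+1} = x_t + v_t (assumed)

print(belief)
```

In this toy setup the posterior mass shifts toward the candidate that generated the actions. Observations made near $x = 0$ carry little information, because both candidates then predict almost the same action.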

Abstract

We study a two-player dynamic Stackelberg game between a leader and a follower whose intention is unknown to the leader. Classical formulations of the Stackelberg equilibrium (SE) assume that the follower's best response (BR) function is known to the leader. However, this is not always true in practice. We study a setting in which the leader receives an updated belief about the follower's BR before the end of the game, such that the update prompts the leader, and subsequently the follower, to re-optimize their strategies. We characterize the optimality guarantees of the SE solutions under this belief update for both open-loop and feedback information structures. Interestingly, we prove that, in general, assuming an incorrect follower BR can yield a lower leader cost over the entire game than knowing the true follower BR. We support these results with numerical examples in a linear quadratic (LQ) Stackelberg game, and use Monte Carlo simulations to show that the instances in which an incorrect BR belief achieves a lower leader cost are non-trivial in collision-avoidance LQ Stackelberg games.
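The re-optimization described in the abstract can be sketched for a scalar open-loop LQ game as follows. This is a hedged illustration, not the paper's formulation: it assumes scalar dynamics $x_{t+1} = a x_t + b_L u_t + b_F v_t$, goal-tracking costs with unit weights for both players, and that the leader's BR belief is a guess of the follower's goal. All function names and numeric values are hypothetical.

```python
import numpy as np


def stacked_dynamics(a, b_lead, b_fol, horizon):
    """Return (phi, G_lead, G_fol) such that x_{1:T} = phi*x0 + G_lead@u + G_fol@v."""
    phi = np.array([a ** (t + 1) for t in range(horizon)])
    g = np.zeros((horizon, horizon))
    for t in range(horizon):
        for s in range(t + 1):
            g[t, s] = a ** (t - s)
    return phi, b_lead * g, b_fol * g


def follower_br(x0, u, goal, phi, g_lead, g_fol, q=1.0, r=1.0):
    """Follower's least-squares best response to an announced leader plan u."""
    free = phi * x0 + g_lead @ u - goal            # tracking error if v = 0
    h = q * g_fol.T @ g_fol + r * np.eye(len(u))
    return -np.linalg.solve(h, q * g_fol.T @ free)


def leader_plan(x0, believed_goal, leader_goal, phi, g_lead, g_fol,
                q=1.0, r=1.0, q_l=1.0, r_l=1.0):
    """Leader's open-loop plan, anticipating the BR implied by its belief."""
    n = len(phi)
    h = q * g_fol.T @ g_fol + r * np.eye(n)
    w = np.linalg.solve(h, q * g_fol.T)            # BR: v = -w @ (phi*x0 + G_lead@u - goal)
    closed = np.eye(n) - g_fol @ w                 # state map after substituting the BR
    a_u = closed @ g_lead
    c = closed @ (phi * x0) + g_fol @ w @ (believed_goal * np.ones(n))
    hess = q_l * a_u.T @ a_u + r_l * np.eye(n)
    return -np.linalg.solve(hess, q_l * a_u.T @ (c - leader_goal))


def play(x0, belief_pre, belief_post, true_goal, leader_goal, tau, horizon,
         a=1.0, b_lead=1.0, b_fol=1.0):
    """Simulate the game with one belief update at tau; return leader cost and states."""
    xs, cost, x = [x0], 0.0, x0
    for start, stop, belief in [(0, tau, belief_pre), (tau, horizon, belief_post)]:
        # Both players re-optimize over the remaining horizon at the start of each phase.
        phi, g_lead, g_fol = stacked_dynamics(a, b_lead, b_fol, horizon - start)
        u = leader_plan(x, belief, leader_goal, phi, g_lead, g_fol)
        v = follower_br(x, u, true_goal, phi, g_lead, g_fol)   # follower uses its true goal
        for t in range(stop - start):
            x = a * x + b_lead * u[t] + b_fol * v[t]
            cost += (x - leader_goal) ** 2 + u[t] ** 2         # unit leader weights (assumed)
            xs.append(x)
    return cost, np.array(xs)


# Compare a correct and an incorrect pre-update belief; both switch to the true goal at tau.
for pre in (1.0, 2.0):
    total, _ = play(x0=0.0, belief_pre=pre, belief_post=1.0, true_goal=1.0,
                    leader_goal=-1.0, tau=3, horizon=8)
    print(pre, total)
```

The sketch only shows the mechanics of planning, responding, and re-planning at $\tau$. Which belief yields the lower total cost depends on the parameters, and the example makes no claim about reproducing the paper's numerical results.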

Paper Structure

This paper contains 20 sections, 2 theorems, 52 equations, 6 figures.

Key Result

Proposition 1

Consider the state trajectories $x^1_{0:\tau}$ and $x^\star_{0:\tau}$ obtained when the leader holds BR belief $b^1$ and $b^\star$, respectively, prior to the BR update at $\tau < T$ in the OL information setting. If $x^1_{\tau} = x^\star_{\tau}$, then

Figures (6)

  • Figure 1: Dynamics of Stackelberg game with BR update.
  • Figure 2: Percent of simulations with lowest cost achieved by each BR belief under OL information structure when $\tau = 1$.
  • Figure 3: Percentage of simulations where each BR belief obtains the lowest cost for $\tau = 1, 2, 5, 10, 20$ under OL information structure.
  • Figure 4: Percent of simulations with lowest cost achieved by BR beliefs under FB information structure and $\tau=1$.
  • Figure 5: Percentage of simulations where each BR belief obtains the lowest cost for $\tau = 1, 2, 5, 10, 20$ under FB information structure.
  • ...and 1 more figure

Theorems & Definitions (10)

  • Definition 1: Stackelberg Dynamic Game
  • Definition 2: SE Time-inconsistency [basar_time_1989]
  • Proposition 1
  • Proof
  • Remark 1
  • Example 1
  • Definition 3: Markov Perfect FSE (MPFSE) [fudenberg1991game]
  • Proposition 2
  • Proof
  • Example 2