Deception in Linear-Quadratic Control

Yerin Kim, Haosheng Zhou, Alexander Benvenuti, Ruimeng Hu, Matthew Hale

Abstract

Systems operating in adversarial environments may inadvertently leak sensitive information to adversaries. To address this challenge, we revisit the linear-quadratic control framework and introduce deception to actively mislead adversaries. Specifically, we consider a blue-team agent, observed by a red-team agent, that seeks to minimize a quadratic cost while introducing perturbations to its trajectories over time. These perturbations are designed to corrupt the red team's observations and, consequently, any downstream inferences, while remaining undetected by a red team using sequential hypothesis testing. We implement this idea by augmenting the blue team's quadratic cost with a likelihood ratio statistic. Under this augmented control problem, we derive a semi-explicit solution for the optimal deceptive control law and establish corresponding well-posedness results. In addition, we provide both numerical approximations and analytical bounds for the probability that the red team detects the blue team's deceptive strategies. Numerical results demonstrate the effectiveness of the proposed framework in deceiving the red team while remaining undetected with probability near 1.

Paper Structure

This paper contains 23 sections, 8 theorems, 62 equations, 4 figures.

Key Result

Lemma 1

In SHT, a "type I error" refers to rejecting $\mathsf{H}_0$ when $\mathsf{H}_0$ is true, and a "type II error" refers to accepting $\mathsf{H}_0$ when $\mathsf{H}_1$ is true. Suppose one sets the SPRT thresholds for given $a,b\in(0,1/2)$ and implements the SPRT according to the decision rule in Definition 1. Then the probabilities of type I and type II errors are bounded by $a$ and $b$, respectively, i.e., …
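To make the lemma concrete, here is a minimal Python sketch of an SPRT. The threshold formula is not visible in this excerpt, so the sketch assumes the conservative choice $\log U = \log(1/a)$ and $\log L = \log b$, which is one standard choice known to keep the type I and type II error probabilities below $a$ and $b$. It may differ from the thresholds used in the paper. The function names `sprt`, `gaussian_log_lr`, and `type_i_rate`, and the Gaussian mean-shift hypotheses, are illustrative assumptions rather than the authors' setup.

```python
import math
import random


def sprt(samples, log_lr, a, b):
    """Run an SPRT on a stream of samples.

    Assumed (not from the source) thresholds: log U = log(1/a), log L = log(b).
    Returns (decision, number_of_samples_used).
    """
    log_upper = math.log(1.0 / a)
    log_lower = math.log(b)
    stat = 0.0  # cumulative log-likelihood ratio of H1 against H0
    n = 0
    for n, x in enumerate(samples, start=1):
        stat += log_lr(x)
        if stat >= log_upper:
            return "reject_H0", n
        if stat <= log_lower:
            return "accept_H0", n
    return "undecided", n  # data ran out before either threshold was crossed


def gaussian_log_lr(x, mu0=0.0, mu1=1.0, sigma=1.0):
    """Log-likelihood ratio of N(mu1, sigma^2) against N(mu0, sigma^2) for one sample."""
    return ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2)


def type_i_rate(a, b, trials=2000, horizon=500, seed=0):
    """Monte Carlo estimate of P(reject H0) when H0: N(0, 1) is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        data = (rng.gauss(0.0, 1.0) for _ in range(horizon))
        decision, _ = sprt(data, gaussian_log_lr, a, b)
        rejections += decision == "reject_H0"
    return rejections / trials
```

Calling `type_i_rate(0.05, 0.05)` gives an empirical false-alarm rate that can be compared against the bound $a = 0.05$. In the deception setting, the analogous quantity is the probability that the red team's statistic crosses the upper threshold.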

Figures (4)

  • Figure 3: Comparison of trajectories, primary costs, and deception measures under different values of $\lambda$. Trajectories are generated by identical realizations of $\{\mathbf{w}_t\}_{t\in[T-1]}$.
  • Figure 4: Approximations, $95\%$-confidence intervals, and analytical bounds for $\mathbb{P}(\log L^*_t\geq\log U)$ in the stealthiness constraint \eqref{eq:stealthiness_condition}. In the left panel, solid lines represent sampling-based approximations (computed from $20,000$ samples), and colored areas represent $95\%$-confidence intervals \eqref{eq:score_confidence_interval}.
  • Figure 5: Comparison of trajectories generated by $\lambda = 0$ and $\lambda = 0.04$. Blue lines represent mean baseline trajectories; red lines represent mean trajectories under $\lambda = 0.04$; gray lines represent $500$ independent trajectories under $\lambda = 0.04$.
  • Figure 6: Trade-off among detection probability, deception measure, and primary cost under different detection tolerances $\varepsilon$. Here, $\lambda$ is chosen as the largest value (identified via the sampling-based approach) that satisfies the stealthiness constraint \eqref{eq:stealthiness_condition}.
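The caption of Figure 4 pairs sampling-based estimates of a detection probability with $95\%$ score confidence intervals. As a hedged illustration, the sketch below computes a Wilson score interval for a binomial proportion, assuming that is the interval meant by "score confidence interval". The function name `score_interval` and its arguments are hypothetical, and the counts in the usage note are not results from the paper.

```python
import math


def score_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion.

    successes: number of sampled trajectories on which the event occurred
    n:         total number of sampled trajectories
    z:         standard normal quantile (default corresponds to 95% coverage)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie in [0, n]")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    # Clip to [0, 1] to guard against floating-point overshoot at the boundaries.
    return max(0.0, center - half_width), min(1.0, center + half_width)
```

For example, `score_interval(k, 20000)` returns the interval for an event observed `k` times in $20,000$ samples. Unlike the plain normal-approximation interval, the score interval stays informative when `k` is zero or very small, which matters when the estimated detection probability is close to 0.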

Theorems & Definitions (15)

  • Definition 1: Sequential Probability Ratio Test (SPRT) [wald1992sequential]
  • Lemma 1: SPRT Thresholds [wald1992sequential]
  • Remark 1
  • Theorem 1: Global well-posedness
  • proof
  • Corollary 1
  • proof
  • Lemma 2: Score Confidence Interval [agresti1998approximate]
  • Lemma 3
  • proof
  • ...and 5 more