Optimal sensor deception in stochastic environments with partial observability to mislead a robot to a decoy goal

Hazhar Rahmani, Mukulika Ghosh, Syed Md Hasnayeen

TL;DR

This work addresses how an adversary can mislead a robot to a decoy goal by offline sensor alterations within a cost budget in a stochastic, partially observable environment modeled as a POMDP with a finite-state controller. The authors prove OSA_DGM is NP-hard via a reduction from the $0/1$ Knapsack problem and propose a Mixed Integer Linear Programming (MILP) formulation to synthesize optimal deception strategies, validated through case studies including a reduction instance and a grid-world scenario. The MILP introduces binary alteration variables and linearized Bellman constraints, enabling scalable computation of optimal or near-optimal sensor-deception plans. The results reveal vulnerability patterns in sensor networks and provide a principled method for designing deception strategies under budget constraints, with potential applications in security and adversarial multi-agent contexts.

Abstract

Deception is a common strategy adopted by autonomous systems in adversarial settings. Existing deception methods primarily focus on increasing opacity or misdirecting agents away from their goal or itinerary. In this work, we propose a deception problem that aims to mislead the robot towards a decoy goal by altering sensor events under a constrained alteration budget. The environment, along with the robot's interaction with it, is modeled as a Partially Observable Markov Decision Process (POMDP), and the robot's action selection is governed by a Finite State Controller (FSC). Given a constrained budget for sensor event modifications, the objective is to compute a sensor alteration that maximizes the probability of the robot reaching a decoy goal. We establish the computational hardness of the problem by a reduction from the $0/1$ Knapsack problem and propose a Mixed Integer Linear Programming (MILP) formulation to compute optimal deception strategies. We show the efficacy of our MILP formulation via a sequence of experiments.
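To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the authors' MILP formulation. It assumes a toy model invented for illustration: deterministic sensor readings, a one-node controller, and an alteration defined as a map that replaces one reading with another at a fixed cost. Every name and number in it (`TRANS`, `OBS`, `FSC`, `COSTS`, the states, the spoofed reading `z`) is hypothetical.

```python
from itertools import combinations

# Hypothetical toy model; none of these names or numbers come from the paper.
TRANS = {  # TRANS[state][action] -> {next_state: probability}
    "s0": {"a": {"s1": 1.0}, "b": {"decoy": 0.5, "s1": 0.5}},
    "s1": {"a": {"goal": 1.0}, "b": {"decoy": 0.9, "goal": 0.1}},
}
OBS = {"s0": "y0", "s1": "y1"}  # assumed deterministic reading per state
# FSC[(node, reading)] -> (action, next_node); a single node for brevity
FSC = {
    ("n0", "y0"): ("a", "n0"),
    ("n0", "y1"): ("a", "n0"),
    ("n0", "z"): ("b", "n0"),
}
# Candidate alterations: (true reading, spoofed reading) -> cost
COSTS = {("y0", "z"): 2, ("y1", "z"): 3}


def decoy_probability(alteration, start=("s0", "n0"), sweeps=200):
    """Probability of reaching 'decoy' in the chain induced by the model,
    the controller, and the altered readings (fixed number of sweeps)."""
    nodes = {node for node, _ in FSC}
    value = {(s, n): 0.0 for s in TRANS for n in nodes}

    def lookup(state, node):
        if state == "decoy":
            return 1.0  # absorbing target
        if state == "goal":
            return 0.0  # absorbing, decoy no longer reachable
        return value[(state, node)]

    for _ in range(sweeps):
        for (s, n) in value:
            seen = alteration.get(OBS[s], OBS[s])  # reading after tampering
            action, nxt = FSC[(n, seen)]
            value[(s, n)] = sum(
                p * lookup(t, nxt) for t, p in TRANS[s][action].items()
            )
    return value[start]


def best_alteration(budget):
    """Enumerate every affordable set of replacements and keep the best."""
    best = ({}, decoy_probability({}))
    pairs = list(COSTS)
    for k in range(1, len(pairs) + 1):
        for subset in combinations(pairs, k):
            if sum(COSTS[p] for p in subset) > budget:
                continue  # over budget
            if len({src for src, _ in subset}) < k:
                continue  # a reading may be replaced at most once
            alteration = dict(subset)
            prob = decoy_probability(alteration)
            if prob > best[1]:
                best = (alteration, prob)
    return best
```

In this toy instance, `best_alteration(3)` selects the single replacement of `y1` by `z`, because spoofing both readings would cost 5 and exceed the budget. Enumeration is exponential in the number of candidate replacements, which is consistent with the hardness result and is the reason an exact solver formulation is attractive.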

Paper Structure

This paper contains 10 sections, 4 theorems, 11 equations, 4 figures.

Key Result

Lemma 1

OSA_DGM-DEC $\in$ NP.

Figures (4)

  • Figure 1: An example of sensor deception. The agent in the bottom-left cell is tasked to navigate to the goal in the top-right cell. Because of stochasticity in the robot's dynamics, the current state, the position of the robot, is not observable to the robot. The range sensors provide partial observability. The system will alter the sensors to mislead the agent to the decoy goal state in the middle cell containing the security.
  • Figure 2: (a) An instance of the 0/1 knapsack problem. There are $5$ items with weights $W = [1, 2, 3, 4, 5]$ and values $V = [20, 30, 40, 50, 60]$. The capacity of the knapsack is $7$. (b) Optimal solution to the instance of the 0/1 knapsack problem. The knapsack's total weight is $7$, and the total value is $100$. (c) The POMDP of the instance of our problem, the OSA_DGM-DEC problem, constructed by our reduction for the instance of 0/1-KNAPSACK-DEC in Part (a) of this figure. The solid edges are transitions that take place with action $a$ and the dashed edges are transitions for action $b$. All transitions without probability labels have probability $1$; those labels are omitted to reduce visual clutter. States $s_{\bot}$ and $s_{\top}$ are absorbing states, and their outgoing transitions are likewise omitted. (d) The finite-state controller of the instance of our problem constructed by our reduction for the instance of 0/1-KNAPSACK-DEC in Part (a) of this figure. All the missing transitions enter $n_2$ and choose action $b$.
  • Figure 3: Top-left) A grid environment guarded by $7$ range sensors $s_0$ through $s_7$. The robot is tasked to deliver an item from the starting location $(0, 0)$ to the goal location, $(4, 4)$. Cell $(2, 2)$ is hazardous and must be avoided. That cell is considered a decoy, and the attacker's purpose is to mislead the robot to that cell. Top-right) A finite-state controller the robot uses. Bottom-left) The robot's dynamics when it performs action N, standing for going North. Bottom-right) The robot's dynamics when it performs action E, standing for going East.
  • Figure 4: Results of our scalability experiment for grids similar to the grid in Figure 3 (top-left) of size $n \times n$, $n \in \{5, 15, 25, 35, 45\}$. Note that for the top two graphs, the values on the y-axis are in millions (e.g., the instance for $n = 45$ has more than $1.6$ million variables and more than $6.2$ million constraints).
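
The knapsack instance in Figure 2(a) is small enough to check directly. The sketch below uses the textbook dynamic program for 0/1 knapsack, not anything taken from the paper's reduction; the function name `knapsack` and its return shape are choices made here for illustration.

```python
def knapsack(weights, values, capacity):
    """Return (best value, chosen item indices) for a 0/1 knapsack instance."""
    # best[c] holds the best (value, items) found using total weight at most c.
    best = [(0, [])] * (capacity + 1)
    for i, (w, v) in enumerate(zip(weights, values)):
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            candidate = best[c - w][0] + v
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - w][1] + [i])
    return best[capacity]


# Instance from Figure 2(a): five items, capacity 7.
W = [1, 2, 3, 4, 5]
V = [20, 30, 40, 50, 60]
value, items = knapsack(W, V, 7)
```

For this instance the program selects the items with weights 1, 2, and 4, which matches the total weight of 7 and total value of 100 reported in Figure 2(b).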

Theorems & Definitions (7)

  • Lemma 1
  • proof
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Corollary 1