Partially Observable Stochastic Games with Neural Perception Mechanisms

Rui Yan; Gabriel Santos; Gethin Norman; David Parker; Marta Kwiatkowska

TL;DR

This paper addresses multi-agent decision making under partial observability where perception is provided by neural classifiers. The authors define one-sided NS-POSGs, prove continuity and convexity of the value function $V^{\star}$, and show that it admits a finite polyhedral, piecewise-linear-convex representation under mild assumptions. They introduce one-sided NS-HSVI, a heuristic search value iteration algorithm that uses P-PWLC representations and NN pre-image-based polyhedra, with particle beliefs and LP-based backups. Empirical studies on pedestrian-vehicle and pursuit-evasion scenarios demonstrate the method's ability to synthesize strategies and to analyze how perception precision affects safety and performance.

Abstract

Stochastic games are a well-established model for multi-agent sequential decision making under uncertainty. In practical applications, though, agents often have only partial observability of their environment. Furthermore, agents increasingly perceive their environment using data-driven approaches such as neural networks trained on continuous data. We propose the model of neuro-symbolic partially-observable stochastic games (NS-POSGs), a variant of continuous-space concurrent stochastic games that explicitly incorporates neural perception mechanisms. We focus on a one-sided setting with a partially-informed agent using discrete, data-driven observations and another, fully-informed agent. We present a new method, called one-sided NS-HSVI, for approximate solution of one-sided NS-POSGs, which exploits the piecewise constant structure of the model. Using neural network pre-image analysis to construct finite polyhedral representations and particle-based representations for beliefs, we implement our approach and illustrate its practical applicability to the analysis of pedestrian-vehicle and pursuit-evasion scenarios.
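To make the combination of piecewise constant structure and particle-based beliefs more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical one-dimensional state space split into three interval regions (stand-ins for the polyhedra obtained from pre-image analysis), a finite set of functions that are constant on each region, and a belief given as weighted particles. The names `BREAKPOINTS`, `region_index`, `evaluate_alpha`, `lower_bound` and `mix` are illustrative only. Taking the maximum of finitely many such expectations gives a function of the belief that is piecewise linear and convex.

```python
from bisect import bisect_right

# Hypothetical 1-D example: the state space is split into three regions
# by these breakpoints (stand-ins for NN pre-image polyhedra).
BREAKPOINTS = [1.0, 2.0]


def region_index(state):
    """Return the index of the region that contains `state`."""
    return bisect_right(BREAKPOINTS, state)


def evaluate_alpha(alpha, belief):
    """Expected value of a piecewise constant function under a particle belief.

    `alpha` holds one constant per region; `belief` is a list of
    (state, weight) particles whose weights sum to 1.
    """
    return sum(w * alpha[region_index(s)] for s, w in belief)


def lower_bound(alphas, belief):
    """Pointwise maximum over a finite set of piecewise constant functions."""
    return max(evaluate_alpha(a, belief) for a in alphas)


def mix(b1, b2, lam):
    """Convex combination lam * b1 + (1 - lam) * b2 of two particle beliefs."""
    return [(s, lam * w) for s, w in b1] + [(s, (1 - lam) * w) for s, w in b2]


# Two illustrative functions, each constant on the three regions.
alphas = [[1.0, 0.0, -1.0], [-1.0, 0.0, 2.0]]
b_left = [(0.5, 1.0)]    # all mass in the first region
b_right = [(2.5, 1.0)]   # all mass in the third region
b_mid = mix(b_left, b_right, 0.5)

v_left = lower_bound(alphas, b_left)
v_right = lower_bound(alphas, b_right)
v_mid = lower_bound(alphas, b_mid)
```

In this toy instance the value at the mixed belief does not exceed the average of the values at the two endpoint beliefs, which is the convexity property the paper proves for the value function.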

Paper Structure (11 sections, 8 theorems, 6 equations, 3 figures, 2 algorithms)

Introduction
Background
One-Sided Neuro-Symbolic POSGs
Values of One-Sided NS-POSGs
P-PWLC Value Iteration
Heuristic Search Value Iteration for NS-POSGs
Lower and Upper Bound Representations
One-Sided NS-HSVI
Belief Representation and Computations
Experimental Evaluation
Conclusions

Key Result

Theorem 1

For $s_1 \in S_1$, $V^{\star}(s_1, \cdot) : \mathbb{P}(S_E) \to \mathbb{R}$ is convex and continuous, and for $b_1, b_1' \in \mathbb{P}(S_E)$:

$$|V^{\star}(s_1, b_1) - V^{\star}(s_1, b_1')| \leq K(b_1, b_1')$$

where $K(b_1, b_1') = \frac{1}{2} (U - L) \int_{s_E \in S_E^{s_1}} | b_1(s_E) - b_1'(s_E) | \, \mathrm{d}s_E$.
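As a rough illustration of the continuity bound, the sketch below evaluates $K$ for beliefs with finite support, where the integral reduces to a sum. It rests on two assumptions not confirmed by this excerpt: that the integrand is the absolute difference between the two beliefs, and that $U$ and $L$ are given constants. The function name `continuity_bound` and the dictionary encoding of beliefs are hypothetical.

```python
def continuity_bound(b1, b2, upper, lower):
    """Finite-support analogue of K(b1, b2).

    Computes 0.5 * (U - L) * sum over states of |b1(s) - b2(s)|, where
    `b1` and `b2` map states to probabilities. `upper` and `lower` play
    the role of U and L and are treated as given constants (assumption).
    """
    support = set(b1) | set(b2)
    l1_distance = sum(abs(b1.get(s, 0.0) - b2.get(s, 0.0)) for s in support)
    return 0.5 * (upper - lower) * l1_distance


# Illustrative beliefs over three abstract states 0, 1 and 2.
b = {0: 0.5, 1: 0.5}
b_shifted = {1: 0.5, 2: 0.5}
b_disjoint = {2: 1.0}

k_same = continuity_bound(b, b, 10.0, -10.0)
k_shifted = continuity_bound(b, b_shifted, 10.0, -10.0)
k_disjoint = continuity_bound({0: 1.0}, b_disjoint, 10.0, -10.0)
```

Under these assumptions the bound is zero for identical beliefs and reaches $U - L$ when the two beliefs have disjoint support, so it never claims more than the full range between the two constants.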

Figures (3)

Figure 1: Pedestrian-vehicle example. Left: Positions of two agents. Middle: Sample images from the PIE dataset AR-IK-TK-JKT:19. Right: Slices of learnt perception function, where $(x_1,y_1),(x_2,y_2)$ are two successive (relative) positions of the pedestrian.
Figure 2: Simulations of strategies for the pursuer, showing actual location (red), perceived location (blue), belief of evader location (green) and strategy (pink) for two different NN perception functions: (a) more precise; (b) coarser.
Figure 3: Simulations of strategies for the vehicle, plotted as the pedestrian's current position $(x_2,y_2)$ relative to it. Also shown: perceived pedestrian intention (green/yellow/red = unlikely/likely/very likely to cross), current speed (orange), acceleration (black) and crash region (shaded purple region).

Theorems & Definitions (14)

Definition 1: NS-POSG
Definition 2: Semantics
Theorem 1: Convexity and continuity
Definition 3: Minimax
Definition 4: Maxsup
Theorem 2: Operator equivalence and fixed point
Definition 5: PWC function
Definition 6: P-PWLC function
Lemma 1: LP for minimax and P-PWLC
Theorem 3: P-PWLC closure and convergence
...and 4 more
