Distributionally Robust Inverse Reinforcement Learning for Identifying Multi-Agent Coordinated Sensing

Luke Snow; Vikram Krishnamurthy

TL;DR

A minimax distributionally robust inverse reinforcement learning (IRL) algorithm is derived to reconstruct the utility functions of a multi-agent sensing system, and this robust estimation is proved equivalent to a semi-infinite optimization reformulation.

Abstract

We derive a minimax distributionally robust inverse reinforcement learning (IRL) algorithm to reconstruct the utility functions of a multi-agent sensing system. Specifically, we construct utility estimators which minimize the worst-case prediction error over a Wasserstein ambiguity set centered at noisy signal observations. We prove the equivalence between this robust estimation and a semi-infinite optimization reformulation, and we propose a consistent algorithm to compute solutions. We illustrate the efficacy of this robust IRL scheme in numerical studies to reconstruct the utility functions of a cognitive radar network from observed tracking signals.
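
For orientation, the generic form of a Wasserstein distributionally robust estimation problem can be written as below. This is a sketch of the standard template only, not the paper's exact formulation: the parameter $\theta$, feasible set $\Theta$, loss $\ell$, samples $\hat{\xi}_i$, empirical distribution $\hat{P}_N$, and Wasserstein distance $W$ are notation assumed here for illustration; only the radius $\epsilon$ appears in the source.

$$
\hat{\theta} \in \arg\min_{\theta \in \Theta} \; \sup_{Q \in \mathbb{B}_\epsilon(\hat{P}_N)} \mathbb{E}_{\xi \sim Q}\big[\ell(\theta, \xi)\big],
\qquad
\mathbb{B}_\epsilon(\hat{P}_N) = \big\{ Q : W(Q, \hat{P}_N) \le \epsilon \big\},
\qquad
\hat{P}_N = \frac{1}{N} \sum_{i=1}^{N} \delta_{\hat{\xi}_i}.
$$

Under this reading, the inner supremum is the worst-case prediction error over the ambiguity set centered at the noisy observations, and the outer minimization selects the estimator. Because the supremum ranges over infinitely many distributions, a reformulation with infinitely many constraints, such as a semi-infinite program, is a natural target.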

Paper Structure (12 sections, 3 theorems, 13 equations, 1 figure, 1 table, 1 algorithm)

This paper contains 12 sections, 3 theorems, 13 equations, 1 figure, 1 table, 1 algorithm.

Introduction
Coordinated Sensing Systems
Coordination Detection and Utility Reconstruction
Main Result I. Robust Utility Estimation
Quantifying the Proximity to Optimality
Utility Reconstruction: Naive Approach
Utility Reconstruction: Robust Approach
Main Result II. IRL Algorithm for Robust Utility Estimation
Semi-Infinite Programming Reformulation
Finite Reduction and Algorithmic Solution
Numerical Example
Conclusions

Key Result

Theorem 1

Let $\mathcal{D}$ be a set of observations. The following are equivalent:

Figures (1)

Figure 1: Average convergence of Algorithm 1 for varying Wasserstein radii $\epsilon$, over 100 Monte-Carlo simulations.

Theorems & Definitions (8)

Definition 1: Multi-agent Bayesian Sensing System
Definition 2: Coordinated Sensing System
Theorem 1
proof
Corollary 1
proof
Theorem 2: Semi-Infinite Reformulation
proof
