A Model for Multi-Agent Heterogeneous Interaction Problems

Christopher D. Hsu; Mulugeta A. Haile; Pratik Chaudhari

TL;DR

This paper introduces a model for multi-agent interaction problems to understand how a heterogeneous team of defender agents should organize its resources to tackle a heterogeneous team of attackers. It shows that the defender team can optimally counteract a heterogeneous attacker team using very few types of defender agents, thereby minimizing its resources.

Abstract

We introduce a model for multi-agent interaction problems to understand how a heterogeneous team of agents should organize its resources to tackle a heterogeneous team of attackers. This model is inspired by how the human immune system tackles a diverse set of pathogens. The key property of this model is a "cross-reactivity" kernel which enables a particular defender type to respond strongly to some attacker types but weakly to a few different types of attackers. We show how, due to such cross-reactivity, the defender team can optimally counteract a heterogeneous attacker team using very few types of defender agents, and thereby minimize its resources. We study this model in different settings to characterize a set of guiding principles for control problems with heterogeneous teams of agents, e.g., how sensitive the harm is to sub-optimal defender distributions, and how competition between defenders gives near-optimal behavior using decentralized computation of the control. We also compare this model with existing approaches including reinforcement-learned policies, perimeter defense, and coverage control.
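To make the cross-reactivity idea concrete, here is a minimal sketch of a kernel $f_{d,a}$ on a discretized shape space. The Gaussian form of the kernel is an assumption made for illustration; this page only states that the kernel has a bandwidth $\sigma$. The function names `shape_space`, `cross_reactivity`, and `recognition_probability` are hypothetical.

```python
import math

def shape_space(n_types, lo=0.0, hi=1.0):
    """Centers of n_types equally spaced types on [lo, hi]."""
    dx = (hi - lo) / n_types
    return [lo + (i + 0.5) * dx for i in range(n_types)]

def cross_reactivity(x_d, x_a, sigma):
    """Assumed Gaussian kernel: probability that a defender of type x_d
    successfully interacts with an attacker of type x_a."""
    return math.exp(-((x_d - x_a) ** 2) / (2.0 * sigma ** 2))

def recognition_probability(p_d, f, a):
    """Probability that a defender drawn at random from p_d recognizes
    attacker type a (law of total probability over defender types)."""
    return sum(p_d[d] * f[d][a] for d in range(len(p_d)))

# Values taken from the figure captions: N = 200 types, bandwidth 0.05.
xs = shape_space(200)
sigma = 0.05
f = [[cross_reactivity(xd, xa, sigma) for xa in xs] for xd in xs]

# A uniform defender team versus a team concentrated on a single type.
uniform = [1.0 / len(xs)] * len(xs)
single = [0.0] * len(xs)
single[100] = 1.0
```

Under this assumed kernel, a single defender type responds strongly to nearby attacker types and only weakly to distant ones, which is why a few well-placed defender types can cover many attacker types.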

Paper Structure

This paper contains 21 sections, 8 equations, 7 figures, and 2 tables.

Introduction
Contributions
Related Work
Problem formulation
The model for interactions between attackers and defenders
Shape/State space
Composition of the team
Interactions between attackers and defenders of different types
Harm caused by an attacker type
Minimizing the harm
Numerical simulations of the model
Simulating interaction episodes between attackers and defenders
The optimal defender team has a finite number of defender types
Gaussian attacker and defender distributions
Non-Gaussian attacker and defender distributions
...and 6 more sections

Figures (7)

Figure 1: Orange defenders from distribution $P_d$ successfully interact with blue attackers from distribution $Q_a$ with probability $f_{d,a}$ which depends on the defender type $d$ and the attacker type $a$.
Figure 2: A simulation of $\sum_a N_a=100$ attackers sampled from a Gaussian $Q_a$ with $\sigma_Q=0.1$ interacts with $\sum_d N_d=100$ defenders sampled from $P_d^*$ for different values of $\sigma_P$ in a shape space $x \in [0,1]$ with $N=50$ types ($\Delta x = 0.02$). Left: For $\alpha=1$, when $\sigma_P \geq \sigma_Q\sqrt{2}$ the optimal $P_d^*$ which is a Gaussian tends towards a Dirac delta distribution at the origin. Right: In the simulation, as we increase $\sigma_P$ beyond $\sigma_Q \sqrt{2}$, the empirical harm, i.e., the average unsuccessful interactions until all attackers are recognized, decreases (blue). The number of distinct defender types also decreases (orange).
Figure 3: Left: Simulation of the interaction of defenders (orange) with attackers from distribution $Q_a$ (blue). The optimal defender distribution $P_d^*$ (orange) is found by optimizing \ref{eq:harm}. Cross-reactivity $f_{d,a}$ with bandwidth $\sigma = 0.05$ in a shape space $x$ with $N=200$ types ($\Delta x = 0.005$) leads to a discrete distribution. The harm $Q_a \bar{F}_a$ caused by attackers of different types (green) is uniform across the domain. Right: The harm incurred using a non-optimal $P_d$ increases as the Wasserstein distance between $P_d$ and $P_d^*$ increases. To obtain this plot, we sampled 1000 different $P_d$s (by perturbing the optimal $P_d^*$ using log-normal noise) and computed the empirical and analytical harm against a fixed $Q_a$. This also indicates that the analytical harm \ref{eq:Fbar*} is close to the mean of the empirical harm over 100 episodes of our experiments using \ref{eq:Fbar} for a broad regime.
Figure 4: Convergence to near-optimal harm with competition dynamics. We run the population dynamics in \ref{eq:competition} to calculate the defender distribution $P_d(t)$ starting from a uniform $P_d(0)$. On the left, we compare the optimal defender distribution (blue) calculated using \ref{eq:harm} for a known $Q_a$ ($Q_a$ is the same log-normal distribution sampled in \ref{fig:case0}) with the defender distribution calculated using the competition dynamics (blue) and an estimated $\hat{Q}_a$ from attacker-defender interactions. On the right, we show how the empirical harm (orange batched boxplot) incurred by the competition dynamics distribution $P_d(\hat{Q}_a)$ converges towards the analytical harm (blue) and standard deviation shrinks as time progresses. For this experiment, the dynamics were run for $10^4$ iterations per episode, with time in between interactions $\Delta t = 0.2c^{-1}$, decommission rate $c=0.001$, cross-reactivity bandwidth $\sigma = 0.05$, and $N=200$ types in the shape space.
Figure 5: Defender distribution $P_d$ learned by SAC at the end of an episode after competing for interactions with attackers sampled from $Q_a$ (the optimal $P_d^*$ is in blue and $Q_a$ is the same as in \ref{fig:case0}). We sampled $\sum_d N_d=100$ agents from a uniform distribution that shift states to perform recognition. On the right we compare the test harm of the defender distribution $P_d$ learned by SAC over training epochs to the optimal harm ($10^4$ iterations per epoch for a total of $10^6$ interactions). The cross-reactivity bandwidth is $\sigma = 0.05$ and there are $N=200$ types.
...and 2 more figures
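
The caption of Figure 3 describes perturbing $P_d^*$ with log-normal noise and measuring the Wasserstein distance to the perturbed $P_d$. The sketch below shows one way this could be done on a uniform one-dimensional grid. The multiplicative form of the noise, the `scale` parameter, and the helper names are assumptions; the exact procedure used in the paper is not given on this page. The distance uses the standard identity that the 1-Wasserstein distance in one dimension equals the integral of the absolute difference of the two CDFs.

```python
import math
import random

def normalize(weights):
    """Rescale non-negative weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def perturb_lognormal(p, scale, rng):
    """Multiply each entry by exp(scale * z) with z ~ N(0, 1), then
    renormalize. Entries that are zero stay zero (an assumed choice)."""
    return normalize([v * math.exp(scale * rng.gauss(0.0, 1.0)) for v in p])

def wasserstein_1d(p, q, dx):
    """1-Wasserstein distance between two distributions on the same
    uniform grid with spacing dx: dx * sum |CDF_p - CDF_q|."""
    dist, cdf_p, cdf_q = 0.0, 0.0, 0.0
    for p_i, q_i in zip(p, q):
        cdf_p += p_i
        cdf_q += q_i
        dist += abs(cdf_p - cdf_q) * dx
    return dist

# A hypothetical discrete P_d^* with three occupied types out of N = 200.
n_types, dx = 200, 0.005
p_star = [0.0] * n_types
p_star[40], p_star[100], p_star[160] = 0.3, 0.4, 0.3

rng = random.Random(0)
p_perturbed = perturb_lognormal(p_star, 0.5, rng)
distance = wasserstein_1d(p_star, p_perturbed, dx)
```

Repeating the perturbation with different seeds and noise scales would give a family of sub-optimal defender distributions whose distance to $P_d^*$ can be plotted against the resulting harm.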
