Shaping Multi-Robot Patrol Performance with Heterogeneity in Individual Learning Behavior

Connor York; Zachary R Madin; Paul O'Dowd; Edmund R Hunt

TL;DR

This work examines how heterogeneity in latent inhibition (LI) among patrolling robots shapes collective performance in dynamic environments. It introduces a simple LI-based re-scan rule, $p_r = 1 - \text{LI}$, where $p_r$ is the probability of re-scanning a location previously found unrewarding, and evaluates homogeneous and heterogeneous LI compositions under static and changing reward patterns, with and without inter-robot communication. Across simulations and experiments, a negatively skewed LI distribution (mostly high-LI robots with a single low-LI agent) often delivers the best performance in dynamic environments, especially when communication spreads newly learned information. The findings point to functional heterogeneity as a promising design principle for swarm robotics, enabling adaptive exploration–exploitation trade-offs and more robust anomaly detection in changing environments.

Abstract

Individual differences in learning behavior within social groups, whether in humans, other animals, or robots, can have significant effects on collective task performance, because such differences affect how individuals respond to the environment and interact with each other. In recent years there has been rising interest in how individual differences, whether in learning or other traits, affect collective outcomes; social insect foraging behavior is one well-studied example. Multi-robot 'swarm' systems have a heritage of bioinspiration from such examples, and here we consider whether heterogeneity in a learning behavior called latent inhibition (LI) may be useful for a team of patrolling robots tasked with environmental monitoring and anomaly detection. Individuals with high LI can be seen as better at learning to be inattentive to irrelevant or unrewarding stimuli, while low LI individuals might be seen as 'distractible' and yet, more positively, as more exploratory. We introduce a simple model of the effects of LI as the probability of re-searching a location for a reward (an anomalous reading) where it has previously been found to be unrewarding (irrelevant). In simulated patrols, we find that a negatively skewed distribution, with mostly high LI robots and just a single low LI robot, is collectively most effective at monitoring dynamic environments. These results are an example of 'functional heterogeneity' in 'swarm engineering' and could inform predictions for ecological distributions of learning traits within social groups.
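
As an illustration only, the re-scan rule $p_r = 1 - \text{LI}$ and a negatively skewed team composition might be sketched as below. This is a minimal sketch, not the authors' implementation: the function names, the assumption that LI lies in $[0, 1]$, and the example LI values of 0.9 and 0.1 are all hypothetical choices made for this example.

```python
import random


def rescan_probability(li: float) -> float:
    """Probability of re-scanning a location previously found unrewarding.

    Implements p_r = 1 - LI. Restricting LI to [0, 1] is an assumption
    of this sketch so that p_r is a valid probability.
    """
    if not 0.0 <= li <= 1.0:
        raise ValueError("LI is assumed to lie in [0, 1]")
    return 1.0 - li


def should_rescan(li: float, rng: random.Random) -> bool:
    """Draw one Bernoulli sample: True means the robot re-scans."""
    return rng.random() < rescan_probability(li)


def negatively_skewed_team(n: int, high_li: float = 0.9,
                           low_li: float = 0.1) -> list[float]:
    """Return LI values for n robots: n - 1 high-LI and one low-LI.

    The default values 0.9 and 0.1 are placeholders, not values
    taken from the paper.
    """
    if n < 1:
        raise ValueError("a team needs at least one robot")
    return [high_li] * (n - 1) + [low_li]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    team = negatively_skewed_team(6)
    for robot_id, li in enumerate(team):
        decision = should_rescan(li, rng)
        print(f"robot {robot_id}: LI={li}, p_r={rescan_probability(li):.2f}, "
              f"rescan={decision}")
```

Under this sketch, a robot with LI = 1 never revisits an unrewarding location, while a robot with LI = 0 always does, which is the sense in which the low-LI robot acts as the team's explorer.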

Paper Structure (10 sections, 1 equation, 3 figures)

Introduction
Background
Latent Inhibition and Attention to Relevant Information
Environmental Monitoring and Multi-Robot Patrol
Methods
Latent inhibition model and patrol strategy
Agent-based robot simulations
Experimental trials
Results
Discussion

Figures (3)

Figure 1: Map and patrol route (dark green) used during the simulations. Stars represent the robots' starting positions, and circles represent the nodes. Colored lines show the robots' paths onto the patrol route for $N=6$. The map was obtained from 2D lidar mapping of our own office environment.
Figure 2: Average system reward for $N\in\{1,2,4,6\}$ robots by group LI composition. White is communication disabled, blue is communication enabled. Top row: static reward environment. Middle row: 1 environment shift. Bottom row: 3 environment shifts.
Figure 3: Average system reward over time for 6 robots. Top: Static reward environment, communication enabled. Middle: 3 environment shifts, communication disabled. Bottom: 3 environment shifts, communication enabled.
