On Swarm Leader Identification using Probing Policies

Stergios E. Bachoumas; Panagiotis Artemiadis

TL;DR

This work addresses the problem of identifying a swarm's leader when the swarm is only partially observable and an adversarial agent must actively probe it. It introduces iSLI, a POMDP formulation whose probing policy is trained with PPO on a network that combines a novel Timed Graph Relationformer (TGR) layer with an S5 encoder, yielding permutation-invariant, temporally aware graph representations. The approach achieves strong zero-shot generalization to swarm sizes and speeds not seen during training, and it transfers from simulation to real robot experiments, where it remains resilient to unexpected observation changes. The key contributions are a graph-based formulation of iSLI, a TGR architecture with a learned gating mechanism, and a Bayesian leader estimation strategy that yields reliable uncertainty quantification for the leader prediction. Collectively, the method advances resilient swarm robotics by enabling intelligent adversarial probing to expose and mitigate vulnerabilities in leader-follower dynamics, with practical implications for the security and robustness of multi-agent systems.

Abstract

Identifying the leader within a robotic swarm is crucial, especially in adversarial contexts where leader concealment is necessary for mission success. This work introduces the interactive Swarm Leader Identification (iSLI) problem, in which an adversarial probing agent identifies a swarm's leader by physically interacting with its members. We formulate the iSLI problem as a Partially Observable Markov Decision Process (POMDP) and employ Deep Reinforcement Learning, specifically Proximal Policy Optimization (PPO), to train the prober's policy. The proposed approach uses a novel neural network architecture featuring a Timed Graph Relationformer (TGR) layer combined with a Simplified Structured State Space Sequence (S5) model. The TGR layer processes graph-based observations of the swarm, capturing temporal dependencies and fusing relational information through a learned gating mechanism to generate informative representations for policy learning. Extensive simulations demonstrate that our TGR-based model outperforms baseline graph neural network architectures and exhibits significant zero-shot generalization to swarm sizes and speeds different from those used during training. The trained prober identifies the leader with high accuracy, maintains performance in out-of-distribution scenarios, and shows appropriate confidence levels in its predictions. Real-world experiments with physical robots further validate the approach, confirming successful sim-to-real transfer and robustness to dynamic changes, such as unexpected agent disconnections.
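The summary above mentions a Bayesian leader estimate with calibrated confidence but does not give its form. As a minimal sketch only, assuming a discrete belief over which agent is the leader and a per-step likelihood for each candidate, one Bayes update could look like the following. The function names and the numeric likelihoods are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def update_leader_belief(belief: List[float], likelihoods: List[float]) -> List[float]:
    """One discrete Bayes step: posterior is proportional to prior times likelihood.

    belief[i]      -- current probability that agent i is the leader
    likelihoods[i] -- assumed probability of the latest observation if agent i led
    """
    if len(belief) != len(likelihoods):
        raise ValueError("belief and likelihoods must have the same length")
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    if total <= 0.0:
        # Degenerate evidence: keep the prior instead of dividing by zero.
        return list(belief)
    return [u / total for u in unnormalized]


def most_likely_leader(belief: List[float]) -> Tuple[int, float]:
    """Return the index of the most probable leader and its probability."""
    index = max(range(len(belief)), key=belief.__getitem__)
    return index, belief[index]


# Hypothetical example: four agents, uniform prior, two probing steps.
belief = [0.25, 0.25, 0.25, 0.25]
for step_likelihoods in ([0.2, 0.6, 0.1, 0.1], [0.3, 0.5, 0.1, 0.1]):
    belief = update_leader_belief(belief, step_likelihoods)

leader, confidence = most_likely_leader(belief)
```

The maximum of the belief serves as a confidence score here, so an uninformative probe leaves the estimate close to uniform rather than forcing a guess.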
