Detecting and Diagnosing Faults in Autonomous Robot Swarms with an Artificial Antibody Population Model
James O'Keeffe
TL;DR
The paper tackles fault tolerance in autonomous swarms facing gradual hardware degradation, arguing that conventional methods tuned for sudden faults in a minority of robots are insufficient for long-term autonomy. It introduces the Artificial Antibody Population Dynamics (AAPD) model, an immune-inspired, distributed framework for fault detection, diagnosis, and recovery (FDDR) that encodes robot behavior into paratopes and uses matching and antibody population dynamics to detect faults and diagnose their hardware class. It demonstrates that AAPD can jointly detect gradual degradation and diagnose motor versus sensor faults, achieving high true-positive rates with minimal false positives and maintaining 70–97% of optimum swarm performance across diverse scenarios, including swarms of up to $N=20$ robots, while preventing field failures in most of the cases tested. The work presents a scalable, configurable approach with supervised and unsupervised options, suggests broad applicability to other multi-agent systems and real-world deployments, and outlines directions for future improvements and real-hardware validation.
Abstract
An active approach to fault tolerance, the combined processes of fault detection, diagnosis, and recovery, is essential for long-term autonomy in robots -- particularly multi-robot systems and swarms. Previous efforts have primarily focussed on spontaneously occurring electro-mechanical failures in the sensors and actuators of a minority sub-population of robots. While the systems that enable this function are valuable, they have not yet considered that many failures arise from gradual wear and tear with continued operation, and that this may be more challenging to detect than sudden step changes in performance. This paper presents the Artificial Antibody Population Dynamics (AAPD) model -- an immune-inspired model for the detection and diagnosis of gradual degradation in robot swarms. The AAPD model is demonstrated to reliably detect and diagnose gradual degradation, as well as spontaneous changes in performance, among swarms of robots of varying sizes while remaining tolerant of normally behaving robots. The AAPD model is distributed, offers supervised and unsupervised configurations, and demonstrates promising scalability. Deploying the AAPD model on a swarm of foraging robots undergoing gradual degradation enables the swarm to operate, on average, at between 70% and 97% of its performance in perfect conditions, and prevents robots from failing in the field in most of the cases tested.
