
Detecting and Diagnosing Faults in Autonomous Robot Swarms with an Artificial Antibody Population Model

James O'Keeffe

TL;DR

The paper tackles fault tolerance in autonomous robot swarms facing gradual hardware degradation, arguing that conventional methods tuned for sudden faults in a minority of robots are insufficient for long-term autonomy. It introduces the Artificial Antibody Population Dynamics (AAPD) model, an immune-inspired, distributed framework for fault detection, diagnosis, and recovery (FDDR) that encodes robot behavior into paratopes and uses matching and antibody population dynamics to detect faults and diagnose their hardware class. The paper demonstrates that AAPD can jointly detect gradual degradation and diagnose motor versus sensor faults, achieving high true-positive rates with few false positives and maintaining 70–97% of optimum swarm performance across diverse scenarios, including swarms of up to $N=20$ robots, while preventing field failures in most of the cases tested. The approach is scalable and configurable, with supervised and unsupervised options. The paper suggests it may apply to other multi-agent systems and real-world deployments, and outlines directions for future improvement and validation on real hardware.

Abstract

An active approach to fault tolerance, the combined processes of fault detection, diagnosis, and recovery, is essential for long-term autonomy in robots -- particularly multi-robot systems and swarms. Previous efforts have primarily focussed on spontaneously occurring electro-mechanical failures in the sensors and actuators of a minority sub-population of robots. While the systems that enable this function are valuable, they have not yet considered that many failures arise from gradual wear and tear with continued operation, and that this may be more challenging to detect than sudden step changes in performance. This paper presents the Artificial Antibody Population Dynamics (AAPD) model -- an immune-inspired model for the detection and diagnosis of gradual degradation in robot swarms. The AAPD model is demonstrated to reliably detect and diagnose gradual degradation, as well as spontaneous changes in performance, among swarms of robots of varying sizes while remaining tolerant of normally behaving robots. The AAPD model is distributed, offers supervised and unsupervised configurations, and demonstrates promising scalability. Deploying the AAPD model on a swarm of foraging robots undergoing gradual degradation enables the swarm to operate on average at between 70% and 97% of its performance in perfect conditions, and prevents robots from failing in the field during experiments in most of the cases tested.

Paper Structure

This paper contains 4 sections, 6 equations, 14 figures, 3 tables, 2 algorithms.

Figures (14)

  • Figure 1: The AAPD model runs on a TurtleBot3 SRS (robots $R_{1-3}$ shown). Temporal samples of robot state and sensor data (linear velocity shown) are encoded in artificial antibodies, for which each robot has its own repertoire (${X}_{1-3}$). $R_1$'s artificial antibody $x_b$ has a high matching specificity, $m$, with antibodies $x_a$ and $x_c$ according to Eq. \ref{equ:ok_spec}, resulting in stimulation of population $x_b$. The same is true for $R_2$'s antibody $x_e$ with $x_d$ and $x_f$. The high matching specificity between the artificial antibodies of $R_1$ and $R_2$ ($x_b$ with $x_d$, $x_e$, and $x_f$, and $x_e$ with $x_a$, $x_b$, and $x_c$) results in the mutual suppression of the populations of $x_b$ and $x_e$, such that they are tolerated as normal. $R_3$'s artificial antibody $x_h$ has a high matching specificity with artificial antibodies $x_g$ and $x_i$, resulting in stimulation of the population of $x_h$. However, $x_h$ has a low matching specificity with the antibodies of $R_1$ and $R_2$ because of the high residuals when they are convolved with Eq. \ref{equ:ok_spec}, meaning that the population of $x_h$ is not suppressed. The population of $x_h$ is further stimulated by its high matching specificity with paratope $y_2$, contained in $Y$, taking it over the threshold $f$ for detection of $R_3$ as faulty.
  • Figure 2: Experimental setup for 10 robots performing Algorithm \ref{alg_fo} or Algorithm \ref{alg_fo_2} in an enclosed empty environment (A) or constrained environment (B). Resource nests are indicated by the three grey circles opposite the robots. The highlighted green area indicates the robot base.
  • Figure 3: Velocity and power consumption of left or right wheels, $v_{l,r}$ and $\Delta P_{l,r}$, respectively, normalised and plotted against degradation severity coefficients $d_{l,r}$ according to Eq. \ref{equ:velocity} and Eq. \ref{equ:power_wheels} (left plot). Robot sensing range $r$ plotted against degradation severity coefficient $d_S$ according to Eq. \ref{equ:range_2} (right plot).
  • Figure 4: The construction of artificial antibody paratopes for robot motors, $p_M$, and sensors, $p_S$. $p_M$ is a 3-dimensional paratope consisting of 5-second recordings ($l = 30$) of robot linear velocity, $v$, angular velocity, $\omega$, and rate of power consumption, $\Delta P$. $p_S$ is a 1-dimensional paratope consisting of a 5-second recording of $\gamma$.
  • Figure 5: The resources collected and power consumed in 15 minutes by an SRS of $N = 10$ robots performing Algorithm \ref{alg_fo}. Maintenance is scheduled when a robot has any value $d_{l,r,S} < d_0$. The data presented are normalised to a common y axis.
  • ...and 9 more figures
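
The paratope construction described in the Figure 4 caption can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that $l = 30$ is the number of samples in each 5-second recording (which would imply roughly 6 samples per second), and the names `ParatopeRecorder` and `mean_abs_residual` are hypothetical. The residual function is only a stand-in to show that differently behaving robots produce larger residuals; it is not the paper's matching specificity equation, which is not reproduced here.

```python
from collections import deque

SAMPLES_PER_PARATOPE = 30  # l = 30 from the Figure 4 caption (assumed to be a sample count)
WINDOW_SECONDS = 5.0       # 5-second recording from the Figure 4 caption


class ParatopeRecorder:
    """Hypothetical helper: keeps the most recent l samples of each channel."""

    def __init__(self, channels, length=SAMPLES_PER_PARATOPE):
        self.channels = tuple(channels)
        self.length = length
        self._buffers = {name: deque(maxlen=length) for name in self.channels}

    def record(self, **sample):
        """Append one time step; expects one value per channel."""
        for name in self.channels:
            self._buffers[name].append(float(sample[name]))

    def ready(self):
        """True once every channel holds a full window of l samples."""
        return all(len(buf) == self.length for buf in self._buffers.values())

    def paratope(self):
        """Return the window as a list of rows, one row per channel."""
        if not self.ready():
            raise ValueError("fewer than l samples recorded")
        return [list(self._buffers[name]) for name in self.channels]


def mean_abs_residual(p, q):
    """Stand-in comparison (an assumption): mean absolute difference of two paratopes."""
    total, count = 0.0, 0
    for row_p, row_q in zip(p, q):
        for a, b in zip(row_p, row_q):
            total += abs(a - b)
            count += 1
    return total / count


# p_M: linear velocity v, angular velocity omega, rate of power consumption dP.
healthy = ParatopeRecorder(["v", "omega", "dP"])
degraded = ParatopeRecorder(["v", "omega", "dP"])
# p_S: a single channel, gamma.
sensor = ParatopeRecorder(["gamma"])

for _ in range(40):  # more than l steps, so only the latest 30 are kept
    healthy.record(v=0.2, omega=0.0, dP=1.5)
    degraded.record(v=0.1, omega=0.0, dP=1.5)  # made-up values for a slower robot
    sensor.record(gamma=1.0)

p_M = healthy.paratope()
p_M_degraded = degraded.paratope()
p_S = sensor.paratope()
```

Here `p_M` has 3 rows of 30 samples and `p_S` has 1 row of 30 samples, matching the dimensions given in the caption. The numeric values are invented for illustration only.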