Towards Physically Talented Aerial Robots with Tactically Smart Swarm Behavior thereof: An Efficient Co-design Approach

Prajit KrisshnaKumar, Steve Paul, Hemanth Manjunatha, Mary Corra, Ehsan Esfahani, Souma Chowdhury

TL;DR

This paper explores how morphology impacts the learned tactical behavior of unmanned aerial/ground robots performing reconnaissance and search & rescue. It presents a computationally efficient framework for the otherwise challenging problem of jointly optimizing the morphology and tactical behavior of swarm robots.

Abstract

The collective performance or capacity of collaborative autonomous systems, such as a swarm of robots, is jointly influenced by the morphology and the behavior of the individual systems in that collective. In that context, this paper explores how morphology impacts the learned tactical behavior of unmanned aerial/ground robots performing reconnaissance and search & rescue. To this end, it presents a computationally efficient framework for the otherwise challenging problem of jointly optimizing the morphology and tactical behavior of swarm robots. Key novel developments include the use of physical talent metrics and the modification of graph reinforcement learning architectures to allow joint learning of the swarm tactical policy and the talent metrics (search speed, flight range, and cruising speed) that constrain the mobility and object/victim search capabilities of the aerial robots executing these tactics. Implementation of this co-design approach is supported by advancements to an open-source PyBullet-based swarm simulator that allows the use of variable aerial asset capabilities. In terms of mission performance metrics, the co-design results are observed to outperform those of tactics learning with a fixed Pareto design. Significant differences in morphology and learned behavior are also observed when comparing the baseline design with the co-design outcomes.

Paper Structure (30 sections, 14 equations, 9 figures, 2 tables, 1 algorithm)

Figures (9)

  • Figure 1: Flowchart of our co-design framework: a) the morphology and its dependent talent parameters are derived; b) a Pareto boundary is created based on the talents; c) the Talent-infused Actor-critic method is used to train the associated behavior and talents; d) the morphology is finalized for the optimized talents.
  • Figure 2: The GCAPCN-based encoder. The node properties undergo a linear transformation first, followed by multiple graph capsule layers.
  • Figure 3: The MHA-based decoder. The input to the decoder includes the node embeddings and the context, and the output is the computed logits.
  • Figure 4: Structure of the overall policy model consisting of the GCAPCN encoders, Context, MHA-based decoders, Talent bias network, and the custom probability distribution.
  • Figure 5: Talent Pareto front represented by Polynomial Regression applied to computed Pareto solutions obtained by multi-objective optimization of Talents; limits of talents captured with quantile regression.
  • ...and 4 more figures
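
The Figure 5 caption describes a talent Pareto front represented by polynomial regression over computed Pareto solutions. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes two normalized talents that are both maximized (the names `flight_range` and `cruising_speed` are used here purely for illustration), a synthetic trade-off curve in place of real multi-objective optimization output, and a degree-2 fit. The function names `pareto_mask` and `fit_front` are hypothetical, and the quantile-regression limits mentioned in the caption are omitted.

```python
import numpy as np


def pareto_mask(points):
    """Return a boolean mask of non-dominated rows, assuming every column is maximized."""
    mask = np.ones(points.shape[0], dtype=bool)
    for i in range(points.shape[0]):
        # A row dominates row i if it is >= in every column and > in at least one.
        dominates_i = np.all(points >= points[i], axis=1) & np.any(points > points[i], axis=1)
        if dominates_i.any():
            mask[i] = False
    return mask


def fit_front(points, degree=2):
    """Keep the non-dominated points and fit talent 2 as a polynomial in talent 1."""
    front = points[pareto_mask(points)]
    coeffs = np.polyfit(front[:, 0], front[:, 1], degree)
    return front, coeffs


# Synthetic data (assumption): an ideal trade-off curve plus strictly worse designs.
flight_range = np.linspace(0.0, 1.0, 11)
cruising_speed = 1.0 - flight_range**2
ideal = np.column_stack([flight_range, cruising_speed])
worse = np.column_stack([0.9 * flight_range, 0.5 * cruising_speed])
points = np.vstack([ideal, worse])

front, coeffs = fit_front(points)
print(front.shape)          # number of non-dominated designs and talents
print(np.round(coeffs, 6))  # polynomial coefficients, highest degree first
```

A fitted curve of this kind gives a smooth, cheap-to-evaluate description of which talent combinations are attainable, which is one plausible reason to prefer it over storing the raw Pareto solutions.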