Onboard Mission Replanning for Adaptive Cooperative Multi-Robot Systems

Elim Kwan, Rehman Qureshi, Liam Fletcher, Colin Laganier, Victoria Nockles, Richard Walters

TL;DR

This work defines the Cooperative Mission Replanning Problem (CMRP), a real-world, edge-deployable variant of the multi-TSP that includes flexible start locations, variable task times, and cooperative tasking. It introduces GATR, a Graph Attention Network–based encoder–decoder model trained with REINFORCE to generate per-agent task allocations in an on-board setting. The approach discretizes tasks into sub-tasks and uses an asymmetric cost structure to enable collaboration and fast computation on edge hardware, achieving near-optimal performance compared with LKH3 while being orders of magnitude faster on a Raspberry Pi. The results demonstrate strong generalization across problem sizes and offer a path toward multi-objective, probabilistic, and anticipatory planning for resilient multi-robot missions.

Abstract

Cooperative autonomous robotic systems have significant potential for executing complex multi-task missions across space, air, ground, and maritime domains. But they commonly operate in remote, dynamic and hazardous environments, requiring rapid in-mission adaptation without reliance on fragile or slow communication links to centralised compute. Fast, on-board replanning algorithms are therefore needed to enhance resilience. Reinforcement Learning shows strong promise for efficiently solving mission planning tasks when formulated as Travelling Salesperson Problems (TSPs), but existing methods: 1) are unsuitable for replanning, where agents do not start at a single location; 2) do not allow cooperation between agents; 3) are unable to model tasks with variable durations; or 4) lack practical considerations for on-board deployment. Here we define the Cooperative Mission Replanning Problem as a novel variant of multiple TSP with adaptations to overcome these issues, and develop a new encoder/decoder-based model using Graph Attention Networks and Attention Models to solve it effectively and efficiently. Using a simple example of cooperative drones, we show our replanner consistently (90% of the time) maintains performance within 10% of the state-of-the-art LKH3 heuristic solver, whilst running 85-370 times faster on a Raspberry Pi. This work paves the way for increased resilience in autonomous multi-agent systems.
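The headline claim (within 10% of LKH3 in 90% of scenarios) can be read as a statement about per-scenario relative gaps. The sketch below shows one way such a statistic could be computed. It assumes the gap is defined as the fractional increase in mission time over the reference solver. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
def relative_gap(mission_time, reference_time):
    """Fractional increase in mission time over a reference solver (assumed definition)."""
    if reference_time <= 0:
        raise ValueError("reference_time must be positive")
    return (mission_time - reference_time) / reference_time


def fraction_within(model_times, reference_times, threshold=0.10):
    """Share of scenarios whose relative gap is at most `threshold`."""
    if not model_times or len(model_times) != len(reference_times):
        raise ValueError("inputs must be non-empty and of equal length")
    gaps = [relative_gap(m, r) for m, r in zip(model_times, reference_times)]
    return sum(g <= threshold for g in gaps) / len(gaps)


# Toy mission times, purely illustrative (NOT results from the paper).
model_times = [10.5, 12.0, 9.0, 20.0]      # hypothetical learned replanner
reference_times = [10.0, 10.0, 9.0, 19.0]  # hypothetical reference solver
share = fraction_within(model_times, reference_times)
print(f"{share:.0%} of scenarios are within 10% of the reference")
```

Under this reading, the abstract's result would correspond to `fraction_within(...)` returning roughly 0.9 at `threshold=0.10` on the test set.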

Paper Structure

This paper contains 14 sections, 2 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: (a) A cartoon illustrating the need for in-mission, on-board replanning through three examples that all feature major unforeseen changes and limited ground station communication in dynamic, challenging environments. (b) A cartoon illustrating differences between a more traditional initial planning, non-cooperative problem (left) and our cooperative replanning problem (middle). The right-hand diagram shows how this replanning problem can be formulated in a graph format as a variant of the multiple Travelling Salesperson Problem (mTSP).
  • Figure 2: (a) The RL training process. (b) The transformation of the mission planning data into higher-level node embeddings and a graph embedding using a GAT encoder. (c) The Attention Model decoder generates a probability distribution for available agent-task combinations based on the graph embedding, the current node embedding, the current state of the mission (in embedding space) and the availability mask. The agent-task step with the highest probability is then selected as the next step. (d) The sequential generation of all steps of the mission plan.
  • Figure 3: Comparison of how GATR and other benchmarks (columns) solve an example three-agent-four-task problem for our five increasingly complex problem types (rows). The tours of the three agents are shown in red, amber, and green; the tasks are shown as black circles (with size corresponding to the time cost of each task for the last three rows); and the black square shows the home depot. In the second, fourth and fifth rows, where agent start locations differ from the depot, the start locations are marked with black triangles.
  • Figure 4: (a) Histograms showing the distribution of mission times for our method GATR (top row) and four benchmarks for the test set of 300 scenarios for the three-agent-four-task CMRP. The circled letters represent the five different solutions for the single example scenario shown in the bottom row of Fig. 3. (b) A cumulative frequency histogram showing the percentage of GATR solutions (y-axis) with normalized mission times less than a given value (x-axis). The solid lines represent comparison against the Opt-P benchmark ($\hat{mt}_{\text{Opt-P}}$) for the five progressively complex problem types illustrated in Fig. 3, building up from mTSP (pink) to CMRP, our full problem (dark blue). The dashed dark blue line represents comparison against the NOpt-LKH3 benchmark ($\hat{mt}_{\text{NOpt-LKH3}}$) for CMRP. (c) A graph illustrating the proportion of solutions with $\hat{mt}$ below the 0.01 (1%) threshold for different problem types and numbers of sub-tasks. The triangles represent the four-task problems and match the colors for the same set of experiments shown in (b).
  • Figure 5: (a) Comparison of average mission times (mean across 30,000 test scenarios) achieved by GATR-General (purple lines) and GATR-Specific (dark blue) across varying numbers of tasks, agents and task discretization levels; both closely track NOpt-LKH3 performance (dashed grey lines) and are much lower than those of randomly generated missions (represented by Med-S, solid grey lines). (b) Comparison of the percentage increase in mission time of GATR-General (purple) and GATR-Specific (dark blue) relative to LKH3 as a function of increasing problem size (number of sub-tasks per agent). Colored lines and shaded regions show means and standard deviations respectively, with the average across the whole dataset indicated by colored text.
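
The decoding step described in the Figure 2(c) caption (a probability distribution over available agent-task combinations, followed by selection of the most probable one) can be sketched as a masked softmax with a greedy pick. This is a minimal illustration and not the authors' implementation: the flat agent-major indexing of combinations, the function names, and the example scores are all assumptions, and the scores stand in for whatever the attention decoder would produce.

```python
import math


def masked_softmax(logits, available):
    """Softmax over `logits`, giving zero probability to unavailable entries."""
    masked = [x if ok else -math.inf for x, ok in zip(logits, available)]
    peak = max(masked)
    if peak == -math.inf:
        raise ValueError("no agent-task combination is available")
    # Subtract the peak for numerical stability; exp(-inf) evaluates to 0.0.
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_step(logits, available, n_tasks):
    """Pick the most probable available (agent, task) pair.

    Assumes combinations are flattened agent-major: index = agent * n_tasks + task.
    """
    probs = masked_softmax(logits, available)
    best = max(range(len(probs)), key=probs.__getitem__)
    return divmod(best, n_tasks), probs


# Hypothetical scores for 2 agents x 3 tasks (not from the paper).
logits = [2.0, 5.0, 1.0, 0.5, 3.0, 4.0]
# The highest-scoring combination (agent 0, task 1) is masked out as unavailable.
available = [True, False, True, True, True, True]
step, probs = greedy_step(logits, available, n_tasks=3)
print(step)
```

Repeating this step while updating the mask and the mission state would correspond to the sequential plan generation shown in Figure 2(d). During training with a policy-gradient method, a step would typically be sampled from `probs` instead of taking the maximum.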