Bigraph Matching Weighted with Learnt Incentive Function for Multi-Robot Task Allocation

Steve Paul; Nathan Maurer; Souma Chowdhury

TL;DR

This work tackles fast, near-optimal multi-robot task allocation (MRTA) by using Graph Reinforcement Learning to learn the incentives for bipartite graph matching. It introduces BiG-CAM, a graph neural network policy with capsule-based GCAPS encoders and multi-head attention decoders that output LogNormal edge-weight distributions for a bigraph connecting robots and tasks; the policy is trained with PPO. BiG-CAM achieves comparable or superior task completion and improved robustness relative to expert-heuristic and pure RL baselines. During training, the learned incentives first move toward the expert-specified incentive and then deviate slightly from it. The approach combines explainable graph matching with learned heuristics, offers scalable MRTA solutions, and points to future work on dynamic bigraph sizing and scalable matching.

Abstract

Most real-world Multi-Robot Task Allocation (MRTA) problems require fast and efficient decision-making, which is often achieved using heuristics-aided methods such as genetic algorithms, auction-based methods, and bipartite graph matching methods. These methods often take a form that is easier to explain than an end-to-end (learnt) neural-network-based policy for MRTA. However, deriving suitable heuristics can be tedious, risky, and in some cases impractical if problems are too complex. This raises the question: can these heuristics be learned? To this end, this paper develops a Graph Reinforcement Learning (GRL) framework to learn the heuristics or incentives for a bipartite graph matching approach to MRTA. Specifically, a Capsule Attention policy model is used to learn how to weight task/robot pairings (edges) in the bipartite graph that connects the set of tasks to the set of robots. The original capsule attention network architecture is fundamentally modified by adding an encoding of the robots' state graph and two Multihead Attention based decoders, whose outputs are used to construct a LogNormal distribution matrix from which positive bigraph weights can be drawn. The performance of this new bigraph matching approach augmented with a GRL-derived incentive is found to be on par with the original bigraph matching approach that used expert-specified heuristics, with the former offering notable robustness benefits. During training, the learned incentive policy is found to initially move closer to the expert-specified incentive and then deviate slightly from its trend.
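The weighting-then-matching idea described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the `mu` and `sigma` matrices below are hard-coded, hypothetical stand-ins for the decoder outputs, the function names are invented for illustration, and a brute-force search replaces whatever matching algorithm the authors use. It only shows that LogNormal samples give strictly positive edge weights and that a maximum-weight matching then assigns each robot a distinct task.

```python
import itertools
import random


def sample_weights(mu, sigma, rng):
    """Draw one positive weight per (robot, task) edge from LogNormal(mu, sigma).

    mu and sigma are robot-by-task matrices (hypothetical stand-ins for
    the two decoder outputs mentioned in the abstract).
    """
    return [
        [rng.lognormvariate(m, s) for m, s in zip(mu_row, sigma_row)]
        for mu_row, sigma_row in zip(mu, sigma)
    ]


def max_weight_matching(weights):
    """Brute-force maximum-weight bipartite matching.

    Assumes n_robots <= n_tasks and is only suitable for tiny examples;
    each robot is assigned a distinct task.
    """
    n_robots, n_tasks = len(weights), len(weights[0])
    best_total, best_tasks = float("-inf"), None
    for tasks in itertools.permutations(range(n_tasks), n_robots):
        total = sum(weights[r][t] for r, t in enumerate(tasks))
        if total > best_total:
            best_total, best_tasks = total, tasks
    return dict(enumerate(best_tasks)), best_total


# Illustrative values only: 2 robots, 3 tasks.
rng = random.Random(0)
mu = [[0.0, 1.0, 0.5], [1.0, 0.0, 0.5]]
sigma = [[0.1, 0.1, 0.1], [0.1, 0.1, 0.1]]
weights = sample_weights(mu, sigma, rng)
assignment, total = max_weight_matching(weights)
print(assignment, round(total, 3))
```

Because the weights are sampled rather than fixed, repeated draws can yield different matchings; in a reinforcement learning setting this randomness is what allows a policy to explore alternative allocations.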

Paper Structure (19 sections, 3 equations, 6 figures, 1 table)

Introduction
Related Works
MRTA - Collective Transport (MRTA-CT)
MRTA-CT as Optimization Problem
Bipartite Graph for MRTA
MDP over a Graph
Task selection
Incentive (Weight) learning framework
GNN-based feature encoder
Multi-head Attention (MHA) based decoding
BiGraph Weights Modeled as Probability Distributions
Weighted Bigraph Construction
BiG-CAM Policy Training Details
Experimental Evaluation
Baseline Methods
...and 4 more sections

Figures (6)

Figure 1: Bigraph showing robot-task connections. The bigraph weights are written as a matrix.
Figure 2: The overall structure of the BiG-CAM policy.
Figure 3: The overall structure of the GCAPS network. Here, $h$ is the embedding length, and bias terms are omitted for ease of representation.
Figure 4: Structure of the MHA-based decoder.
Figure 5: $\%$ task completion for all the methods. Left plots correspond to scenarios with $s_{r}=1$; right plots correspond to scenarios with $s_{r}=2$.
...and 1 more figures
