Inverse Risk-sensitive Multi-Robot Task Allocation
Guangyao Shi, Gaurav S. Sukhatme
TL;DR
This work introduces Inverse Risk-sensitive Multi-Robot Task Allocation (IR-MRTA), a framework for adjusting the parameters of a human perceptual risk model so that greedy MRTA decisions align with human suggestions. It formulates the forward risk-sensitive MRTA problem with a parameterized behavioral constraint and linearizes that constraint via a log transform, yielding a knapsack-like constraint and a simple greedy solver based on the score $\frac{r_{ij}}{\beta(-\log p_{ij})^{\alpha}}$. The authors then define the inverse problem: identify new parameters $(\hat{\alpha},\hat{\beta},\hat{\delta})$ that deviate minimally from the current ones while making the greedy procedure reproduce a given allocation. They develop Branch & Bound methods for both the ordered and the general suggestion cases. They validate IR-MRTA and the Branch & Bound algorithms on a multi-robot target capture scenario, showing substantial improvements in running time and memory over brute-force enumeration and demonstrating the feasibility of using human input to steer risk-aware task allocations. Overall, the work enables minimal, principled parameter updates that incorporate human preferences into risk-sensitive, combinatorial MRTA without redesigning the optimization framework.
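The greedy score above can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes each robot-task pair $(i,j)$ has a reward $r_{ij}$ and a success probability $p_{ij} \in (0,1)$, that the knapsack-like constraint charges each selected pair a weight $\beta(-\log p_{ij})^{\alpha}$ against a fixed budget, and that pairs which no longer fit are skipped. The function name `greedy_allocate`, the `budget` argument, and the example numbers are hypothetical, and any matching constraints (for example, one task per robot) are omitted.

```python
import math


def greedy_allocate(r, p, alpha, beta, budget):
    """Select (robot, task) pairs in decreasing order of r_ij / (beta * (-log p_ij)^alpha).

    Assumptions (not taken from the paper): every p[i][j] lies strictly in (0, 1),
    each selected pair consumes beta * (-log p_ij)^alpha of a shared budget, and a
    pair that does not fit in the remaining budget is skipped.
    """
    candidates = []
    for i, row in enumerate(r):
        for j, reward in enumerate(row):
            weight = beta * (-math.log(p[i][j])) ** alpha  # knapsack-like weight
            candidates.append((reward / weight, weight, i, j))

    # Highest score first.
    candidates.sort(key=lambda c: c[0], reverse=True)

    chosen, used = [], 0.0
    for _score, weight, i, j in candidates:
        if used + weight <= budget:
            chosen.append((i, j))
            used += weight
    return chosen


# Hypothetical data: 2 robots, 2 tasks.
rewards = [[4.0, 1.0], [2.0, 3.0]]
probs = [[0.9, 0.5], [0.8, 0.6]]
budget = -math.log(0.5)  # assumed budget, chosen only for illustration

allocation_a = greedy_allocate(rewards, probs, alpha=1.0, beta=1.0, budget=budget)
allocation_b = greedy_allocate(rewards, probs, alpha=2.0, beta=1.0, budget=budget)
print(allocation_a)
print(allocation_b)
```

In this toy instance, changing only $\alpha$ changes which pairs the greedy procedure selects. The inverse problem exploits this dependence: it searches for parameters close to the current ones under which the greedy output matches a human-suggested allocation.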
Abstract
We consider a new variant of the multi-robot task allocation problem: Inverse Risk-sensitive Multi-Robot Task Allocation (IR-MRTA). "Forward" MRTA, the process of deciding which robot should perform a task given the reward (cost)-related parameters, is widely studied in the multi-robot literature. In this setting, the reward (cost)-related parameters are assumed to be known in advance: domain experts fix them offline, and the robots are then coordinated online. What if these parameters need to be adjusted by non-expert human supervisors who oversee the robots during tasks and must adapt to new situations? We are interested in the case where the human supervisor's perception of the allocation risk may change, leading the supervisor to suggest allocations that differ from those produced by the MRTA algorithm. In such cases, the robots need to change the parameters of the allocation problem based on evolving human preferences. We study such problems through the lens of inverse task allocation, i.e., the process of finding parameters given solutions to the problem. Specifically, we propose a new formulation, IR-MRTA, in which we aim to find a new set of parameters of the human behavioral risk model that deviates minimally from the current MRTA parameters and makes a greedy task allocation algorithm allocate robot resources in line with the human's suggestions. We show that even in the simple case such a problem is a non-convex optimization problem. We propose a Branch & Bound algorithm (BB-IR-MRTA) to solve such problems. In numerical simulations of a case study on multi-robot target capture, we demonstrate how to use BB-IR-MRTA and show that the proposed algorithm achieves significant advantages in running time and peak memory usage compared to a brute-force baseline.
