Multi-Robot Learning-Informed Task Planning Under Uncertainty

Abhish Khanal; Abhishek Paudel; Hung Pham; Gregory J. Stein

Abstract

We want a multi-robot team to complete complex tasks in minimum time when the locations of task-relevant objects are unknown. Effective task completion requires reasoning over long horizons about the likely locations of task-relevant objects, how individual actions contribute to overall progress, and how to coordinate team efforts. Planning in this setting is extremely challenging: even when task-relevant information is partially known, coordinating which robot performs which action and when is difficult, and uncertainty introduces a multiplicity of possible outcomes for each action, which further complicates long-horizon decision-making and coordination. To address this, we propose a multi-robot planning abstraction that integrates learning to estimate uncertain aspects of the environment with model-based planning for long-horizon coordination. We demonstrate that our approach plans multi-stage tasks more efficiently than competitive baselines for teams of 1, 2, and 3 robots in large ProcTHOR household environments. Additionally, we demonstrate the effectiveness of our approach with a team of two LoCoBot mobile robots in real household settings.

Paper Structure (16 sections, 2 equations, 7 figures, 1 algorithm)

Introduction
Related Work
Preliminaries: Representing Tasks and Monitoring Progress with DFA and SCLTL
Problem Formulation
Methodology
A High-Level Abstraction for Multi-Robot Planning
Planning with our Multi-Robot Abstraction
Learning to Estimate Probabilistic Action Outcomes
Computing Planning Costs for the Team using PO-UCT
Simulation Experiments
Task Specification Templates
Data Generation and Neural Network Training
Planner Evaluation
Results and Discussion
Real-world Robot Experiments
...and 1 more sections

Figures (7)

Figure 1: Planner Comparison in Home Environment: Two robots are tasked with reaching a remote and a pillow whose exact locations are unknown. A myopic approach searches the nearest locations first until the task is completed, leading to poor behavior. Our approach combines learning with a model-based planning framework to guide the robots toward efficient task completion. See further discussion in the Real-world Robot Experiments section.
Figure 2: Overview of our approach: For a joint action $a_t$ specifying which container each robot should travel towards and interact with to find task-relevant objects, the robot team concurrently travels towards the assigned containers until one of the robots reaches and searches a container. The outcome of each joint action transitions the team's belief state based on whether any task-relevant objects were found in the container. $\mathcal{M}_\varphi$ keeps track of the overall task progress and transitions based on which objects have been interacted with and which objects are still needed to complete the task.
Figure 3: Results: Partially known environments. The table shows the average cost accrued by each planner over 400 experiments with randomly selected tasks. Our planner achieves lower cost than both the learned and non-learned baselines. The scatter plots show results for our approach versus the learned and non-learned baselines for 1, 2, and 3 robots. The statistics for all results show the benefits of our learning-augmented model-based planner.
Figure 4: Simulation experiments in small, medium, and large environments for 3 robots. (a) Task: Interact with a pillow, a tabletopdecor, and a book. (b) Task: Interact with either a dishsponge or a toiletpaper, then interact with a plate, then interact with a creditcard. (c) Task: Interact with a desklamp, a faucet, a plate, and a newspaper. In all these tasks, across different environment sizes, our MR planner improves cost over non-learned and learned baselines.
Figure 5: Robots are given order-dependent tasks with strict temporal constraints: (a) interacting with cellphone then toiletpaper, and (b) interacting with fork then bowl. Robots guided by our model-based planner (a) wait to interact with the found object when searching for the remaining objects does not improve task-completion cost, and (b) continue to search for other task-relevant objects when waiting to interact with the found object would increase the task-completion cost. The baselines search for all task-relevant objects and then resolve the temporal dependencies of the task.
...and 2 more figures
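
The joint-action semantics described in the Figure 2 caption, together with the ordered task of Figure 5(a), can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`Outcome`, `joint_action_outcome`, `remaining_travel`, `advance_dfa`, `TRANSITIONS`) is hypothetical, the travel times and probabilities are made-up inputs standing in for path costs and learned estimates, and the PO-UCT search that would choose among joint actions is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    elapsed: float     # time until the first robot reaches its container
    searcher: int      # index of the robot that arrives first and searches
    find_prob: float   # assumed estimate that the searched container holds a needed object


def joint_action_outcome(travel_times, find_probs):
    """All robots move concurrently; the joint action ends when the
    earliest-arriving robot reaches and searches its container."""
    searcher = min(range(len(travel_times)), key=travel_times.__getitem__)
    return Outcome(travel_times[searcher], searcher, find_probs[searcher])


def remaining_travel(travel_times, elapsed):
    """Travel time each robot still needs after `elapsed` time has passed."""
    return [max(0.0, t - elapsed) for t in travel_times]


# Hypothetical DFA for "interact with cellphone, then toiletpaper".
# State 0: nothing done, state 1: cellphone done, state 2: task complete.
TRANSITIONS = {(0, "cellphone"): 1, (1, "toiletpaper"): 2}
ACCEPTING_STATE = 2


def advance_dfa(state, obj, transitions=TRANSITIONS):
    """Advance task progress; interactions that are out of order leave the state unchanged."""
    return transitions.get((state, obj), state)


if __name__ == "__main__":
    outcome = joint_action_outcome([12.0, 5.0], [0.2, 0.7])
    print(outcome)
    print(remaining_travel([12.0, 5.0], outcome.elapsed))
    state = advance_dfa(0, "toiletpaper")   # out of order: no progress
    state = advance_dfa(state, "cellphone")
    state = advance_dfa(state, "toiletpaper")
    print(state == ACCEPTING_STATE)
```

In this simplification, the robot that has not yet arrived keeps its partial progress toward its container, which is why the remaining travel times are carried into the next decision rather than reset.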
