Probabilistically Informed Robot Object Search with Multiple Regions

Matthew Collins, Jared J. Beard, Nicholas Ohi, Yu Gu

TL;DR

We address autonomous search under uncertainty in large, unstructured environments by formulating the problem as a belief MDP with options (BMDP-O), which lets Monte Carlo tree search (MCTS) scale through multi-step moves between regions and supports configurable sensor fields of view (FOVs). The approach includes a lite MDP-O variant that avoids belief updates for real-time operation, and it segments the map into regions of interest (ROI) to focus planning on high-probability areas. Experimental results in a 200×200 grid environment show that ROI-enabled PUCT planners outperform the baselines: Regions Lite offers substantial compute reductions at the cost of more search steps, and multi-cell FOVs improve search efficiency overall. The method extends to arbitrary FOVs and sensor types, enabling scalable, real-time search in hazardous settings and providing a pathway toward more adaptive, sensor-agnostic robotic search systems.
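To make the idea of segmenting a map into regions of interest concrete, here is a minimal sketch. It is not the authors' segmentation procedure, which this summary does not describe. It assumes that a region is a 4-connected group of cells whose prior probability meets a hypothetical `threshold`, and the function name `segment_regions` is illustrative only.

```python
from collections import deque


def segment_regions(prior, threshold):
    """Group 4-connected cells with prior >= threshold into regions.

    prior: list of rows, each a list of per-cell prior probabilities.
    threshold: hypothetical cut-off; the paper's actual criterion is not given here.
    Returns a list of regions, each a list of (row, col) cells.
    """
    n_rows, n_cols = len(prior), len(prior[0])
    seen = set()
    regions = []
    for r in range(n_rows):
        for c in range(n_cols):
            if prior[r][c] < threshold or (r, c) in seen:
                continue
            # Breadth-first flood fill from this unvisited high-probability cell.
            queue = deque([(r, c)])
            seen.add((r, c))
            region = []
            while queue:
                cr, cc = queue.popleft()
                region.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    in_bounds = 0 <= nr < n_rows and 0 <= nc < n_cols
                    if in_bounds and (nr, nc) not in seen and prior[nr][nc] >= threshold:
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            regions.append(region)
    return regions


example_prior = [
    [0.3, 0.3, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.4],
]
example_regions = segment_regions(example_prior, threshold=0.2)
```

In this toy grid the two adjacent high-probability cells in the top row form one region and the isolated bottom-right cell forms another. A planner could then treat "travel to region k" as a single option rather than a long chain of single-cell moves.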

Abstract

The increasing use of autonomous robot systems in hazardous environments underscores the need for efficient search and rescue operations. Despite significant advancements, the existing literature on object search often falls short in handling long planning horizons and sensor limitations such as noise. This study introduces a novel approach that formulates the search problem as a belief Markov decision process with options (BMDP-O), making Monte Carlo tree search (MCTS) a viable tool for overcoming these challenges in large-scale environments. The proposed formulation incorporates sequences of actions (options) for moving between regions of interest, which allows the algorithm to scale efficiently to large environments. The approach also supports customizable fields of view, so it can be used with multiple types of sensors. Experimental results demonstrate the superiority of this approach in large environments when compared with the same problem without options and with alternative tools such as receding horizon planners. Because the compute time of the proposed formulation is relatively high, a further approximated "lite" formulation is also proposed. The lite formulation finds objects in a comparable number of steps with faster computation.
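As an illustration of what maintaining a belief under sensor noise involves, the sketch below applies Bayes' rule to a grid belief after the sensor reports no detection. The sensor model is an assumption, not taken from the paper: the sensor can miss the object with probability `1 - p_detect` and never produces false positives. The function name `update_belief_after_miss` is hypothetical.

```python
def update_belief_after_miss(belief, observed_cells, p_detect):
    """Bayes update of a grid belief after a negative observation.

    belief: dict mapping (row, col) -> probability the object is in that cell.
    observed_cells: cells covered by the sensor on this step.
    p_detect: assumed probability of detecting the object when it is in view
              (hypothetical model: false negatives only, no false positives).
    """
    observed = set(observed_cells)
    posterior = {}
    for cell, p in belief.items():
        # A miss is less likely if the object was actually inside the observed cells.
        likelihood = (1.0 - p_detect) if cell in observed else 1.0
        posterior[cell] = p * likelihood
    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {cell: p / total for cell, p in posterior.items()}


# Uniform prior over a 2x2 grid; the robot looks at cell (0, 0) and sees nothing.
uniform = {(r, c): 0.25 for r in range(2) for c in range(2)}
updated = update_belief_after_miss(uniform, [(0, 0)], p_detect=0.8)
```

After the miss, probability mass shifts from the observed cell to the unobserved ones: cell (0, 0) drops from 0.25 to 0.0625 and each other cell rises to 0.3125. Under this reading, a "lite" planner would skip this update inside the tree search, which is where the compute savings mentioned above would come from.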

Paper Structure

This paper contains 16 sections, 6 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: This figure illustrates the planning process for a robot that is searching for an object. The shaded areas represent regions of interest, having a high probability of containing the object. The robot is equipped with the proposed search method that enables it to execute single-cell movements (orange arrows) and travel to regions through a series of actions called options (green arrows).
  • Figure 2: The robot (white circle) within the 2D grid environment. The prior probability of the object (red diamond) being in a given cell is indicated by the cell's color: dark blue cells represent low probability and yellow cells represent higher probability, following the colorbar to the right of the figure. Regions of interest are outlined in green.
  • Figure 3: Number of steps taken to find the target object per trial within a 200-by-200 grid environment. Each bar shows the 5th, 25th, 50th (median), 75th, and 95th percentiles. PUCT consistently outperforms DPS and greedy search. PUCT Regions Lite and PUCT Regions have comparable search times, and in most trials both do better than the full model without options.
  • Figure 4: The different fields of view used in the simulations. With the robot in the center cell and facing up, the red cells are observable and the blue cells are unobservable. From left to right, the FOVs are denoted as point, donut, and forward-facing wide-angle camera observations.
  • Figure 5: Number of steps taken to find the target object per trial within a 200-by-200 grid environment. Each bar corresponds to a different field of view used with MCTS Regions Lite. The corresponding fields of view are shown in Fig. 4. As the number of observable cells increases, search time decreases, though with diminishing returns.
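
The customizable fields of view described in the Figure 4 caption can be sketched as sets of cell offsets relative to the robot. The offsets below are assumptions chosen for illustration, since the exact cell patterns in the figure are not recoverable from the caption. The names `POINT`, `DONUT`, `WIDE`, and `observable_cells` are likewise hypothetical.

```python
# Offsets are (d_row, d_col) for a robot facing "up" (toward decreasing row index).
# These shapes are illustrative guesses, not the exact patterns from the paper.
POINT = {(0, 0)}
DONUT = {(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)}
WIDE = {(-1, -1), (-1, 0), (-1, 1), (-2, -2), (-2, -1), (-2, 0), (-2, 1), (-2, 2)}


def rotate_clockwise(offset, quarter_turns):
    """Rotate a (d_row, d_col) offset clockwise by 90 degrees per quarter turn."""
    dr, dc = offset
    for _ in range(quarter_turns % 4):
        dr, dc = dc, -dr
    return dr, dc


def observable_cells(robot, quarter_turns, fov, n_rows, n_cols):
    """Return the in-bounds grid cells covered by `fov` for the given pose.

    robot: (row, col) position; quarter_turns: heading as clockwise quarter
    turns from "up" (0 = up, 1 = right, 2 = down, 3 = left).
    """
    row, col = robot
    cells = set()
    for offset in fov:
        dr, dc = rotate_clockwise(offset, quarter_turns)
        r, c = row + dr, col + dc
        if 0 <= r < n_rows and 0 <= c < n_cols:
            cells.add((r, c))
    return cells


corner_view = observable_cells((0, 0), 0, DONUT, 200, 200)
right_view = observable_cells((5, 5), 1, WIDE, 200, 200)
```

Representing a field of view as a plain set of offsets keeps the planner independent of the sensor: swapping sensors only changes which set is passed in. At a grid corner the donut pattern is clipped to the three in-bounds neighbors, and rotating the wide pattern by one quarter turn points it toward increasing column index.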