Risk-aware Meta-level Decision Making for Exploration Under Uncertainty

Joshua Ott; Sung-Kyun Kim; Amanda Bouman; Oriana Peltzer; Mamoru Sobue; Harrison Delecki; Mykel J. Kochenderfer; Joel Burdick; Ali-akbar Agha-mohammadi

TL;DR

To address exploration under uncertainty, the authors develop a risk-aware meta-level decision-making framework that selects between local and global policies by maximizing $U(b_t; \pi)$ weighted by a probability of successful execution $\hat{P}(\pi)$, where $\hat{P}(\pi) \propto P_G(\pi) P_{W_r}(\pi) P_{kino}(\pi)$. The approach leverages two Information Roadmaps (global and local) and Receding Horizon Planning to enable real-time switching. They validate the method in simulation and on large-scale hardware (LA Subway, Kentucky Underground Mines) and show significant improvements in exploration efficiency, up to about 1.5× area coverage compared with baselines. The work extends hierarchical coverage planning with risk-aware meta-level control, enabling robust, scalable exploration in unknown, hazardous environments.

Abstract

Robotic exploration of unknown environments is fundamentally a problem of decision making under uncertainty, where the robot must account for uncertainty in sensor measurements, localization, and action execution, among many other factors. For large-scale exploration applications, autonomous systems must overcome the challenges of sequentially deciding which areas of the environment are valuable to explore while safely evaluating the risks associated with obstacles and hazardous terrain. In this work, we propose a risk-aware meta-level decision making framework to balance the tradeoffs associated with local and global exploration. Meta-level decision making builds upon classical hierarchical coverage planners by switching between local and global policies with the overall objective of selecting the policy that is most likely to maximize reward in a stochastic environment. We use information about the environment history, traversability risk, and kinodynamic constraints to reason about the probability of successful policy execution, and we use that probability to switch between local and global policies. We have validated our solution both in simulation and in a variety of large-scale real-world hardware tests. Our results show that by balancing local and global exploration we are able to explore large-scale environments significantly more efficiently.
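The selection rule stated in the TL;DR can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `PolicyEstimate`, the functions `success_probability` and `select_policy`, and all numeric values are hypothetical. The sketch assumes that the three factors $P_G(\pi)$, $P_{W_r}(\pi)$, and $P_{kino}(\pi)$ correspond to the environment history, traversability risk, and kinodynamic terms named in the abstract, and that the proportionality constant in $\hat{P}(\pi)$ is the same for every candidate policy, so it can be dropped without changing which policy is selected.

```python
from dataclasses import dataclass


@dataclass
class PolicyEstimate:
    """Hypothetical container for one candidate policy (local or global)."""
    name: str
    utility: float           # U(b_t; pi), expected coverage reward
    p_history: float         # assumed to play the role of P_G(pi)
    p_traversability: float  # assumed to play the role of P_{W_r}(pi)
    p_kinodynamic: float     # assumed to play the role of P_kino(pi)


def success_probability(policy: PolicyEstimate) -> float:
    """Unnormalized estimate of P_hat(pi) as the product of the three factors."""
    return policy.p_history * policy.p_traversability * policy.p_kinodynamic


def select_policy(candidates: list[PolicyEstimate]) -> PolicyEstimate:
    """Return the candidate maximizing utility weighted by success probability."""
    if not candidates:
        raise ValueError("at least one candidate policy is required")
    return max(candidates, key=lambda p: p.utility * success_probability(p))


# Illustrative numbers only: the local plan has higher raw utility but
# passes through high-risk terrain, so its weighted score is lower.
local = PolicyEstimate("local", utility=10.0, p_history=0.9,
                       p_traversability=0.2, p_kinodynamic=0.9)
global_ = PolicyEstimate("global", utility=6.0, p_history=0.9,
                         p_traversability=0.8, p_kinodynamic=0.9)
chosen = select_policy([local, global_])
print(chosen.name)
```

With these made-up values the weighted scores are 10 × 0.162 = 1.62 for the local policy and 6 × 0.648 = 3.888 for the global policy, so the sketch selects the global one. This mirrors the situation in Figure 4, where a high-reward but high-risk local plan is a worse choice than relocating to a global frontier.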

Paper Structure (11 sections, 8 equations, 6 figures, 1 table, 1 algorithm)

Introduction
Related Work
Problem Formulation
Hierarchical Coverage Planning
Hierarchical Coverage Policy Execution
Risk-aware Meta-level Decision Making
Probability of Successful Policy Execution
Practical Execution Probability Implementation
Results
Conclusion
Acknowledgments

Figures (6)

Figure 1: An example of how meta-level decision making is used to balance the tradeoffs between local and global exploration. Global planning is conducted over the global IRM as shown by the orange spheres (breadcrumbs) connected by green edges. The yellow cubes represent global frontiers. Local planning is conducted over the local IRM (grid represented by gray cubes). In this case, the robot is currently following the global planner to reach the global frontiers shown by the yellow cubes on the left. However, the global frontiers are unreachable (behind the fence), requiring the robot to choose between following the local planner (cyan line) or relocating to a different global goal like those shown in the back right of the image.
Figure 2: Illustration of how meta-level decision making fits into the planning and motion execution pipeline, as well as how information is shared among these modules [kim2021plgrim, peltzer2022fig, fan2021step, fan2021learning]. After solving for the local or global coverage policy, the high-level goals are sent to a low-level motion planner, which plans a higher-resolution path using $A^*$ [hart1968formal]; that path is then sent to the kinodynamic planner [camacho2013model, fan2021step, fan2020deep].
Figure 3: Example of global to local switching from a real world mission during the DARPA Subterranean Challenge. The first row displays the front view from cameras on the robot. The second row shows the local IRM (discretized grid), local coverage planner path (cyan line), and the direction of the robot (shaded yellow sector). The bottom row shows the global IRM (edges are green, frontiers are yellow, and previous robot locations are brown), the global goal (red arrow), and the direction of the robot (shaded yellow sector). The four columns represent four different time instances.
Figure 4: Example highlighting the motivation for local to global switching from a real world test conducted in the LA Subway. This test was run without meta-level decision making onboard to demonstrate the necessity of using the algorithm. Initially, the robot is in an open space as shown in the center and right images of row 1. The robot then enters a cluttered environment that has high reward (since it has not been covered yet) and very high traversability risk (row 2). Obstacles are shown in black, areas of high traversability risk are shown in red, and white indicates open space. By replaying this data with meta-level decision making running, we can see that at this point the algorithm would recommend switching to a global frontier rather than continuing with the risky local plan; however, since the algorithm was not running, the robot continues to explore the constrained environment. The green and red diamonds indicate the points in time where these events would have occurred if the algorithm had been running in real time on the robot. The robot's local plan is indicated by the cyan line.
Figure 5: Results showing the exploration performance with and without the meta-level decision making algorithm in the simulated subway, maze, and cave environments. The covered area is the average of two runs and the bounds denote maximum and minimum values between the runs.
...and 1 more figure
