Observation-Augmented Contextual Multi-Armed Bandits for Robotic Search and Exploration

Shohei Wakayama, Nisar Ahmed

TL;DR

This paper extends contextual multi-armed bandits by introducing observation-augmented CMABs (OA-CMABs) that fuse external semantic observations with context to infer hidden parameters, while accounting for data validity through probabilistic semantic data association (PSDA). It develops a robust Bayesian PSDA measurement update and a generalized expected free energy (EFE) formulation for active inference that handles mixture priors and non-Gaussian likelihoods, enabling robust option selection under uncertain external data. A comprehensive simulation in a deep-space asynchronous search scenario demonstrates reduced cumulative regret and accurate parameter inference even when external observations are faulty. The work provides a practical framework for human-robot collaboration in exploration tasks and advances the integration of PSDA with active inference in bandit-like decision making.

Abstract

We introduce a new variant of contextual multi-armed bandits (CMABs) called observation-augmented CMABs (OA-CMABs) wherein a robot uses extra outcome observations from an external information source, e.g. humans. In OA-CMABs, external observations are a function of context features and thus provide evidence on top of observed option outcomes to infer hidden parameters. However, if external data is error-prone, measures must be taken to preserve the correctness of inference. To this end, we derive a robust Bayesian inference process for OA-CMABs based on recently developed probabilistic semantic data association techniques, which handle complex mixture model parameter priors and hybrid discrete-continuous observation likelihoods for semantic external data sources. To cope with combined uncertainties in OA-CMABs, we also derive a new active inference algorithm for optimal option selection based on approximate expected free energy minimization. This generalizes prior work on CMAB active inference by accounting for faulty observations and non-Gaussian distributions. Results for a simulated deep space search site selection problem show that, even if incorrect semantic observations are provided externally, e.g. by scientists, efficient decision-making and robust parameter inference are still achieved in a wide variety of conditions.
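The idea of keeping inference correct when external data may be faulty can be illustrated with a deliberately simplified sketch. It is not the paper's PSDA update: it assumes a single scalar hidden parameter `theta` (a hypothetical success probability for one option), a binary external observation, and a known probability `p_false` that the observation is invalid and therefore uninformative. The function name `robust_update` and all numerical values are assumptions chosen for illustration. The update marginalizes over the two association hypotheses (valid or invalid), which turns the likelihood into a mixture.

```python
import numpy as np

def robust_update(prior, grid, obs, p_false=0.2):
    """Grid-based Bayesian update that marginalizes over observation validity.

    prior   : probabilities over `grid`, summing to 1
    grid    : candidate values of the hidden parameter theta in [0, 1]
    obs     : binary external observation (1 = "success reported")
    p_false : assumed probability that the observation is invalid
    """
    # Hypothesis 1: the observation is valid, so it is Bernoulli(theta).
    valid_lik = grid if obs == 1 else 1.0 - grid
    # Hypothesis 2: the observation is invalid; assumed uniform over {0, 1}.
    invalid_lik = 0.5
    # Mixture likelihood after summing out the association hypothesis.
    mixed_lik = (1.0 - p_false) * valid_lik + p_false * invalid_lik
    posterior = prior * mixed_lik
    return posterior / posterior.sum()

grid = np.linspace(0.0, 1.0, 101)
prior = np.full(grid.size, 1.0 / grid.size)  # flat prior over theta

post_trusting = robust_update(prior, grid, obs=1, p_false=0.0)
post_robust = robust_update(prior, grid, obs=1, p_false=0.4)
post_ignored = robust_update(prior, grid, obs=1, p_false=1.0)

mean_trusting = float(grid @ post_trusting)
mean_robust = float(grid @ post_robust)
```

Under these assumptions, a larger `p_false` pulls the posterior mean less far from the prior, and `p_false = 1.0` leaves the prior unchanged. The paper's setting is richer (mixture priors and hybrid discrete-continuous semantic likelihoods), so this only shows the marginalization step in its simplest form.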

Paper Structure (18 sections, 17 equations, 9 figures, 2 algorithms)

Figures (9)

  • Figure 1: Asynchronous collaborative coupled decision-making and state estimation scenario: a deep space robotic lander iteratively selects the best search site (left) and updates the estimates with its own sensor measurements and delayed (possibly erroneous) scientist observations (right).
  • Figure 2: A PGM for OA-CMABs with DA; observable/latent variables are highlighted in yellow/gray. $(\cdot)_{E}$ and $(\cdot)_{I}$ represent external and internal observations/actions. Context vectors $\vec{x}_c$ and $\vec{x}_k$ are summarized as $\vec{x}$. Dotted lines indicate causality of option selection.
  • Figure 3: Asynchronous lander and scientist communication.
  • Figure 4: Cumulative regrets when human semantic observations are always correct, i.e. $FP\!=\!0$.
  • Figure 5: Comparison of typical transition of selected search sites. Black triangles indicate fusion of external outcome observations $o_E$.
  • ...and 4 more figures