Bayesian Online Learning for Human-assisted Target Localization

Min-Won Seo, Solmaz S. Kia

TL;DR

A novel joint Bayesian learning method that fuses human and autonomous sensor inputs such that dynamic changes in human detection reliability are also captured and accounted for.

Abstract

We consider human-assisted autonomy sensor fusion for dynamic target localization in a Bayesian framework. Autonomous sensor-based tracking systems can suffer from observability problems and target detection failures. Humans possess valuable qualitative information derived from their past knowledge and rapid situational awareness that can give them an advantage over machine perception in many scenarios. To compensate for the shortcomings of an autonomous tracking system, we propose to collect spatial sensing information from human operators who visually monitor the target and can provide target localization information in the form of free sketches encircling the area where the target is located. However, human inputs cannot be taken deterministically and trusted absolutely due to their inherent subjectivity and variability. Our focus in this paper is to construct an adaptive probabilistic model for human-provided inputs where the adaptation terms capture the level of reliability of the human inputs. The next contribution of this paper is a novel joint Bayesian learning method that fuses human and autonomous sensor inputs such that dynamic changes in human detection reliability are also captured and accounted for. Unlike deep learning frameworks, a unique aspect of this Bayesian modeling framework is its analytical closed-form update equations. This feature provides computational efficiency and allows for online learning from limited data sets. Simulations demonstrate our results, underscoring the value of human-machine collaboration in autonomous systems.
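To make the idea of a closed-form online update concrete, the following is a minimal sketch of the textbook Beta-Bernoulli conjugate update applied to an operator's reliability. It is not the paper's joint learning rule, whose equations are not reproduced on this page. The class name `BetaReliability` and the boolean `hit` signal (whether a drawing turned out to contain the target) are assumptions made purely for illustration; in the paper's setting the true target position is not directly observed.

```python
from dataclasses import dataclass


@dataclass
class BetaReliability:
    """Hypothetical tracker for one operator's reliability a ~ Beta(alpha, beta)."""
    alpha: float = 1.0  # pseudo-count of correct drawings (uniform prior)
    beta: float = 1.0   # pseudo-count of incorrect drawings

    def update(self, hit: bool) -> None:
        # Conjugate update: a Bernoulli outcome adds one pseudo-count.
        # ASSUMPTION: `hit` is known; the paper instead learns reliability
        # jointly with the target state.
        if hit:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean of the reliability.
        return self.alpha / (self.alpha + self.beta)


rel = BetaReliability()
for hit in [True, True, False, True]:
    rel.update(hit)
print(rel.mean)
```

Each update costs a constant number of operations and needs no gradient steps or training set, which is the kind of efficiency the abstract attributes to analytical update equations.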

Paper Structure

This paper contains 7 sections, 4 theorems, 14 equations, 7 figures, 1 table, 1 algorithm.

Key Result

Lemma V.1

(HMM-based distribution of the target position). Let Assumption assum::cond hold. Consider the finite sample space $\mathbf{x}^i_t \in \mathbf{x}_t$ and $\mathbf{p}^i_t=\mathbf{L}\mathbf{x}^i_t$. Then $q^i_{t|t}$ of eq::Post_Target_Approx is computed by [...] where [...], with the weight parameters $w_u, w_h \in [0,1] \subseteq \mathbb{R}$ satisfying $\sum_{h\in\mathcal{H}} w_h+\sum_{u\in\mathcal{U}} w_u$ [...]

Figures (7)

  • Figure 1: Examples of a machine learning-based object detection algorithm (YOLO [farhadi2018yolov3]) failing to detect a target (a person, left) or misdetecting one (a drone, right). (Left) The initially correct detection is lost due to intense sun glare. (Right) Detection falters due to a complex background and insufficient training data. In both examples, a human can readily identify the targets, providing valuable assistance to tracking systems.
  • Figure 2: A representative scenario of the problem of interest in this paper: a target, depicted by $\bullet$, moves in a 2D space. UAVs equipped with a stereo vision camera transmit relative measurements to a centralized fusion center and image data to human operators. The human operators draw regions enclosing the target on the image data via a touch-screen monitoring system (e.g., a tablet). The human and autonomous sensor data are fused in the centralized system to localize the target.
  • Figure 3: $\mathbf{X}_t \in \mathbb{F}_2^{N_p \times 1}$ represents a multivariate random variable at every time step. Each element $s_i$ of $\mathbf{X}_t$ is mapped to the particle $l_i$ in the 2D sample space $\mathbf{L} \in \mathbb{R}^{2 \times N_p}$. A human-drawing observation $\mathbf{O}_t^h \in \mathbb{F}_2^{N_p \times 1}$ marks the particles inside the region that the operator believes contains the target. In this example, two human operators each draw a region. Unlike $\mathbf{X}_t$, which has only a single $s_i = 1$, $\mathbf{O}^h_t$ can have multiple $\zeta_j = 1$.
  • Figure 4: Probabilistic graphical model for human-assisted autonomy sensor fusion. $\mathbf{p}_t$ is the position of the target at time $t$. Autonomous sensor observations $\mathbf{o}_t^u \in \mathbf{O}_t^u$ are obtained from mobile agents $u \in \mathcal{U}$ at time $t$. Human-drawing observations $\mathbf{o}_t^h \in \mathbf{O}_t^h$ are provided by human operators $h \in \mathcal{H}$ at time $t$. At time index $t$, human $h$'s detection reliability for drawing observations is represented as $a^h_t \sim \mathsf{Beta}(\alpha_t^h,\beta_t^h)$. Different values of $\alpha_t^h$ and $\beta_t^h$ capture different levels of reliability.
  • Figure 5: A high-level flow diagram of the proposed framework.
  • ...and 2 more figures
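
The representation described in the Figure 3 caption can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the 5 by 5 particle grid, the square sketch, and the helper `inside_polygon` (a plain ray-casting test) are all assumptions introduced here.

```python
import numpy as np


def inside_polygon(points: np.ndarray, polygon: np.ndarray) -> np.ndarray:
    """Ray-casting test. points: (2, N); polygon: (M, 2) vertices. Returns bool (N,)."""
    x, y = points
    inside = np.zeros(points.shape[1], dtype=bool)
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        if y1 == y2:
            continue  # horizontal edges never cross a horizontal ray
        crosses = (y1 > y) != (y2 > y)
        x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
        inside ^= crosses & (x < x_cross)  # toggle on each crossing to the right
    return inside


# Sample space L in R^{2 x N_p}: a hypothetical 5 x 5 grid of particles.
xs, ys = np.meshgrid(np.arange(5.0), np.arange(5.0))
L = np.vstack([xs.ravel(), ys.ravel()])
N_p = L.shape[1]

# X_t is one-hot: exactly one s_i = 1 marks the target's particle.
X_t = np.zeros(N_p, dtype=np.uint8)
X_t[12] = 1
p_t = L @ X_t  # target position p_t = L x_t

# A human sketch (here a square, as an assumed stand-in for a free-hand
# drawing) becomes a binary vector O_t^h with possibly many zeta_j = 1.
sketch = np.array([[0.5, 0.5], [3.5, 0.5], [3.5, 3.5], [0.5, 3.5]])
O_h = inside_polygon(L, sketch).astype(np.uint8)

target_enclosed = bool(O_h @ X_t)  # True when the drawing contains the target
print(p_t, int(O_h.sum()), target_enclosed)
```

Encoding both the state and the drawing over the same particle set reduces the check "does the drawing contain the target" to an inner product, which is convenient for the per-particle posterior $q^i_{t|t}$ referenced in Lemma V.1.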

Theorems & Definitions (9)

  • Lemma V.1
  • proof
  • Lemma V.2
  • proof
  • Theorem V.1
  • proof
  • Remark V.1
  • Theorem V.2
  • proof