A New Architecture for Neural Enhanced Multiobject Tracking

Shaoxiu Wei; Mingchao Liang; Florian Meyer

TL;DR

NEBP+ advances multiobject tracking by integrating a neural network that processes raw LiDAR data with a belief-propagation–based MOT framework. The neural architecture computes affinity and false-alarm cues from motion, size, shape, and BEV features, and fuses them with BP messages to enhance data association and new object initialization. Evaluated on nuScenes LiDAR data, NEBP+ delivers state-of-the-art performance among LiDAR-only trackers, validating the efficacy of a neural-enhanced, factor-graph approach. The work highlights the practicality of exchanging neural messages with classical BP in MOT, offering a flexible path to extend to other sensing modalities and tracking tasks.

Abstract

Multiobject tracking (MOT) is an important task in robotics, autonomous driving, and maritime surveillance. Traditional work on MOT is model-based and aims to establish algorithms in the framework of sequential Bayesian estimation. More recent methods are fully data-driven and rely on the training of neural networks. Each approach has demonstrated advantages in certain scenarios. In particular, in problems where plenty of labeled data is available for training neural networks, data-driven MOT tends to have advantages over traditional methods. A natural question is whether a general and efficient framework can integrate the two approaches. This paper advances a recently introduced hybrid model-based and data-driven method called neural-enhanced belief propagation (NEBP). Compared to existing work on NEBP for MOT, it introduces a novel neural architecture that can improve data association and new object initialization, two critical aspects of MOT. At the time of submission of this paper, the proposed tracking method leads the nuScenes LiDAR-only tracking challenge.

Paper Structure (23 sections, 23 equations, 3 figures, 2 tables)

Introduction
System Model and Problem Formulation
Object States and Transition Model
Data Association and Measurement Model
Object Declaration and State Estimation
The Methodology
Factor Graph and BP for MOT
Prediction
Iterative Probabilistic Data Association
Belief Update
The Neural Architecture
Feature Extraction
The Affinity Coefficient
The False Alarm Rejection Coefficient
Neural Enhanced BP
...and 8 more sections

Figures (3)

Figure 1: Considered multiobject tracking scenario. LiDAR measurements, ground truth, and tracking results of the proposed method. The orange dashed rectangles indicate the object estimates and the black rectangles indicate ground truth.
Figure 2: Block diagram of the proposed NEBP+ approach to MOT. The factor graph, BP messages, and message exchange between the factor graph and neural architecture are shown. The factor graph processes the measurements provided by the detector at the current time step and beliefs from the previous time step. The resulting BP messages are passed to the neural architecture. The neural architecture computes the neural messages based on BP messages and a variety of features provided by the pre-trained detector. (More details on the processing performed by the neural architecture are shown in Fig. 3.) Finally, the neural messages are passed back to enhance the data association process and track initialization. The following shorthand notation is used: $f^i = f(\underline{\mathbf{y}}^i_{k}|\mathbf{y}^i_{k-1})$, $q^i = q(\underline{\mathbf{y}}_k^i, a_k^i; \mathbf{z}_k)$, $v^j = v(\overline{\mathbf{y}}_k^j, b_k^j; \mathbf{z}_k)$, $\phi^{i,j} = \phi^{i,j}(a_k^i, b_k^j)$, $\alpha^i = \alpha_{k}(\underline{\mathbf{x}}^i_{k},\underline{r}^i_{k})$, $\varphi^{i,j} = \varphi_k^{i,j,[\ell]}(b_k^j)$, and $\epsilon^{j,i} = \epsilon_k^{j,i,[\ell]}(a_k^i)$.
Figure 3: Neural architecture of the proposed NEBP+ approach. First, motion, box, shape, and heat features are extracted for each measurement and each potential object (PO). Next, an affinity coefficient is computed for each pair of PO and measurement, and a false alarm rejection coefficient is computed for each measurement. As discussed in the section "The Neural Architecture", the affinity coefficients are computed based on a linear combination of feature similarity matrices and BP messages. The false alarm rejection coefficients are calculated based on the box, shape, and heat features extracted for each measurement by the detector.
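
The affinity computation described in the Figure 3 caption can be illustrated with a minimal sketch. It is not the authors' implementation. The use of cosine similarity, the function names, the per-feature weights, and the way the BP messages enter the sum are all assumptions made for illustration. The captions only state that affinities are based on a linear combination of feature similarity matrices and BP messages.

```python
import numpy as np


def cosine_similarity_matrix(po_feats, meas_feats, eps=1e-12):
    """Pairwise cosine similarity between PO and measurement feature vectors.

    po_feats:   array of shape (num_pos, dim)
    meas_feats: array of shape (num_meas, dim)
    Returns an array of shape (num_pos, num_meas).
    Cosine similarity is an assumed choice, not taken from the paper.
    """
    po = po_feats / (np.linalg.norm(po_feats, axis=1, keepdims=True) + eps)
    me = meas_feats / (np.linalg.norm(meas_feats, axis=1, keepdims=True) + eps)
    return po @ me.T


def affinity_coefficients(po_features, meas_features, bp_messages, weights, bp_weight):
    """Linear combination of feature similarity matrices and BP messages.

    po_features / meas_features: dicts mapping a feature name (for example
        "motion", "box", "shape", "heat") to a feature array.
    bp_messages: array of shape (num_pos, num_meas), one BP-derived value per
        PO-measurement pair (hypothetical layout).
    weights: dict mapping each feature name to a scalar weight (hypothetical;
        in a trained system these would be learned).
    bp_weight: scalar weight applied to the BP messages (hypothetical).
    """
    affinity = bp_weight * np.asarray(bp_messages, dtype=float)
    for name, w in weights.items():
        sim = cosine_similarity_matrix(po_features[name], meas_features[name])
        affinity = affinity + w * sim
    return affinity


# Toy usage: 2 POs, 3 measurements, a single 2-D "motion" feature.
po_features = {"motion": np.array([[1.0, 0.0], [0.0, 1.0]])}
meas_features = {"motion": np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])}
bp_messages = np.zeros((2, 3))
affinity = affinity_coefficients(
    po_features, meas_features, bp_messages, weights={"motion": 1.0}, bp_weight=0.0
)
```

In this toy setup, `affinity` has one row per PO and one column per measurement, so larger entries mark PO-measurement pairs whose features agree more closely.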
