LiDAR SLAMMOT based on Confidence-guided Data Association

Susu Fang, Hao Li

TL;DR

This work tackles the problem of simultaneous localization, mapping, and moving object tracking in dynamic environments by tightly integrating LiDAR SLAM with confidence-guided data association in a factor-graph backend. It combines LeGO-LOAM odometry, PV-RCNN detections, and a CTRV-based prediction model with a max-mixture GMM data association that leverages prediction and detection confidence to maintain robust associations during occlusions and missed detections. The resulting joint optimization estimates ego-vehicle and object states concurrently and supports asynchronous updates to improve global consistency, achieving real-time performance on standard hardware. Experiments on KITTI Tracking show improved ego-pose accuracy and competitive MOT performance, with notable advantages in challenging scenarios where detections are intermittent or objects are distant.
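For readers unfamiliar with the CTRV (constant turn rate and velocity) model named above, the following is a minimal sketch of its standard planar prediction step. It is not taken from the paper: the function name `ctrv_predict`, the state layout (x, y, yaw, v, yaw_rate), and the `eps` threshold are assumptions chosen for illustration only.

```python
import math


def ctrv_predict(x, y, yaw, v, yaw_rate, dt, eps=1e-6):
    """Propagate a planar CTRV state forward by dt seconds.

    Hypothetical helper for illustration; the state layout is an assumption.
    Speed v and yaw_rate are held constant over the interval.
    """
    if abs(yaw_rate) < eps:
        # Near-zero turn rate: fall back to straight-line motion
        # to avoid dividing by yaw_rate.
        x_new = x + v * math.cos(yaw) * dt
        y_new = y + v * math.sin(yaw) * dt
    else:
        # Motion along a circular arc of radius v / yaw_rate.
        r = v / yaw_rate
        x_new = x + r * (math.sin(yaw + yaw_rate * dt) - math.sin(yaw))
        y_new = y + r * (math.cos(yaw) - math.cos(yaw + yaw_rate * dt))
    yaw_new = yaw + yaw_rate * dt
    return x_new, y_new, yaw_new, v, yaw_rate


# Usage: an object at the origin heading along +x at 10 m/s, no turning, 0.1 s step.
state = ctrv_predict(0.0, 0.0, 0.0, 10.0, 0.0, 0.1)
```

When an object goes undetected for several frames, a tracker can only repeat this prediction step, so any error in the speed or turn-rate estimate accumulates. That is the situation in which weighting predictions by their confidence becomes useful.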

Abstract

In autonomous driving and robotics, simultaneous localization and mapping (SLAM) and multi-object tracking (MOT) are two fundamental problems that are generally solved separately. Solutions to SLAM and MOT usually rely on certain assumptions, such as a static environment for SLAM and an accurate ego-vehicle pose for MOT. In complex dynamic environments, however, these assumptions are difficult or even impossible to meet. SLAMMOT (simultaneous localization, mapping, and moving object tracking), which integrates SLAM and object tracking into one system, has therefore emerged for autonomous vehicles in dynamic environments. Many conventional SLAMMOT solutions perform data association directly on the predictions and detections used for object tracking but ignore their quality. In practice, inaccurate predictions caused by missed detections over several consecutive frames, for example during temporary occlusion, may degrade tracking performance and thereby affect SLAMMOT. To address this challenge, this paper presents a LiDAR SLAMMOT method based on confidence-guided data association (Conf SLAMMOT), which tightly couples LiDAR SLAM and confidence-guided multi-object tracking in a graph optimization backend that estimates the states of the ego-vehicle and the objects simultaneously. The confidences of predictions and detections are used for data association in the factor graph-based multi-object tracking. This avoids the performance degradation caused by incorrect initial assignments in some filter-based methods, handles issues such as consecutive missed detections in tracking, and improves the overall performance of SLAMMOT. Comparative experiments demonstrate the advantages of Conf SLAMMOT, especially in scenes with missed detections.
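As a rough illustration of how prediction and detection confidence could enter a max-mixture style association, the sketch below scores each candidate detection against one track's prediction and keeps only the best-scoring hypothesis. It is not the paper's formulation. The assumptions are: detection confidence acts as the mixture weight, low prediction confidence inflates the Gaussian standard deviation, and a fixed `null_log_score` stands in for the "no match" hypothesis. All names and default values are hypothetical.

```python
import math


def gaussian_log_likelihood(residual, sigma):
    """Log-density of an isotropic 2D Gaussian evaluated at the residual."""
    sq = residual[0] ** 2 + residual[1] ** 2
    return -sq / (2.0 * sigma ** 2) - math.log(2.0 * math.pi * sigma ** 2)


def max_mixture_associate(prediction, pred_conf, detections, det_confs,
                          sigma=0.5, null_log_score=-10.0):
    """Pick the single most likely hypothesis for one track.

    Returns (index, log_score); index is None when the null hypothesis
    (no detection belongs to this track) scores highest.
    """
    # Assumption: a less confident prediction tolerates larger residuals.
    sigma_eff = sigma / max(pred_conf, 1e-3)
    best_idx, best_score = None, null_log_score
    for i, (det, conf) in enumerate(zip(detections, det_confs)):
        residual = (det[0] - prediction[0], det[1] - prediction[1])
        # Assumption: detection confidence is used as the component weight.
        score = math.log(max(conf, 1e-9)) + gaussian_log_likelihood(residual, sigma_eff)
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx, best_score


# Usage: a confident prediction picks the nearby detection over the distant one.
idx, _ = max_mixture_associate((10.0, 5.0), 0.9,
                               [(10.2, 5.1), (14.0, 5.0)], [0.8, 0.95])
```

Under these assumptions, a track whose prediction confidence has decayed after several missed frames can still re-associate with a detection a few meters away, whereas a confident prediction rejects the same detection in favor of the null hypothesis.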

Paper Structure

This paper contains 19 sections, 13 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Results of the proposed Conf SLAMMOT solution on the KITTI Tracking dataset [geiger2012we]. (b) and (c) denote the generated complete point cloud map and details of the result shown in (a), respectively. Red and green filled bounding boxes denote the ego-vehicle and tracked objects, respectively, and red rectangular borders denote detected objects. The green solid line and the dotted lines represent the trajectories of the ego-vehicle and the tracked objects, respectively.
  • Figure 2: Overall architecture of the presented Conf SLAMMOT solution.
  • Figure 3: The factor graph model in the proposed Conf SLAMMOT solution. (a) Joint factor graph optimization backend for coupling and tracking. (b) Illustration of confidence-guided implicit data association. (c) Illustration of asynchronous object state estimation.
  • Figure 4: Ego-trajectory error maps for different methods on the KITTI Tracking dataset. The results for sequences 04, 09, and 14 are shown in the first three rows. The gray dashed line denotes the ground truth.
  • Figure 5: Visualization results of different methods in scenes with consecutive missed detections. (1a-1d), (2a-2d), and (3a-3d) show the results of LIO-SEGMOT at consecutive moments on sequences 00, 02, and 03, respectively; (1a*-1d*), (2a*-2d*), and (3a*-3d*) show the results of the proposed Conf SLAMMOT at consecutive moments on the same sequences. Red and green filled bounding boxes denote the ego-vehicle and tracked objects with different IDs, and red rectangular borders denote detected objects. Because of temporary occlusion or distance, some objects are missed in consecutive frames; these objects are circled and enlarged for display, for example object ID 0 in (1b*-1c*), object ID 11 in (2b*-2c*), and object ID 0 in (3b*-3c*). As shown in (1d*), (2d*), and (3d*), Conf SLAMMOT accurately resumes tracking without ID switching. In sequence 02, the initial object IDs differ between the two methods because the vehicle has already traveled a longer distance, with more moving objects and random ID assignments. In contrast, sequences 00 and 03 represent scenarios where the vehicle has just started moving.