3D Multi-Object Tracking with Semi-Supervised GRU-Kalman Filter

Xiaoxiang Wang, Jiaxin Liu, Miaojie Feng, Zhaoxing Zhang, Xin Yang

TL;DR

This work proposes a GRU-based MOT method that introduces a learnable Kalman filter into the motion module. This removes the need for a hand-designed motion model, avoids the model error such a design introduces, and improves the robustness of the model.

Abstract

3D Multi-Object Tracking (MOT), a fundamental component of environmental perception, is essential for intelligent systems such as autonomous driving and robotic sensing. Although Tracking-by-Detection (TBD) frameworks have demonstrated excellent performance in recent years, their application in real-world scenarios faces significant challenges. Object movement in complex environments is often highly nonlinear, while existing methods typically rely on linear approximations of motion. Furthermore, system noise is frequently modeled as a Gaussian distribution, which fails to capture the true complexity of the noise dynamics. These oversimplified modeling assumptions can significantly reduce tracking precision. To address this, we propose a GRU-based MOT method that introduces a learnable Kalman filter into the motion module. This approach learns object motion characteristics from data, thereby avoiding both manual model design and the model error it introduces. At the same time, to avoid abnormal supervision caused by wrong associations between annotations and trajectories, we design a semi-supervised learning strategy that accelerates convergence and improves the robustness of the model. Experiments on the nuScenes and Argoverse2 datasets demonstrate that our system achieves superior performance compared to traditional TBD methods and shows significant potential.
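
For context, the hand-designed baseline that the abstract contrasts against can be sketched as a scalar linear Kalman filter. This is a minimal illustration and not the paper's code: the function name `kalman_step`, the identity motion model (`f=1.0`), and the noise values `q` and `r` are all assumptions chosen for the example. The point is that `f`, `q`, and `r` must be picked by hand, and both noise terms are treated as Gaussian variances.

```python
def kalman_step(x, p, z, q, r, f=1.0, h=1.0):
    """One predict/update cycle of a scalar linear Kalman filter.

    x, p : previous state estimate and its variance
    z    : new measurement
    q, r : process and measurement noise variances (hand-picked, assumed Gaussian)
    f, h : linear motion and observation coefficients (hand-designed)
    """
    # Predict: propagate the state and its variance through the linear model.
    x_pred = f * x
    p_pred = f * p * f + q

    # Update: blend prediction and measurement using the Kalman gain.
    s = h * p_pred * h + r            # innovation variance
    k = p_pred * h / s                # Kalman gain
    x_new = x_pred + k * (z - h * x_pred)
    p_new = (1.0 - k * h) * p_pred
    return x_new, p_new, k


# Illustrative run with made-up measurements near 1.0.
x, p = 0.0, 1.0
for z in [1.0, 1.1, 0.9, 1.05]:
    x, p, k = kalman_step(x, p, z, q=0.01, r=0.1)
```

If the true motion is nonlinear or the noise is not Gaussian, the fixed `f`, `q`, and `r` above no longer describe the system well. That mismatch is the modeling error the learnable filter is meant to reduce.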

Paper Structure

This paper contains 21 sections, 10 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: The pipeline of our proposed method at frame $n$. $T^D_n$ denotes the trajectories updated by associating the upper observations $D_n$ and applying the motion module. Our design focuses on two parts. The first is the GRU-Kalman Filter, which uses three GRUs to simulate the Kalman filtering process. The second is semi-supervised learning, which trains jointly on dataset annotations and pseudo-labels generated by a parallel Kalman filter.
  • Figure 2: EKF Block Diagram.
  • Figure 3: The GRU-Kalman Filter Block Diagram. Here, the GRUs simulate the recursive update of the process noise $Q_n$, the state error covariance $\hat{P}_{n \mid n-1}$, and the observation error covariance $\hat{S}_{n \mid n-1}$, taking the state difference and the observation difference as input, in order to infer the Kalman gain $K_n$.
  • Figure 4: Experiments comparing the training convergence speed and accuracy of supervised and semi-supervised training.
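
The idea in the Figure 3 caption, where three GRUs track stand-ins for $Q_n$, $\hat{P}_{n \mid n-1}$ and $\hat{S}_{n \mid n-1}$ and a gain $K_n$ is inferred from them, can be sketched in plain Python for a scalar state. Everything beyond the caption is an assumption made for illustration: the class names `ScalarGRU` and `GRUKalmanSketch`, the way the three cells are chained, the sigmoid output head, the identity motion model, and the random untrained weights. The authors' actual architecture and training procedure may differ.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class ScalarGRU:
    """GRU cell with a scalar hidden state and a list of scalar inputs."""

    def __init__(self, n_inputs, rng):
        # Per gate (z: update, r: reset, c: candidate): input weights,
        # one recurrent weight, and a bias. Weights are random, not trained.
        self.w = {g: [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for g in "zrc"}
        self.u = {g: rng.uniform(-0.5, 0.5) for g in "zrc"}
        self.b = {g: 0.0 for g in "zrc"}
        self.h = 0.0

    def _lin(self, gate, inputs, h):
        return sum(w * x for w, x in zip(self.w[gate], inputs)) + self.u[gate] * h + self.b[gate]

    def step(self, inputs):
        z = sigmoid(self._lin("z", inputs, self.h))
        r = sigmoid(self._lin("r", inputs, self.h))
        c = math.tanh(self._lin("c", inputs, r * self.h))
        # Convention used here: z weights the new candidate state.
        self.h = (1.0 - z) * self.h + z * c
        return self.h


class GRUKalmanSketch:
    """Scalar filter whose gain comes from three chained GRU cells (hypothetical wiring)."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.gru_q = ScalarGRU(1, rng)  # stand-in for process noise Q
        self.gru_p = ScalarGRU(2, rng)  # stand-in for state error covariance P
        self.gru_s = ScalarGRU(2, rng)  # stand-in for observation error covariance S
        self.x = 0.0                    # current posterior estimate
        self.prev_prior = 0.0           # prior (prediction) from the previous frame

    def step(self, z):
        x_pred = self.x                          # identity motion model (assumption)
        state_diff = self.x - self.prev_prior    # previous posterior minus previous prior
        obs_diff = z - x_pred                    # innovation
        q = self.gru_q.step([state_diff])
        p = self.gru_p.step([q, state_diff])
        s = self.gru_s.step([p, obs_diff])
        k = sigmoid(s)                           # hypothetical head keeping the gain in (0, 1)
        self.prev_prior = x_pred
        self.x = x_pred + k * obs_diff           # same update form as a Kalman filter
        return self.x, k


tracker = GRUKalmanSketch(seed=0)
for z in [1.0, 1.0, 1.0, 1.0, 1.0]:
    estimate, gain = tracker.step(z)
```

In a trained version, the GRU weights would be fitted so that the inferred gain tracks real motion and noise characteristics, with no hand-set $Q$ and $R$. Per the Figure 1 caption, the training signal would combine dataset annotations with pseudo-labels from a parallel Kalman filter; that step is not shown here.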